
Senior Data Engineer with 10+ years of experience designing scalable data platforms and distributed systems in cloud-native environments. Expertise in PySpark, SQL, and real-time data processing, with a proven track record of migrating complex on-premises systems to Azure, Snowflake, and Databricks, delivering up to 75% cost reduction and 60% performance improvement. Experienced in building enterprise data lakehouse architectures, Kafka streaming pipelines, and governed data platforms. Leads end-to-end data engineering initiatives, translating business requirements into high-impact, data-driven solutions across healthcare and financial domains.
Data Lakehouse (Delta Lake), Medallion Architecture, Data Modeling, Distributed Systems, Data Mesh
Apache Spark (PySpark), Azure Databricks, Snowflake (Snowpipe, Tasks), Batch & Stream Processing
Python (Advanced), SQL (Expert – Performance Tuning), Java/Scala (Familiar)
Apache Kafka, Apache Airflow, Kubernetes (K8s), Docker, Terraform (IaC), CI/CD, Git
Data Governance, Data Lineage, Unity Catalog, Automated Data Quality & Reconciliation Frameworks, Power BI (DAX)