Results-driven Azure Data Engineer with expertise in Azure Data Factory, Databricks, and cloud-based ETL development. Skilled in building scalable pipelines, optimizing Spark transformations, and streamlining data flows to improve reporting speed and accuracy. Adept at ensuring data integrity, governance, and delivering actionable insights for business decision-making.
1. Designed and developed end-to-end ETL pipelines in Azure Data Factory (ADF) to extract on-premises data, transform it in Databricks (PySpark/Spark SQL), and load curated datasets into Azure Analysis Services for Power BI reporting.
2. Implemented a Medallion Architecture (Bronze → Silver → Gold) in Delta Lake, standardizing ingestion, cleansing, and business-ready layers.
3. Developed control tables and load strategies (full and incremental/delta) to handle high-volume records while maintaining accuracy and consistency.
4. Developed and optimized 50+ Databricks notebooks using PySpark and Spark SQL to process large volumes of data with high accuracy.
5. Configured parallel notebook execution in Databricks, reducing pipeline runtime to under one hour and ensuring timely data availability for reporting.
6. Created metadata, data lineage, and documentation for all pipelines, ensuring compliance and audit readiness.
7. Integrated Azure Key Vault with Databricks and ADF for secure credential and connection management.
8. Introduced Azure DevOps CI/CD pipelines for version control and automated deployments of notebooks and ADF pipelines across dev, test, and prod.
9. Acted as the sole Data Engineer, responsible for end-to-end design, development, deployment, and support of all cloud data solutions.
10. Designed and deployed semantic models in Power BI and Azure Analysis Services, enabling dashboards, reporting, and clinical decision support.
1. Built scalable and fault-tolerant ETL/ELT pipelines using Azure Data Factory and Databricks to process structured and semi-structured datasets.
2. Created PySpark and Spark SQL scripts to transform and prepare datasets for downstream consumption.
3. Implemented data quality checks and validation scripts to improve accuracy before applying transformations.
4. Worked extensively with Parquet, CSV, and JSON file formats, handling schema evolution and ingestion scenarios.
5. Developed generic Databricks notebooks to standardize repetitive tasks, reducing code redundancy and improving reusability.
6. Designed and configured Azure Data Factory pipelines to ingest and combine raw data from multiple on-premises and cloud sources.
7. Integrated ADF with Databricks notebooks, triggering PySpark and Spark SQL transformations as part of end-to-end workflows.
1. Designed and executed complex SQL queries to extract, clean, and transform large datasets from multiple relational sources, ensuring accuracy and consistency in reporting.
2. Developed SQL-based views and tables to support reporting and analytical needs, reducing redundancy and improving performance for recurring queries.
3. Assisted in building Power BI dashboards backed by SQL queries, visualizing KPIs and verifying that the backend queries returned consistent results.
4. Built ad-hoc queries for stakeholders to answer operational and financial questions, enabling faster decision-making.
5. Performed data validation and profiling in SQL to detect anomalies, improve data integrity, and ensure accuracy in business-critical reports.
1. Earned Microsoft Certified: Azure Data Engineer Associate (DP-203), validating expertise in designing and implementing scalable data pipelines on Azure.
2. Achieved Databricks Certified Data Engineer Associate, demonstrating ability to optimize Spark-based transformations and manage Delta Lake for analytics.
3. Earned Lean White Belt Certification from the University of Oklahoma Lean Institute, demonstrating knowledge of Lean principles and process improvement strategies to optimize workflows and reduce inefficiencies.
4. Designed and deployed production-ready ETL pipelines in Azure Data Factory and Databricks, reducing reporting cycle time by over 40% and ensuring timely insights for business teams.
Microsoft Certified: Azure Data Engineer Associate (DP-203)
link: https://learn.microsoft.com/api/credentials/share/en-us/31632010/D09000EF84CD8ADC?sharingId=A8B6C6B3C015ED83
Databricks Certified Data Engineer Associate
link: https://credentials.databricks.com/e6c8855a-1839-4690-97ce-2e166c22b1a7#acc.1PjtwEDb
Microsoft Certified: Azure Data Fundamentals (DP-900)
link: https://learn.microsoft.com/api/credentials/share/en-us/31632010/9AFFE4E49E4145BC?sharingId=A8B6C6B3C015ED83
Microsoft Certified: Azure Fundamentals (AZ-900)
link: https://learn.microsoft.com/api/credentials/share/en-us/31632010/65DBE0A98E65B61?sharingId=A8B6C6B3C015ED83