
Data Engineer with 10+ years of experience designing and delivering large-scale data applications using Big Data technologies and cloud-based data engineering platforms. Hands-on experience in unified data engineering with Azure Data Factory, AWS, GCP, Snowflake, and Databricks. Designed and developed Databricks notebooks in Python, Apache Spark, and SQL to build scalable data pipelines. Designed and implemented end-to-end data processing pipelines using Azure Data Factory, Databricks, Azure Functions, triggers, and Azure Key Vault to address complex business use cases and enable real-time analytics and monitoring. Good understanding of Spark architecture on Databricks, including Structured Streaming. Integrated Azure Data Factory, AWS, and GCP with Databricks workspaces for business analytics; managed Databricks clusters and the machine learning lifecycle. Implemented data encryption and access control measures using Azure Key Vault, Azure AD, and Microsoft Defender for Cloud, securing sensitive data and credentials. Built a robust and scalable data pipeline using AWS services including S3, Glue, Redshift, Lambda, CloudWatch, Kinesis, Step Functions, and Secrets Manager to orchestrate ETL workflows, manage secure data access, and support business intelligence reporting. Designed and delivered fact and dimension tables to support BI reporting and executive-level visualization in Power BI and Tableau, enabling data-driven decision-making for key management. Skilled in query optimization, data workflows in Alteryx, and data visualization in Power BI and QuickSight. Built pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data to and from sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and a write-back tool.
Developed scalable ETL pipelines on GCP using Dataflow, BigQuery, and Cloud Storage to ingest, process, and transform data from sources such as Cloud SQL, on-premises databases, Pub/Sub, and SaaS platforms; automated workflows with Cloud Functions and Cloud Composer. Built robust data pipelines in Snowflake using Streams, Tasks, and Snowpipe to extract, load, and transform data from diverse sources such as AWS S3, Azure Blob Storage, on-premises SQL systems, and third-party APIs, enabling near real-time ingestion and transformation for analytics and reporting. Built end-to-end data pipelines using AWS Glue, Lambda, and Step Functions to extract, transform, and load data from Amazon S3, RDS, Redshift, and external APIs, ensuring data quality and orchestration for both batch and near real-time processing. Migrated on-premises data from SQL and Oracle to Azure Data Lake Storage (ADLS Gen2) using Azure Data Factory, ensuring scalable and efficient cloud data integration. Good knowledge of batch and historical data processing using Hive, Pig, and Databricks, enabling retrospective trend analysis, advanced feature engineering, and generation of ML-ready datasets for predictive modeling. Proficient in implementing CI/CD frameworks for data pipelines using tools such as Jenkins, ensuring efficient automation and deployment. Configured data quality checks using Great Expectations, providing real-time monitoring and automatic alerting when validation rules were violated. Proficient in cloud migration strategies with a focus on Azure, AWS, Snowflake, and GCP, including Azure Migrate and lift-and-shift methodologies. Managed end-to-end deployment workflows for data processing jobs using Bamboo, ensuring smooth transitions from development to production environments.
Experienced in delivering business intelligence solutions, including executive dashboards and self-service BI, using Tableau, Power BI, Looker, and Looker Studio to visualize KPIs, operational metrics, and compliance outcomes. Proficient in sorting, analyzing, and integrating data using GCP (BigQuery, Cloud Dataflow), AWS (Glue, Redshift), and Snowflake (SQL, Streams & Tasks, Snowpipe). Proficient in configuring Azure DevOps pipelines and GitHub for continuous integration, automated testing, and streamlined code delivery. Experienced in dimensional modeling, data migration, cleansing, and ETL processes for data warehousing. Skilled in statistical modeling, machine learning, and decision trees, with a strong foundation in ETL processes and data transformation in AWS, GCP, Snowflake, and Azure. Experienced in developing, maintaining, and supporting data pipelines using Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Synapse Analytics, IBM DataStage jobs, custom frameworks, and related technologies. Worked across the analysis, design, development, implementation, and production support phases of data warehouse and BI applications. Worked extensively with data warehouse concepts such as data modeling, ETL jobs, data marts, and ETL frameworks. Extensive experience with Power BI, including data visualization, report creation, and dashboard development. Hands-on experience in designing and implementing ETL/ELT pipelines to ingest semi-structured and nested JSON data from MongoDB and MongoDB Atlas into cloud data warehouses such as Snowflake. Proficient in writing Spark Core and Spark SQL jobs in Scala to accelerate data processing. Good experience with SQL (SSIS, SSAS, SSRS), T-SQL, PL/SQL, Unix, Microsoft Power BI, and Office 365.