Sanjeev Pandey

New York City, NY

Summary

Experienced Data Engineer with 10+ years of experience designing and implementing scalable data pipelines, ETL processes, and cloud-based solutions. Proven expertise in Azure and AWS services, Python, SQL, and data modeling across diverse industries, including healthcare, telecom, retail, and education. Adept at using technologies such as Snowflake, Azure Databricks, dbt, and Apache Airflow to deliver data-driven solutions that optimize performance and reduce costs.

Overview

13 years of professional experience
1 Certification

Work History

Senior Analyst

CGI
05.2024 - Current
  • Designed and implemented a Python data pipeline using Apache Airflow, Azure Databricks, and Azure Data Factory to extract and process healthcare member eligibility data and load it into Snowflake.
  • Designed and implemented SCD Type 2 (Slowly Changing Dimension) models in dbt to process “life cycle of member health plan” data, storing the data in Snowflake and orchestrating workflows with Apache Airflow.
  • Wrote dbt tests to check data integrity in the Snowflake warehouse.

Senior Data Engineer (Specialist)

McKinsey & Company
07.2022 - 03.2024
  • Developed a robust, scalable data pipeline using Apache Airflow and Azure Databricks for conflict resolution.
  • Automated ETL workflows with AWS Glue, PySpark, Step Functions, and Python to transform and analyze large datasets.
  • Reduced storage costs by 60% through log management and compressed file formats (e.g., Parquet).
  • Migrated infrastructure to a serverless architecture using AWS Lambda (Python) and Aurora Serverless, reducing complexity and cost.
  • Designed over 20 Tableau dashboards for real-time insights, leveraging Tableau Prep for data preparation.
  • Developed a serverless system using AWS Lambda (Python) to store and manage patient health information. Implemented data security with AWS KMS encryption and ensured compliance by logging all data retrieval activity with AWS CloudWatch and CloudTrail. Designed the system for scalability, reliability, and data confidentiality in a healthcare context.

Lead Data Engineer

Accenture
03.2019 - 12.2021
  • Led serverless data reporting projects using AWS Lambda (Python), Step Functions, and PySpark for seamless scalability.
  • Designed data models and forecasting systems for workforce analytics, improving resource allocation accuracy.
  • Delivered end-to-end data ingestion pipelines leveraging AWS Glue, S3, and Redshift.
  • Streamlined reporting processes, reducing turnaround time from hours to minutes by optimizing database models.
  • Wrote DbFit test cases for unit testing of stored procedures.

Senior Software Engineer

GGK Technology
05.2015 - 03.2019
  • Migrated legacy ETL workflows from on-premises systems to AWS Glue using Scala, reducing runtime by 40%.
  • Built serverless APIs and automated data ingestion pipelines for web and mobile applications.
  • Implemented business intelligence dashboards using Tableau, enabling actionable insights for telecom clients to reduce churn.

Software Engineer

Igate (Capgemini)
09.2012 - 04.2015
  • Designed and implemented ETL workflows using SSIS to consolidate multi-source data into a centralized warehouse.
  • Enhanced SQL procedures, improving performance by 70% and saving 20+ hours monthly in manual reporting tasks.

Education

B.Tech - Computer Science

College of Engineering Roorkee
04.2011

Skills

  • Experienced in technical mentoring and knowledge sharing
  • Strong sense of ownership and accountability in project execution
  • Skilled in client communication and stakeholder management

Certification

  • AWS Professional Solutions Architect (and 8 additional AWS certifications)
  • Azure Data Engineer Associate
  • Databricks Certified Engineer
  • Terraform Associate
  • Apache Airflow Certified
  • TensorFlow Developer Associate
  • Credential authenticity: Credly Badges, Databricks, and TensorFlow

Languages

English: Full Professional
Hindi: Native or Bilingual
