Vipul Srivastava

Redmond, United States

Summary

Accomplished Data Engineer with a proven track record at Microsoft, specializing in scalable data infrastructure and advanced SQL optimization. Expert in ML feature engineering and cross-functional collaboration. Enhanced pipeline performance, reducing job latency by 65% while ensuring privacy compliance. Passionate about delivering high-quality, model-ready datasets for impactful analytics.

Overview

14 years of professional experience

Work History

Data Engineer 2

Microsoft
Redmond, United States
02.2022 - Current
  • Built scalable, low-latency data infrastructure for Microsoft’s ad monetization platform, enabling the ingestion and delivery of billions of telemetry events daily, which powered analytics, billing, and ML workflows across multiple teams.
  • Developed Cosmos-based pipelines with filtration and enrichment logic for clickstream data, integrating with staging areas, campaign marts, and billing systems.
  • Privacy & Compliance: Implemented privacy-preserving mechanisms, including salted hashing and identifier swapping, to ensure compliance with executive orders and internal data governance policies, enhancing user data protection and regulatory alignment across data pipelines.
  • Collaborated with cross-functional teams, including data science, privacy, and compliance, to deliver experimentation-ready datasets with built-in governance, lineage tracking, and SLA adherence.
  • Optimized SCOPE scripts and C# user-defined operators (UDOs) to enhance the performance and reliability of ad monetization pipelines processing over 10B telemetry events daily. Refactored SCOPE logic to reduce job latency by 65% (4.2 hrs to 1.5 hrs) and cut Cosmos compute token usage in half, contributing to projected storage savings.
  • Built a Python-based ScopePipelineValidationTool to automate validation of ad monetization workflows, featuring CLI modules for Cosmos job orchestration, data diffing with Pandas/DeepDiff, and checkpoint recovery, cutting validation time by 70%.
  • Drove operational excellence by leading incident response for critical telemetry pipelines, resolving high-impact issues, and implementing automation and alert tuning, resulting in a 35% reduction in MTTR and a 40% decrease in alert noise.

Senior Consultant

Deloitte Consulting
11.2013 - 02.2022
  • Led ETL modernization by evaluating data analytics capabilities across AWS, Azure, Snowflake, and OCI, producing a cloud selection playbook that improved performance and cut batch completion time by 30%.
  • Partnered with business leadership to drive requirements, documented business rules, and designed the ETL architecture for a $5M data integration project in the supply chain domain of a leading life sciences and health care company.

Software Engineer

HSBC Technology
07.2012 - 11.2013
  • Led design and implementation of ETL modules for a data warehouse migration project to streamline profitability and cost management at HSBC France, resulting in a seamless user experience for customers and cost savings from decommissioning legacy systems.
  • Analyzed functional data to define the business logic and rules for the project design.
  • Developed test cases for the analysis in HP Quality Centre (QC).
  • Used DataStage as the ETL tool to develop jobs that load large volumes of data from multiple sources into Teradata.

Systems Engineer

Infosys Limited
12.2010 - 07.2012
  • Worked on development and testing tasks using DataStage as an ETL tool to create jobs for loading large volumes of data from various sources into Teradata, enhancing the repository for Transportation & Logistics.
  • Fixed design issues related to Logical Table Sources, reducing the consistency check time of the existing transportation repository from 1 hour to under 5 minutes.

Education

Bachelor of Technology - Electrical and Electronics Engineering

Uttar Pradesh Technical University (UPTU)
Lucknow, UP, India

Skills

  • ML feature engineering pipelines, model-ready datasets
  • Dimensional modeling, OLAP schema design
  • ETL/ELT orchestration
  • Advanced SQL optimization, analytical SQL
  • Big data infrastructure
  • Distributed data processing (e.g., Cosmos, Spark)
  • Production-grade scripting (Python, C#)
  • Performance optimization
  • Cross-functional collaboration
  • Privacy compliance
  • AWS tech stack: EC2, EMR, Redshift, Athena, Glue, S3, Kinesis, DynamoDB, Lambda

Timeline

Data Engineer 2

Microsoft
02.2022 - Current

Senior Consultant

Deloitte Consulting
11.2013 - 02.2022

Software Engineer

HSBC Technology
07.2012 - 11.2013

Systems Engineer

Infosys Limited
12.2010 - 07.2012

Bachelor of Technology - Electrical and Electronics Engineering

Uttar Pradesh Technical University (UPTU)