SHAIVY MITTAL

San Francisco, CA

Summary

Senior Data Engineer with 5+ years of experience building scalable AWS data platforms, PySpark/Glue ETL pipelines, and analytics solutions. Skilled in integrating AI/LLM systems to automate insights, reduce operational overhead, and drive measurable business impact.

Overview

8 years of professional experience

Work History

Senior Data Engineer (Acting Scope)

Amazon Web Services (AWS)
10.2024 - Current

Project IRIS - AI-Powered Analytics & Executive Summaries (Top Impact)

  • Designed and launched an AI summarization pipeline for 200+ WBR/DBR reports consumed by L8+ leaders across ~15 AWS services.
  • Built a Python Lambda summarizer invoking AWS Bedrock LLMs and integrated it into existing report-generation flows.
  • Created a prompt library, tuned model context windows, and executed end-to-end accuracy testing to finalize production prompts.
  • Impact: Reduced PM analysis + summary creation effort by ~90%, saving 5-6 hours per PM per week across services.


SageMaker Unified Studio - Cross-Service Data Platform

  • Onboarded data for a 2024-launched service; ingested operational, revenue, and clickstream signals into a multi-layer S3 data lake.
  • Built PySpark Glue ETL pipelines (CDK-deployed) supporting ingestion, schema evolution, validation, partitioning, and attribution of $200k in influenced revenue across 8 AWS services.
  • Developed QuickSight dashboards, JSON reports, and AI-assisted analysis scripts, reducing PM analysis workload by 4-5 hours weekly.
  • Leveraged Cedric and Quicksuite to generate automated customer funnel, usage, and revenue stories, cutting PM effort by ~50%.

BI Engineer II / I

Amazon Web Services (AWS)
08.2021 - 09.2024
  • Refactored telemetry pipelines; redesigned 60+ datasets, 8 DBRs, and 4 WBRs using S3 compaction, partition pruning, and Parquet optimization.
  • Achieved 83% query runtime reduction through CDK-driven infra improvements, Redshift materialized views, and SQL rewrites.
  • Built cross-service revenue attribution ETLs using PySpark Glue, reducing analyst manual work by 40-60%.
  • Created a reusable Glue + Athena ETL framework for standardized ingestion and metric computation, cutting investigation time by 50%.
  • Designed EMR usage + revenue ETLs and a QuickSight dashboard enabling leadership insights and reducing PM analysis by 4-5 hours weekly.

Data Engineering & Analytics

Qualcomm Inc.
01.2018 - 01.2021
  • Built ETL pipelines using Informatica, Oracle SQL, and Python; automated data validation frameworks.
  • Migrated OBIEE reporting to SAP Concur and improved ETL scheduling performance.

Education

M.S. - Information Technology & Management

University of Texas at Dallas

B.E. - Computer Science

R.G.P.V University
India

Skills

  • Cloud/Data Platforms: AWS (S3, Glue, Redshift, Athena, EMR, Lambda, Step Functions, Lake Formation, IAM)
  • Data Engineering: PySpark, Glue ETL, CDK, Data Lake Architecture, Fact/Dim Modeling, Parquet/Hudi, ETL CI/CD
  • Programming & Automation: Python, SQL, Spark, API ingestion, event-driven pipelines
  • Analytics & BI: QuickSight, Tableau, Presto/Athena, Cedric, Quicksuite
  • AI/ML: LLM automation, Prompt Engineering, AWS Bedrock/SageMaker
  • Certifications: Tableau Desktop Specialist, SAS-UTD BI & Data Mining

Leadership & Community

Board Member, Asians at Amazon Bay Area — Lead cultural & community programs
