AMIT SHARMA

Austin, TX

Summary

Results-driven Senior Data Engineer with extensive experience designing and scaling cloud-based data platforms, machine learning pipelines, and feature stores. Proven record of transforming complex data ecosystems into performant, reliable, and cost-efficient systems serving millions of users. Skilled in architecting data models, optimizing pipelines, and building ML-ready data infrastructure leveraging AWS, Spark, and modern orchestration tools. Passionate about bridging data engineering and ML, driving automation, and delivering business impact through clean, reliable, and scalable data systems.

Overview

17
years of professional experience

Work History

Senior Data/Software Engineer

Amazon
07.2024 - Current
  • Redesigned the ingestion module for a RED-certified data platform serving 500+ scientists, reducing ingestion time by 80% and cost by 70% (AWS CDK, Glue, Step Functions, DynamoDB).
  • Developed a CLI-based ingestion tool with built-in security and data governance, enabling self-service ingestion and saving 1,000+ engineering hours annually (Python, Click, AWS IAM, Lambda).
  • Partnered with scientists, analysts, and PMs to define the data modeling strategy for a platform powering 1,200+ datasets, improving query performance by 40% and reducing performance incidents by 99% (Redshift, Spark, Glue, Athena).
  • Built a feature platform ingesting hundreds of machine learning features into a centralized store, powering attrition forecasting for 600K+ employees (AWS CDK, Glue, PySpark, SageMaker, Feature Store).

Staff Data Engineer, Data Platform

RetailMeNot
08.2019 - 06.2024
  • Redesigned the machine learning pipeline, delivering 60% annual cost savings and 90% faster onboarding of curated offers (Spark, Scala, ETL, Snowflake, APIs, PostgreSQL, Python, Kubernetes).
  • Built a near real-time ranking reprocessing solution, resolving compliance issues, improving data accuracy, and driving a +1% revenue lift (Kinesis, Lambda, PostgreSQL).
  • Led a personalization initiative processing large-scale data to generate merchant-to-merchant and user-to-merchant affinities, achieving a 30% higher email click-through rate (Spark, Python, Scala).
  • Designed and implemented a click-events data model, enabling accurate attribution by linking impressions, clicks, and transactions, resulting in a 50% reduction in data quality issues (PostgreSQL, Redshift, Glue).
  • Implemented best practices and a data strategy to ensure data quality, storage optimization, and efficient retrieval of events data, processing hundreds of millions of records at 40% lower processing cost.
  • Led migration of 200+ ETL pipelines from Luigi to Airflow and from GitLab to GitHub Actions, improving CI/CD and orchestration processes.

Big Data Engineer

TiVo
08.2016 - 08.2019
  • Built a configurable ETL pipeline ingesting thousands of household attributes by modifying configuration entries, eliminating manual processes and reducing effort by 70% (Python, Spark, Scala, ETL).
  • Migrated the catalog-matching application to a distributed architecture, improving maintainability, reducing costs by 40%, and enhancing SLAs by 50% (Spark, Hadoop, Scala).
  • Implemented an ad-exposure pipeline leveraging TiVo’s viewership data to compute ad exposure counts across the US, giving advertisers precise insights (Python, Spark, Scala, Presto, ETL).

Lead Software Engineer, Data Engineering

Cognizant
11.2012 - 08.2016
  • Developed an ETL process to generate probable maximum loss estimates for natural disasters in the USA, resulting in a 95% time reduction compared to the legacy manual process.
  • Developed an application to analyze data based on catastrophe models (Python, SQL Server), reducing data analysis effort by 90%.

System Engineer

Tata Consultancy Services
09.2008 - 11.2012
  • Built a system to perform message transfer using IBM MQ, resulting in a 20% reduction in missed SLAs.
  • Designed and developed a software application for a customer query resolution system using Python and Oracle.
  • Designed the data model for a data warehouse application.

Education

Bachelor of Engineering (B.E.) - Electronics & Communications

Rajiv Gandhi Technical University
India
01.2008

Skills

  • Languages & Tools: Python, SQL, Scala, Spark, Airflow, Glue, Kafka, Snowflake, Hive, Iceberg
  • Cloud & DevOps: AWS (S3, EMR, Athena, Lambda, Kinesis, CDK, Bedrock), GCP, Terraform, Docker, Kubernetes, GitHub Actions
  • Data & ML: Data Modeling, Batch & Streaming Pipelines, Feature Store, Data Quality & Governance, Distributed Systems, SageMaker, Feature Engineering, Data-driven Personalization
  • Databases: PostgreSQL, MySQL, DynamoDB, SQL Server, Redshift

Work status

H-1B