Engineering leader with 8+ years of experience building scalable, cost-efficient data platforms and leading org-wide initiatives across cloud, analytics, and infrastructure. Known for modernizing legacy systems, driving automation, and delivering measurable impact—cutting costs, reducing dev time, and enabling data-driven decision-making at scale.
Overview
11 years of professional experience
Work History
Data Engineer Manager
Amazon
Seattle, WA
10.2023 - Current
Led the development of a centralized Metric Marketplace, reducing the average time to create and launch new analytics metrics from 4 weeks to under 7 days—improving speed-to-insight by over 75%.
Reduced monthly infrastructure costs by ~$44K (29%) through targeted optimizations: lowered IO-optimized storage by 41%, scaled down oversized clusters by 33%, migrated workloads from on-demand to provisioned instances, removed obsolete snapshots, and archived logs to cost-efficient Glacier storage.
Driving a company-wide data platform strategy to reduce data lake footprint by 46% by enabling cross-platform querying (e.g., Redshift, AWS Glue), while establishing a unified, legally compliant data governance and access control framework across a large organization.
Implemented lean, agile operating models that cut internal meeting time by 30%, enhanced cross-team collaboration, and improved productivity through streamlined communication practices.
Defined and executed multi-quarter technical roadmaps, aligning teams across product, engineering, and leadership to deliver business-critical initiatives on schedule.
Excelled in navigating ambiguity, managing shifting executive priorities, and making strategic resourcing decisions to ensure delivery of high-impact goals under tight constraints.
Data Engineer II
Amazon
Seattle, WA
04.2021 - 09.2023
Led migration from a monolithic legacy system to a modular, cloud-native architecture on AWS, implementing solutions like a SQL Generation Framework, Metadata Control Store, and event-driven design—resulting in a 77% reduction in legacy jobs and significant operational overhead savings.
Developed reusable AWS CDK infrastructure constructs, enabling developers to rapidly deploy standardized environments and reducing infrastructure setup time by 15%, while driving consistency across codebases.
Built a custom ETL framework that reduced data pipeline development time by over 50%, streamlining onboarding of new data sources and accelerating feature delivery.
Redesigned the Intraday Metrics Architecture for the Operations domain, achieving 99.76% metric availability and expanding real-time insights to additional regions, including Mexico and Brazil.
Engineered an integrated data pipeline to ingest, process, and consolidate critical datasets (Connections, COVID, Clarity), delivering real-time employee risk signals used by HR VPs for strategic workforce decisions.
Data Engineer
Amazon
Seattle, WA
11.2019 - 03.2020
Refactored a legacy Clarity data unload pipeline, improving maintainability, modularity, and scalability by parameterizing metrics, generalizing SQL logic, and modularizing code—reducing failure rates to less than 2.4% and enabling uninterrupted operation for 4+ years across multiple tenants and data flows.
Designed and launched a Clarity Data Distribution Framework to automate cross-team data sharing of 600+ HR metrics, replacing manual SQL maintenance with a config-driven system. Enabled teams like Amazon Air and Perfect Mile to export historical and real-time data with minimal overhead—processing jobs in parallel and optimizing SQL queries to reduce production load.
Led the decoupling of Clarity from a shared monolithic codebase, creating a standalone Python package and building a new CI/CD pipeline. Consolidated 80+ scripts into a single dynamic SQL executor, automated complex flow creation, and reduced manual dev effort by 90%, while improving system resilience and reducing dependency risks.
Big Data Engineer
ThoughtSpot
Sunnyvale, CA
02.2019 - 11.2019
Deployed distributed database clusters and ingested large volumes of data from multiple ETL pipelines to verify database performance under cluster operations.
Derived the distribution profile of check-in velocity metrics for each branch and flagged instances of long lead times in the release cycle.
Developed and implemented a longevity data consistency pipeline for an in-memory database that involved sharding tables, executing constant DML operations, and creating query traffic on the tables.
Developed an independent analytical platform for analyzing the code review process across millions of code check-ins on the master branch, removing the need to query Gerrit directly. The platform uses pagination to optimize data extraction from Gerrit servers and transforms key-value storage into an efficient OLAP star schema; data is distributed across a multi-node cluster to maximize parallel processing.
Improved team efficiency by automating repair of broken pinboard objects, data consistency checks, and cluster restore/upgrade processes, significantly reducing manual effort.
Data Analyst II
Larsen & Toubro Infotech (LTI)
Mumbai, Maharashtra
10.2016 - 06.2017
Analyzed requirements, designed and executed test plans, and managed defects within Agile/Scrum projects, applying Continuous Integration and Test-Driven Development practices.
Developed a predictive model in Python to identify faulty drilling and manufacturing equipment.
Developed PL/SQL packages to handle client-generated incidents.
Designed and implemented complex SQL queries to create quick reports, providing valuable insights into business metrics.
Developed a Java-based interface with SQL, PL/SQL, and HTTP-Tester to facilitate the ETL process for the client's database.
Data Analyst
Capgemini
Mumbai, Maharashtra
06.2014 - 10.2016
Analyzed vendor selection metrics using Python and SQL to derive insights.
Created a Python script to identify vendors that frequently changed their prices after vendor selection.
Incorporated the impact of vendor price changes into a comprehensive and intuitive visualization dashboard for monitoring P2P cycle delays.
Created efficient PL/SQL background jobs that extracted and removed duplicate invoices for fraud analysis.
Devised efficient automation scripts that effectively handled log management tasks, leading to a 75% decrease in the need for manual upkeep.
Streamlined code review and testing process, resulting in a 10% increase in review efficiency.
Decreased Rework Effort Index by 30% through root cause analysis of critical software defects.
Implemented back-end database triggers that execute on vendor transactions.
Education
Master of Science - Management Information Systems
University of Illinois at Chicago
Chicago, IL
12.2019
Bachelor of Science - Electronics And Communications Engineering
SRM University
Chennai
02.2014
Skills
Data Modeling
Machine Learning
Data Warehousing
Data Migration
Data Security
Performance Tuning
Scripting Languages
API Development
SQL and Databases
Software Development
Critical Thinking
Project Management
Data Mining
Accomplishments
Game Changer Award.
Innovation Maven Maverick.