Engineering leader with 8+ years of experience building scalable, cost-efficient data platforms and leading org-wide initiatives across cloud, analytics, and infrastructure. Known for modernizing legacy systems, driving automation, and delivering measurable impact—cutting costs, reducing dev time, and enabling data-driven decision-making at scale.
Overview
11 years of professional experience
Work History
Data Engineer Manager
Amazon
Seattle, WA
10.2023 - Current
Led the development of a centralized Metric Marketplace, reducing the average time to create and launch new analytics metrics from 4 weeks to under 7 days—improving speed-to-insight by over 75%.
Reduced monthly infrastructure costs by ~$44K (29%) through targeted optimizations: lowered IO-optimized storage by 41%, scaled down oversized clusters by 33%, migrated workloads from on-demand to provisioned instances, removed obsolete snapshots, and archived logs to cost-efficient Glacier storage.
Driving a company-wide data platform strategy to reduce data lake footprint by 46% by enabling cross-platform querying (e.g., Redshift, AWS Glue), while establishing a unified, legally compliant data governance and access control framework across a large organization.
Implemented lean, agile operating models that cut internal meeting time by 30%, enhanced cross-team collaboration, and improved productivity through streamlined communication practices.
Defined and executed multi-quarter technical roadmaps, aligning teams across product, engineering, and leadership to deliver business-critical initiatives on schedule.
Excelled in navigating ambiguity, managing shifting executive priorities, and making strategic resourcing decisions to ensure delivery of high-impact goals under tight constraints.
Data Engineer II
Amazon
Seattle, WA
04.2021 - 09.2023
Led migration from a monolithic legacy system to a modular, cloud-native architecture on AWS, implementing solutions like a SQL Generation Framework, Metadata Control Store, and event-driven design—resulting in a 77% reduction in legacy jobs and significant operational overhead savings.
Developed reusable AWS CDK infrastructure constructs, enabling developers to rapidly deploy standardized environments and reducing infrastructure setup time by 15%, while driving consistency across codebases.
Built a custom ETL framework that reduced data pipeline development time by over 50%, streamlining onboarding of new data sources and accelerating feature delivery.
Redesigned the Intraday Metrics Architecture for the Operations domain, achieving 99.76% metric availability and expanding real-time insights to additional regions, including Mexico and Brazil.
Engineered an integrated data pipeline to ingest, process, and consolidate critical datasets (Connections, COVID, Clarity), delivering real-time employee risk signals used by HR VPs for strategic workforce decisions.
Data Engineer
Amazon
Seattle, WA
11.2019 - 03.2020
Refactored a legacy Clarity data unload pipeline, improving maintainability, modularity, and scalability by parameterizing metrics, generalizing SQL logic, and modularizing code—reducing failure rates to less than 2.4% and enabling uninterrupted operation for 4+ years across multiple tenants and data flows.
Designed and launched a Clarity Data Distribution Framework to automate cross-team data sharing of 600+ HR metrics, replacing manual SQL maintenance with a config-driven system. Enabled teams like Amazon Air and Perfect Mile to export historical and real-time data with minimal overhead—processing jobs in parallel and optimizing SQL queries to reduce production load.
Led the decoupling of Clarity from a shared monolithic codebase, creating a standalone Python package and building a new CI/CD pipeline. Consolidated 80+ scripts into a single dynamic SQL executor, automated complex flow creation, and reduced manual dev effort by 90%, while improving system resilience and reducing dependency risks.
Big Data Engineer
ThoughtSpot
Sunnyvale, CA
02.2019 - 11.2019
Deployed distributed database clusters and ingested large volumes of data from multiple ETL pipelines to verify database performance under cluster operations.
Derived the distribution profile of check-in velocity metrics for each branch and flagged instances of long lead times in the release cycle.
Developed and implemented a longevity data consistency pipeline for an in-memory database that involved sharding tables, executing constant DML operations, and creating query traffic on the tables.
Developed an independent analytical platform for analyzing the code review process across millions of code check-ins on the master branch, removing the need to query Gerrit directly. The platform uses pagination to optimize data extraction from Gerrit servers and transforms key-value storage into an efficient OLAP star schema; data is distributed across a multi-node cluster to maximize parallel processing.
Improved team efficiency by automating repair of broken pinboard objects, data consistency checks, and cluster restore/upgrade processes, significantly reducing manual effort.
Data Analyst II
Larsen & Toubro Infotech (LTI)
Mumbai, Maharashtra
10.2016 - 06.2017
Analyzed requirements, designed and executed test plans, and managed defects within Agile/Scrum projects, applying Continuous Integration and Test-Driven Development practices.
Developed a predictive model in Python to identify faulty drilling and manufacturing equipment.
Developed PL/SQL packages to handle client-generated incidents.
Designed and implemented complex SQL queries to create quick reports, providing valuable insights into business metrics.
Developed a Java-based interface with SQL, PL/SQL, and HTTP-Tester to facilitate the ETL process for the client's database.
Data Analyst
Capgemini
Mumbai, Maharashtra
06.2014 - 10.2016
Analyzed vendor selection metrics using Python and SQL to derive insights.
Created a Python script to identify vendors that frequently changed their prices after vendor selection.
Incorporated the impact of vendor price changes into a comprehensive and intuitive visualization dashboard for monitoring P2P cycle delays.
Created efficient PL/SQL background jobs that extracted and removed duplicate invoices for fraud analysis.
Devised efficient automation scripts that effectively handled log management tasks, leading to a 75% decrease in the need for manual upkeep.
Streamlined code review and testing process, resulting in a 10% increase in review efficiency.
Decreased Rework Effort Index by 30% through root cause analysis of critical software defects.
Implemented back-end database triggers that execute on vendor transactions.
Education
Master of Science - Management Information Systems
University of Illinois at Chicago
Chicago, IL
12.2019
Bachelor of Science - Electronics And Communications Engineering
SRM University
Chennai
02.2014
Skills
Data Modeling
Machine Learning
Data Warehousing
Data Migration
Data Security
Performance Tuning
Scripting Languages
API Development
SQL and Databases
Software Development
Critical Thinking
Project Management
Data Mining
Accomplishments
Game Changer Award.
Innovation Maven Maverick.