Led Global Supply and Inventory Planning teams through 3 enterprise-level initiatives, including SEC (Strategic Enterprise Capabilities) for the Greater China and North America regions and CDA (Consumer Direct Acceleration)
Spearheaded the design and implementation of a medallion architecture using the unified Lakehouse approach
Developed advanced data solutions using Databricks features such as Delta Lake and collaborative notebooks
Communicated with product owners, stakeholders, and cross-functional teams to understand data needs
Sr Data Engineer
Caterpillar
08.2020 - 06.2022
Worked in the STU (Sales to Users) team; decoupled complex SQL logic and built a pipeline for incremental data loads instead of full refreshes, reducing runtime by 50%
Designed and implemented effective solutions using Snowflake and SnowSQL to store and retrieve data
Orchestrated pipelines using Airflow, invoking SQL scripts via the Snowflake operator
Prepared documentation and analytic reports, delivering summarized results, analysis, and conclusions to stakeholders.
Data Engineer
Nike
10.2018 - 07.2020
Data Engineer on the Supply Planning team; developed and maintained 50+ ETL (Extract, Transform, and Load) pipelines for supply planning data using Python, Spark, Hive, Athena, EMR, SQS, and SNS
Ingested and processed 200 TB of data per month from sources such as Teradata and SQL Server into AWS S3 and Snowflake using the Spark API, Airflow, GitHub, and Jenkins CI/CD pipelines
Designed and implemented 10+ domain objects used by 700+ users across different organizations
Tuned Spark applications for better performance, improving execution time by 25-50%
Data Engineer
Walmart
01.2018 - 10.2018
Created pipelines using Azure Data Factory to copy CSV files into Azure SQL Database and to move data from Teradata and SQL Server into Azure Data Lake Storage
Ran code using the Spark DataFrame and Spark SQL APIs in Azure Databricks notebooks, jobs, and interactive clusters
Partnered with business analysts/data analysts to understand stakeholder/user requirements and design solutions.
Hadoop Developer
Staples
06.2017 - 01.2018
Developed a Spark application in Scala that parses data in HDFS and ingests only filtered data into HBase and Solr
Created tables in HBase with column families and column qualifiers and inserted data using the REST API
Systems Engineer
Infosys
08.2014 - 12.2015
Successfully completed training provided by the Infosys Education and Research Department
Education
Master of Science, Computer Science
University of New Haven
New Haven, CT
05.2017
Bachelor of Science, Computer Science
Osmania University
INDIA
06.2014
Skills
Cloud Data Technologies: AWS (S3, EMR, Athena), Azure (Data Factory, ADLS), Airflow, Snowflake, SQL, Databricks, Delta Lake, Lakehouse