
Abhimanyu Abhinav

Seattle, USA

Summary

Accomplished Data Engineer with extensive experience at Amazon, specializing in AWS cloud services and big data processing. Successfully led data migrations and optimized ETL frameworks, achieving a 25% increase in query performance. Adept at analytical problem solving and developing secure data infrastructures, ensuring robust data management across diverse platforms.

Overview

9 years of professional experience

Work History

Data Engineer

Amazon
Seattle, USA
07.2021 - Current
  • Project NAWS: Led the data migration effort from a server-hosted MySQL instance to the AWS cloud, along with the dependent BI applications
  • Optimized the migrated tables, reducing data skew and improving partitioning to speed up BI applications
  • Created an end-to-end ETL framework using AWS services such as S3, Redshift, and Aurora MySQL to move large TB-scale datasets across different data storage solutions
  • Migrated the schemas of 8+ teams from an on-premises server-hosted MySQL database to AWS cloud MySQL, and set up the infrastructure for secure user access
  • Masked IP addresses, configured access for worldwide usage, and set up a self-maintaining active user group
  • Developed a database loader tool called Bigfoot to handle big data loads using AWS Lambda and Glue, integrating it with the scalable S3 service to make ETLs more seamless and efficient by reducing data hops and enabling auto-triggered loading (a sketch of the trigger pattern follows this list)
  • Configured monitoring controls and alarms for ETLs
  • Project Candidata: Revamped the data model to handle large volumes of candidate data and application activity
  • Scheduled Python ETLs with Airflow, serving various ML models and BI applications (a minimal DAG sketch follows this list)
  • Collaborated with Data Science/ML teams to develop the data infrastructure servicing the ML processes
  • Redesigned and improved the existing data model, increasing captured candidate web activity by 25% and improving query performance 2x-6x while reducing scanned data size, thus improving the margins on our SLAs
  • Made dimension tables agnostic to source data, improving resilience to upstream failures
  • Used surrogate keys to mask business logic from end users, improving data security and the confidentiality of PII
  • Built Python ETL jobs to handle petabyte-scale batch data processing
  • Leveraged distributed processing in Spark, tuning parameters to handle S3 throttling limits on API consumption (see the tuning sketch after this list)
  • Developed data quality checks for a data consumption framework, using hashing for integrity and row counts for completeness (illustrated after this list)
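
The auto-triggered loading pattern above can be sketched as follows; this is an illustrative outline, not the actual Bigfoot code, and the Glue job name, argument names, and event wiring are hypothetical placeholders:

    import json
    import urllib.parse

    import boto3

    glue = boto3.client("glue")
    GLUE_JOB_NAME = "bigfoot-loader"  # hypothetical job name

    def handler(event, context):
        # Fired by an S3 ObjectCreated notification: start a Glue job
        # run for each newly landed object, passing its location along.
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            run = glue.start_job_run(
                JobName=GLUE_JOB_NAME,
                Arguments={"--source_path": f"s3://{bucket}/{key}"},
            )
            print(json.dumps({"object": key, "job_run_id": run["JobRunId"]}))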
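
A minimal Airflow sketch of the scheduling pattern described above; the DAG id, schedule, and task bodies are placeholders, not the production pipeline:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        pass  # placeholder: pull candidate activity from the source

    def transform_load():
        pass  # placeholder: apply business logic, load to the warehouse

    with DAG(
        dag_id="candidate_activity_etl",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        load_task = PythonOperator(task_id="transform_load", python_callable=transform_load)
        extract_task >> load_task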
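
The S3 throttling work amounts to bounding how hard Spark hits the S3 API and retrying throttled requests. A hedged PySpark sketch, with illustrative values rather than the settings actually used in production:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("s3-batch-etl")
        # Cap concurrent S3A connections so tasks stay under request-rate limits
        .config("spark.hadoop.fs.s3a.connection.maximum", "100")
        # Retry throttled (503 SlowDown) requests with backoff instead of failing
        .config("spark.hadoop.fs.s3a.retry.limit", "7")
        .config("spark.hadoop.fs.s3a.retry.interval", "500ms")
        .getOrCreate()
    )

    df = spark.read.parquet("s3a://bucket/input/")  # placeholder path
    # Partition count controls how many tasks read/write S3 concurrently
    df.repartition(200).write.mode("overwrite").parquet("s3a://bucket/output/")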
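
A minimal sketch of those data quality checks, assuming rows are passed in as lists of tuples; the delimiter and hash choice are illustrative:

    import hashlib

    def table_fingerprint(rows):
        # Order-insensitive hash over all rows: the integrity check
        digest = hashlib.md5()
        for line in sorted("|".join(map(str, r)) for r in rows):
            digest.update(line.encode("utf-8"))
        return digest.hexdigest(), len(rows)

    def validate(source_rows, target_rows):
        src_hash, src_count = table_fingerprint(source_rows)
        tgt_hash, tgt_count = table_fingerprint(target_rows)
        # Completeness: same number of rows landed as were sent
        assert src_count == tgt_count, f"row count mismatch: {src_count} != {tgt_count}"
        # Integrity: the content itself is equivalent
        assert src_hash == tgt_hash, "content hash mismatch"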

Data Engineer

Tata Consultancy Services
Mumbai, India
04.2016 - 05.2019
  • Scraped data from diverse platforms, applied business logic, and loaded the data into a Snowflake cloud database
  • Wrote Python programs to analyze and clean complex data from varied sources, extracting large datasets from PADB (ParAccel DB) and loading them into Snowflake via S3
  • Created fact and dimension tables, incorporating SCD logic to generate surrogate keys for the warehouse (a simplified sketch follows this list)
  • Migrated close to 400 tables from the existing platform to Snowflake, improving query performance by 90%-97% for medium-to-large data warehouses and reducing query failures to zero for 50 concurrent users across 300 queries
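
A simplified, in-memory sketch of the SCD load with surrogate keys mentioned above (Type 2 semantics assumed; the real pipeline ran against Snowflake, and the column names here are placeholders):

    from datetime import date

    def scd2_upsert(dim_rows, incoming, next_key):
        # dim_rows: list of dicts with surrogate_key, natural_key, attrs,
        # start_date, end_date, is_current. incoming: natural_key -> attrs.
        current = {r["natural_key"]: r for r in dim_rows if r["is_current"]}
        for natural_key, attrs in incoming.items():
            row = current.get(natural_key)
            if row and row["attrs"] == attrs:
                continue  # unchanged record: nothing to do
            if row:
                row["is_current"] = False  # expire the old version
                row["end_date"] = date.today()
            dim_rows.append({
                "surrogate_key": next_key,  # hides the business key from end users
                "natural_key": natural_key,
                "attrs": attrs,
                "start_date": date.today(),
                "end_date": None,
                "is_current": True,
            })
            next_key += 1
        return dim_rows, next_key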

Education

Master’s Degree - Computer Science

Clemson University
05.2021

Bachelor’s Degree - Engineering (Information Technology)

Oriental College of Technology
05.2015

Skills

  • AWS cloud services
  • Spark
  • Python, SQL, Scala
  • Data processing frameworks
  • Workflow orchestration
  • Data modeling
  • Data warehousing
  • Relational databases
  • NoSQL databases
  • Cloud data platforms
  • Distributed processing, AWS Glue, EMR
  • Scripting and automation
  • Analytical problem solving
  • Continuous integration and delivery
  • Version control systems
  • Infrastructure as code

Academic Projects

Developed a novel algorithm that finds exponential and logistic patterns in big data to improve execution time and efficiency, and benchmarked it on distinct computation platforms (GCP, AWS, and the Apache Spark/MapReduce framework). Implemented sampling with MapReduce to reduce execution time. Predicted exponential patterns for larger datasets from the coefficients calculated by the fitted exponential function, as sketched below.
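
A compact sketch of the core idea, assuming data roughly of the form y ≈ a·e^(b·x) with positive y values: sample the points (standing in for the MapReduce sampling stage), fit the coefficients with a log-linear least-squares fit, then extrapolate the pattern to larger inputs. The function names and sample size are illustrative:

    import random

    import numpy as np

    def fit_exponential(points, sample_size=10_000, seed=42):
        # Sample to cut execution time, as the MapReduce stage did at scale
        random.seed(seed)
        sample = random.sample(points, min(sample_size, len(points)))
        xs = np.array([p[0] for p in sample], dtype=float)
        ys = np.array([p[1] for p in sample], dtype=float)
        # Linearize y = a * exp(b * x) as ln(y) = ln(a) + b * x, then OLS
        b, ln_a = np.polyfit(xs, np.log(ys), 1)
        return np.exp(ln_a), b

    def predict(a, b, x):
        # Extrapolate the fitted exponential pattern to unseen x
        return a * np.exp(b * x)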

Timeline

Data Engineer

Amazon
07.2021 - Current

Data Engineer

Tata Consultancy Services
04.2016 - 05.2019

Master’s Degree - Computer Science

Clemson University

Bachelor’s Degree - Engineering (Information Technology)

Oriental College of Technology