Data Engineer with 2+ years of experience in designing, building, and maintaining scalable data pipelines and infrastructure, ensuring efficient data processing and analysis. Skilled in optimizing workflows and supporting data-driven decision-making through well-architected solutions.
Overview
4 years of professional experience
1 Certification
Work History
Data Engineer
Trellis IT INC
07.2024 - 07.2025
Designed and implemented an incremental data loading process to efficiently transfer data from AWS S3 to Amazon Redshift using AWS Glue.
Implemented least-privilege access by performing a thorough permissions analysis and defining IAM policies that grant access only to approved S3 buckets, Redshift clusters, and Glue jobs.
Implemented complex data transformations using PySpark in AWS Glue, including filtering, aggregation, and joins, significantly improving data quality and consistency for downstream analytics.
Designed and developed a scalable ETL pipeline using AWS Glue, PySpark, and S3 to extract, transform, and load large datasets, improving data processing efficiency by 30%.
Senior Analyst
Capgemini
03.2022 - 08.2022
Tuned performance by partitioning tables, and applied strong analytical and troubleshooting skills to resolve issues quickly in large-scale, globally distributed production environments.
Developed and maintained data processing and transformation scripts using Python and SQL, ensuring data accuracy and consistency.
Conducted feature engineering to create meaningful features from raw healthcare data, enhancing model performance by 20%.
Participated in project meetings to discuss progress, issues, and potential solutions, ensuring alignment with project timelines and objectives.
Employed JIRA for managing project workflows and tracking issues effectively.
Data Engineer
Manuh Solutions India Pvt Ltd
04.2021 - 03.2022
Developed and maintained SQL queries and scripts to extract, transform, and load data from various sources.
Developed data pipelines to ingest large volumes of healthcare data from databases into an S3 data lake.
Developed AWS Glue workflows to transform, enrich, and load healthcare data from S3 buckets into a Redshift data warehouse.
Leveraged SNS and SQS to build a notification and alerting system for healthcare fraud predictions, enabling real-time action.
Developed custom Athena SQL queries to explore claims data and uncover trends around utilization, networking, denials, and high-cost procedures, enabling the analysis to pivot in new directions.
Applied advanced SQL techniques such as CTEs, window functions, lateral joins, and subqueries to develop a highly scalable analytics platform handling millions of rows.
Contributed to the design and development of SQL Server objects such as tables, views, indexes (clustered and non-clustered), stored procedures, and functions in Transact-SQL.