DEEPIKA CHAMARTHI

Strongsville

Summary

Dedicated data engineer with extensive experience building scalable data pipelines and optimizing data ingestion processes, eager to contribute expertise in big data technologies. Proven success in reducing data processing times and improving query performance, backed by a commitment to driving efficiency and innovation.

Overview

years of professional experience

Work History

Data Engineer (Client: PNC Bank)

Indotronix International Corporation (IIC)

Strongsville, OH

02.2022 - 03.2025

Analyzed, designed, and developed data ingestion into HDFS from legacy Oracle, Teradata, and DB2 systems using Python, PySpark, Sqoop, Hive, and Oozie
Worked with multiple file formats (CSV, ZIP, mainframe, and pipe-delimited) and built ingestion pipelines to load these files into Hive tables using the CA7 scheduler
Built a distributed, reliable, and scalable data pipeline framework in Python to ingest and process data from multiple sources (files, databases, HTTPS), and defined and modeled schemas to create Hive tables
Developed and delivered test plan documentation containing scenarios, test cases, and expected results
Supported integrated and independent releases, software and hardware upgrades, and server upgrades

Data Engineer

Mitrayu Solutions Pvt Ltd

Hyderabad, India

08.2016 - 07.2019

Worked on building a centralized data lake by loading data into Apache Hive and Impala from heterogeneous databases (DB2, Oracle, and Teradata) using Apache Sqoop
Developed PySpark scripts to extract and process large-scale, sensitive user/client data from Hive tables for downstream risk analytics and modeling
Built a distributed, reliable, and scalable data pipeline framework to ingest and process data from multiple sources (files, databases), and defined and modeled schemas to create Hive tables
Tuned Hive queries to improve performance
Wrote a workflow that runs daily to perform data loads and create the final transformed tables
Took ownership of escalations and performed troubleshooting, analysis, research, and resolution using Sqoop, Hive, Oozie, and Hadoop skills

Education

B.Tech - Chemical Engineering

MVGR College of Engineering

Vizianagaram, India

01.2010

Skills

Hadoop
ETL
PySpark
Python

Shell Scripting
SQL
Databricks

Websites

www.linkedin.com/in/deepika-chamarthi

Accomplishments

Pipeline Optimization Success - Optimized a data pipeline, increasing speed by 30% and reducing errors by 15%.
Data Ingestion Efficiency - Reduced data ingestion time from 3 hours to 45 minutes.
Spark Job Development - Developed 20+ Spark/Hive jobs for data transformation and processing.

Languages

English - Proficient
