Prudvi Gadi

Plano, Texas

Summary

Accomplished Senior Data Engineer with a proven track record at JP Morgan Chase & Co. Expert in programming, data engineering, data warehousing, and ETL, leveraging PySpark and Spark with Java, Python, the Big Data/Hadoop ecosystem, AWS Cloud Services, and the Ab Initio ETL tool, with exceptional problem-solving skills and a knack for optimizing data processes for financial services.

Overview

12 years of professional experience

Work History

Senior Data Engineer

M3 Global Inc
Plano, Texas
10.2023 - Current

Client: Capital One – Cognizant Technology Solutions

Project: Capital One Retail bank

  • Develop and optimize ETL processes to transform and load large volumes of financial data from various sources into centralized data storage solutions.
  • Ensure data quality, integrity, and accuracy through validation, cleansing, and monitoring processes.
  • Work with financial analysts and data scientists to understand data needs and provide data access solutions for business intelligence and analytics.
  • Automate data workflows, integrate them with existing data systems, and perform data transformations using AWS tools such as Glue, Lambda, Step Functions, and CloudWatch (a sketch of such a job follows this list).
  • Collaborate with cross-functional teams to implement data solutions that adhere to industry best practices and regulatory requirements in the financial services domain.
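
A minimal sketch of the kind of Glue job described above, in PySpark. The bucket names, paths, and column names are illustrative assumptions, not actual project values.

import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Ingest raw CSV drops from a landing bucket (placeholder path).
raw = spark.read.option("header", "true").csv("s3://example-landing/transactions/")

# Basic cleansing and validation: drop records missing the key, cast amounts.
clean = (
    raw.filter(F.col("txn_id").isNotNull())
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)

# Write curated Parquet, partitioned by business date, for downstream analytics.
clean.write.mode("overwrite").partitionBy("business_date").parquet(
    "s3://example-curated/transactions/"
)

job.commit()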

Senior Data Engineer

M3 Global Inc
Plano, Texas
05.2023 - 07.2024

Client: J.P. Morgan Chase & Co. – Virtusa

Project: Hadoop Exit/AWS Cloud Migration

  • Modernized an on-prem ETL application to the AWS Cloud by rewriting the Ab Initio ETL graphs that ran on the Hortonworks Hadoop cluster as Spark applications.
  • Converted complex Ab Initio graphs running on on-premises Hadoop into Spark applications suitable for AWS EMR.
  • Created AWS Glue catalog tables for the S3 files in Parquet format, with the technical metadata.
  • Contributed to building common libraries and functionality, such as CDC (Change Data Capture) Type-2 (sketched after this list), TDQ (Technical Data Quality Check) validations, and Spark common libraries and transformations.
  • Implemented CI/CD pipelines end-to-end using Jules and Jenkins by following the Git/Bitbucket branching strategy.
  • Wrote JUnit unit tests for the Java/Spark code covering positive and negative scenarios, including functionality testing, and mocked test data for all scenarios using libraries such as Mockito.
  • Rewrote code to be compatible with AWS EMR and built reusable components such as S3 read/write, Glue catalog table read/write, partition registration, and a common script to launch Spark jobs.
  • Automated the AWS EMR jobs using the Control-M automation tool and efficiently orchestrated job dependencies.
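
The CDC Type-2 library referenced above is proprietary; the following is a minimal PySpark sketch of the general slowly-changing-dimension Type-2 pattern it implements. Table paths, the key, and the tracked columns are placeholder assumptions.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cdc-type2-sketch").getOrCreate()

# Existing dimension snapshot and today's staged extract (placeholder paths).
current = spark.read.parquet("s3://example-bucket/dim_account/")
incoming = spark.read.parquet("s3://example-bucket/staged_account/")

key = "account_id"
tracked = ["status", "branch"]  # attribute changes that open a new version

active = current.filter(F.col("is_current"))

# Rows whose tracked attributes changed since the last active version.
changed = (
    incoming.alias("n")
    .join(active.alias("o"), key)
    .filter(" OR ".join(f"n.{c} <> o.{c}" for c in tracked))
    .select("n.*")
)

# Rows with keys never seen before.
brand_new = incoming.join(active, key, "left_anti")

# Close out the superseded versions.
expired = (
    active.join(changed.select(key), key, "left_semi")
    .withColumn("is_current", F.lit(False))
    .withColumn("eff_end_dt", F.current_date())
)

# Open new current versions; the final dimension is the untouched history
# plus `expired` plus `opened`, written back to the dimension location.
opened = (
    changed.unionByName(brand_new)
    .withColumn("is_current", F.lit(True))
    .withColumn("eff_start_dt", F.current_date())
    .withColumn("eff_end_dt", F.lit(None).cast("date"))
)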

Associate Software Engineer

JP Morgan Chase & Co.
Hyderabad, India
11.2018 - 04.2023

Project: Home Mortgage Disclosure Act (HMDA) – Mortgage Banking.

The Home Mortgage Disclosure Act (or HMDA, pronounced HUM-duh) is a United States federal law that requires certain financial institutions to provide mortgage data to the public.

  • Strong experience in designing data pipelines covering data ingestion, data processing (transformations, enrichment, and aggregations), data provisioning, and reporting.
  • Good knowledge and experience in developing interactive reports using Tableau.
  • Experience in converting Ab Initio graphs into Java Spark/PySpark applications.
  • Experience in cleansing and transforming the data from source systems using most of the Ab Initio components, such as Join, Dedup Sorted, Normalize, Reformat, Filter-by-Expression, and Rollup.
  • Designed, developed, tested, and implemented Ab Initio ETL graphs from end to end.
  • Strong knowledge of data warehousing concepts and dimensional modeling, such as star schema and snowflake schema.
  • Strong experience in developing jobs using Apache Spark and reading/writing the data to AWS S3.
  • Extensive knowledge in programming with Resilient Distributed Datasets (RDDs).
  • Experience in tuning and improving the performance of Spark jobs by exploring various options.
  • Strong experience in working with core Hadoop components, like HDFS, YARN, Hive, and MapReduce.
  • Strong knowledge and experience in developing Spark applications using Java.
  • Good understanding and knowledge of NoSQL databases, like Cassandra.
  • Big Data/Hadoop experience with the ability to process large sets of structured, semi-structured, and unstructured data and to support systems application architecture.
  • Able to assess business rules, collaborate with stakeholders, and perform source-to-target data mapping, design, and review.
  • Able to apply appropriate transformation rules using Apache Spark and store the desired data in on-premises HDFS storage.
  • Proficient in developing Apache Hive databases and tables, and loading high-volume data into them using Hive and the Spark Hive API (a minimal load sketch follows this list).
  • Experienced in writing Java and SQL-based transformations using Apache Spark, and in working with different file formats such as Parquet, ORC, and Avro.
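
As a companion to the Hive bullets above, here is a minimal PySpark sketch of creating a partitioned Hive table and loading it through the Spark Hive API. The database, table, columns, and path are illustrative, not actual HMDA objects.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-load-sketch")
    .enableHiveSupport()   # required for Hive catalog access
    .getOrCreate()
)

spark.sql("CREATE DATABASE IF NOT EXISTS hmda_db")

# Partitioned Hive table stored as Parquet (schema is illustrative).
spark.sql("""
    CREATE TABLE IF NOT EXISTS hmda_db.loan_applications (
        application_id STRING,
        loan_amount    DECIMAL(18,2),
        action_taken   STRING
    )
    PARTITIONED BY (reporting_year INT)
    STORED AS PARQUET
""")

# Allow dynamic partitioning so each reporting_year lands in its own partition.
spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")

# insertInto maps columns by position, with partition columns last.
df = spark.read.parquet("hdfs:///data/hmda/enriched/")
(df.select("application_id", "loan_amount", "action_taken", "reporting_year")
   .write.mode("overwrite")
   .insertInto("hmda_db.loan_applications"))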

Software Engineer

JP Morgan Chase & Co.
Hyderabad, India
09.2016 - 11.2018

Project: Mortgage Loan Origination.

  • Ingested mortgage loan information into the Hadoop ecosystem from various loan origination systems in differing formats, frequencies, and layouts, for use in downstream reporting.
  • Experience in developing generic Spark code for data cleansing, data validation, and data transformation.
  • Implemented technical data quality checks on the inbound files at the staging area (a sketch follows this list).
  • Proficient in developing Apache Hive databases and tables, and loading data into them using Hive and the Spark Hive API.
  • Capable of applying partitioning and clustering on Hive tables based on data-reading requirements and frequencies to improve performance.
  • Strong knowledge in developing Spark applications using Java.
  • Experience in testing the code with JUnit and automated builds using the Jenkins pipeline.
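
A minimal PySpark sketch of the kind of technical data quality check mentioned above: a mandatory-field check plus record-count reconciliation on an inbound staging file. The path, column names, and expected count are placeholder assumptions.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("tdq-sketch").getOrCreate()

# Inbound file at the staging area (placeholder path).
inbound = spark.read.option("header", "true").csv("hdfs:///staging/loans/2016-09-01/")

# Check 1: mandatory key fields must be populated.
null_keys = inbound.filter(F.col("loan_id").isNull()).count()

# Check 2: record count must reconcile with the count declared alongside the file.
# Hard-coded here for the sketch; in practice it would be parsed from a control file.
expected = 125000
actual = inbound.count()

if null_keys > 0 or actual != expected:
    raise ValueError(
        f"TDQ failure: {null_keys} null keys; expected {expected} rows, got {actual}"
    )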

System Analyst

Value Labs Pvt. Ltd
Hyderabad, India
11.2015 - 08.2016
Project: Ascender (Australia)

Technologies/Tools: Oracle PL/SQL, SQL, Ab Initio, and Unix.

Software Engineer

Kewill India Pvt. Ltd (currently known as E2 Open)
Hyderabad, India
07.2013 - 11.2015

Project: DHL Netherlands and HAVI Logistics (Singapore)

Technologies/Tools: Oracle PL/SQL, SQL, Oracle Forms & Reports, and Unix.

Education

Bachelor of Engineering - Computer Science and Engineering

Anna University
05.2011

Skills

  • PySpark/Spark with Java
  • Ab Initio ETL
  • Python
  • AWS (S3, EC2, Athena, Glue, EMR, Step Functions, Lambda, RDS, etc.)
  • Big Data (Hadoop, HDFS, Sqoop, Hive)
  • Control-M job automation tool
  • Apache Airflow
  • Teradata
  • Oracle Exadata
  • Oracle SQL, PL/SQL
  • Snowflake
  • Unix
