ARAVIND REDDY

Charlotte, NC

Summary

Experienced Data Engineer: Transforming Data with 5+ Years of Expertise

Dedicated and results-driven data engineer with over 5 years of hands-on experience in data analysis and ETL processes. Proficient in a wide range of technologies and tools, including Python, SQL, cloud services, and the Spark API. Adept at designing, developing, and optimizing data pipelines for both real-time and batch processing, with a strong background in Hadoop, Spark, and cloud platforms such as Azure and AWS.

Overview

5 years of professional experience

Work History

Data Engineer

Fifth Third Bank
11.2022 - Current
  • Led end-to-end data pipeline development in AWS, coordinating team tasks for data ingestion and transformation
  • Worked with Informatica and Impala-based ETL, and automated data migration using AWS Lambda and Step Functions
  • Developed real-time data processing apps in Scala and Python, integrating Kafka and Spark Streaming
  • Designed ETL pipelines for data warehousing, automated workflows, and ensured data accuracy.

Data Engineer

Merck Pharma
10.2021 - 10.2022
  • Developed pipelines to extract data from various sources using Sqoop and Linux shell scripts, loading it into an HDFS Data Lake
  • Utilized Spark for efficient data processing
  • Employed Python and PySpark for large dataset analysis, enhancing data insights
  • Utilized Pandas for statistical analysis and designed data models for Redshift
  • Designed and implemented ETL pipelines for data ingestion from multiple sources using Spark and Hive
  • Utilized Informatica for data integration across systems
  • Managed AWS resources, implemented batch processing using Airflow for Snowflake, and participated in application migration to AWS.

Data Engineer

Tiger Analytics
06.2018 - 08.2021
  • Developed pipelines to extract data from various sources using Sqoop and Linux shell scripts, loading it into an HDFS Data Lake. Utilized Spark for efficient data processing.
  • Employed Python and PySpark for large dataset analysis, enhancing data insights. Utilized Pandas for statistical analysis and designed data models for Redshift.
  • Designed and implemented ETL pipelines for data ingestion using Spark and Hive from multiple sources. Utilized Informatica for data integration across systems.
  • Managed AWS resources, implemented batch processing using Airflow for Snowflake, and participated in application migration to AWS

Education

Master of Science - Computer Science

Pace University
New York, NY
2023

Skills

  • Big Data: Hadoop, HDFS, Pig, Hive, HBase, Oozie, Kafka, YARN, Apache Spark
  • Databases: Oracle, MySQL, SQL Server, MongoDB
  • Programming: Scala, Python, SQL, PL/SQL, HiveQL, Unix, Shell Scripting
  • Cloud Platforms: Azure, AWS
  • Automation/Orchestration: Jenkins, Apache Airflow
  • Data Services/Tools: Azure Data Factory, Databricks, Azure Synapse, Azure Data Lake, Snowflake, Tableau
  • Additional Skills: PySpark, Scala, Data Warehousing, Data Preparation, ETL, Agile, MS-SQL

Timeline

Data Engineer

Fifth Third Bank
11.2022 - Current

Data Engineer

Merck Pharma
10.2021 - 10.2022

Data Engineer

Tiger Analytics
06.2018 - 08.2021

Master of Science - Computer Science

Pace University