Sripal Reddy

Summary

Dynamic Sr. Data Engineer with extensive experience at Tech Mahindra, specializing in building robust data pipelines and optimizing ETL processes using Python and Airflow. Proven ability to enhance data quality and streamline workflows, showcasing strong analytical skills and effective collaboration with cross-functional teams to drive impactful data solutions.

Overview

10 years of professional experience

Work History

Sr. Data Engineer

Tech Mahindra Americas / BrightSpeed
Charlotte
01.2023 - Current
  • Designed and developed end-to-end data pipelines on Google Cloud Platform using Kafka for real-time streaming and Python for ingestion.
  • Created custom Kafka producers and consumers to integrate external data sources with internal systems, enabling real-time analytics.
  • Developed Python scripts for ETL processes from TXT and CSV files into Google BigQuery.
  • Implemented partitioning and clustering strategies in BigQuery to enhance query performance.
  • Leveraged Airflow to orchestrate complex ETL workflows, including data extraction from MySQL into BigQuery.
  • Developed Cloud Functions to trigger Airflow workflows upon file delivery to GCS storage.
  • Participated in designing Airflow DAGs for efficient scheduling and orchestration of data workflows.
  • Collaborated with cross-functional teams to gather requirements, analyze data needs, and design effective models.

Environment: Python, BigQuery, Cloud Composer/Airflow, MySQL, Cloud Storage, GitHub.

Sr. Data Engineer

HCL/Google/Walmart
Bentonville
08.2022 - 12.2022
  • Automated ETL scripts using Python to enhance data processing efficiency.
  • Designed and migrated data solutions from Hadoop on-premises to Google Cloud Platform.
  • Conducted Exploratory Data Analysis for model fitting, hypothesis testing, and data transformation.
  • Transformed various data formats, including XML and JSON, to parquet format using PySpark.
  • Developed shell scripts for pipeline automation and error handling in data processes.
  • Scheduled jobs efficiently in Composer/Airflow to streamline workflows.
  • Utilized GCP, SQL, PySpark, and Hive to optimize data operations.
  • Collaborated on GitHub for version control and project management.

Environment: GCP, SQL, PySpark, Spark SQL, Astronomer/Airflow, Hive, GitHub.

Sr. Data Engineer

Clover Health
Nashville
10.2021 - 08.2022
  • Transformed, cleansed, and backfilled data, and created models in BigQuery to support business reporting on EHRs.
  • Developed shell scripts to automate pipelines and handle errors in data processes.
  • Scheduled jobs in Composer/Airflow and Tidal.
  • Implemented a one-time migration of multi-state data from SQL Server to Snowflake using Python and SnowSQL.
  • Utilized Informatica MDM to manage and maintain master data consistency across the enterprise.
  • Worked with Informatica Data Quality (IDQ) for profiling, cleansing, and standardizing source data to ensure high data quality.

Environment: GCP, SQL, PySpark, Logstash, Teradata, GitHub.

Sr. Data Engineer

DaVita HealthCare Partners, Inc.
Nashville
02.2017 - 09.2021
  • Designed and implemented solutions for migrating the existing clinical platform to the cloud (GCP) using zero-downtime deployment strategies.
  • Integrated big data into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and semi-structured data.
  • Built ETL using GCP Pub/Sub to stream real-time data updates to mobile and web applications.
  • Used pandas and NumPy in Python for data cleaning.
  • Utilized Spark SQL to extract and process data, parsing it with Datasets or RDDs through transformations and actions.
  • Worked on source code migration using Git.

Environment: GCP, SQL, Spark/PySpark, Spark SQL, Logstash, Kibana, Jenkins, GitHub.

Data Engineer

MetLife
Cary
04.2016 - 01.2017
  • Reviewed functional and non-functional requirements (NFRs).
  • Developed ETL jobs using DSE Sqoop to migrate data from Oracle to Cassandra tables.
  • Designed Kafka for a multi-data-center cluster and created monitoring alerts.
  • Created views on top of Oracle databases for large data sets.
  • Created views restricted to specific columns of a table to protect customer data privacy.
  • Developed ETL jobs to load data from various sources, such as mainframes and flat files, into a data warehouse.

Environment: Hive, ZooKeeper, Talend, GitHub.

Data Engineer

General Motors
Detroit
06.2015 - 03.2016
  • Developed ETL mappings and workflows to extract data from Oracle, SQL Server, and flat files into enterprise data warehouse.
  • Designed complex transformations using expressions, lookups, aggregators, and joiners for efficient data processing.
  • Scheduled and monitored ETL workflows with workflow manager and automated job alerts for proactive management.
  • Implemented performance tuning for long-running ETL jobs and optimized mapping logic to enhance efficiency.
  • Ensured data integrity through pre-load and post-load validations along with reconciliation processes.
  • Integrated IDQ checks within ETL workflows to maintain clean data entry into data warehouse.
  • Utilized Informatica Analyst and Developer tools for effective rule creation and comprehensive data profiling.

Environment: IDQ Developer & Analyst, PowerCenter Designer, Workflow Manager, MS SQL Server.

Education

M.S. - Electrical Engineering

Texas A&M University
Kingsville, TX
05-2015

Bachelor of Technology - Electrical and Electronics Engineering

VCEG-JNTU-Hyderabad
India
05-2013

Skills

  • Data visualization tools
  • Version control systems
  • Continuous integration tools
  • Database management systems
  • Scripting languages
  • Cloud services
  • Operating systems

Tools And Technologies

Tableau, Custom Shell Scripts, Splunk, Grafana, Maven, Git, SVN, Jenkins, SQL, JavaScript, Shell Scripting, Python, HiveQL, Oracle, MySQL, MS SQL Server, Teradata, PostgreSQL, Informatica PowerCenter, Infoworks, Linux, Unix, Windows 8, Windows 7, Windows Server 2008/2003, S3, Redshift, EMR, Lambda, Setup, Configuration, Data Streaming, Integration, Informatica MDM, Informatica IDQ
