Shiva Nanda Reddy Chamala

Summary

Experienced Lead Data Engineer with a proven track record in designing and implementing end-to-end analytical solutions, scalable ETL pipelines, and data warehousing. Adept at optimizing query performance, ensuring data quality, and leveraging a strong data science background to drive impactful business decisions.

Overview

6 years of professional experience
1 Certification

Work History

Lead Data Engineer

Ford Motors
11.2021 - Current
  • Spearheaded the design and implementation of high-quality ETL/ELT pipelines to support diverse business requirements
  • Ingested data into Cloud Storage and processed it with Spark on Dataproc, reducing manual effort by 80% and increasing data accessibility by 30%
  • Collaborated with business stakeholders and product owners to modernize and automate data pipelines using GCP Cloud Functions, reducing complexity and resulting in streamlined data flows, enhanced accuracy, and improved financial reporting capabilities
  • Collaborated with the Data Science team to develop highly efficient data pipelines using PySpark, Python, BigQuery, and pandas, resulting in a 40% reduction in data processing time and ensuring seamless data flow
  • Led the assessment of the architectural roadmap, providing insights and recommendations for the successful implementation of new technologies and strategies
  • Develop and optimize fault-tolerant ETL pipelines, ensuring high data quality and reliability
  • Currently migrating on-prem pipelines to Dataproc, BigQuery, and Cloud SQL on GCP for cloud integration, leveraging Terraform
  • Oversee data governance initiatives, including master data management and the creation of data catalogs
  • Migrated an existing pipeline to the Airflow scheduler, reducing processing time by 90% when loading PySpark-transformed data into BigQuery
  • Communicated fluently with cross-functional teams to gather requirements and design shared datasets
  • Designed and implemented a tiered architecture (Bronze, Silver, Gold) in Databricks on Azure, using Delta Sharing as the source
  • Developed logic to handle record deletions during incremental fetches, removing records whose deletion feed_date exceeds the source feed_date before loading into the target Gold table.

Senior Data Engineer

ModivCare
05.2021 - 11.2023
  • Led the design and development of data warehouse architecture, including ETL processes and dimensional (OLAP) models such as star schemas, resulting in a scalable and high-performance solution for BI and reporting needs
  • Facilitated the gathering of functional requirements by collaborating with cross-functional product owners, stakeholders, and business teams, effectively communicating the results in a timely manner
  • Collaborated with Product Management and User Experience experts regarding product definition, schedule, scope, and project-related decisions
  • Employed SQL and Python extensively to develop robust data processing workflows
  • Built and architected multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP
  • Implemented Apache Airflow on GCP's Cloud Composer service to author, schedule, and monitor data pipelines, architecting several reusable DAGs (Directed Acyclic Graphs) to automate ETL pipelines
  • Designed, built, and automated data pipelines in GCP and on-prem with robust data quality for data reporting and metrics integration
  • Collaborated closely with engineers and stakeholders to drive project requirements and design efficient data architectures
  • Participated in organization-wide initiatives, suggesting and implementing improvements to data processing and storage strategies.

Data Engineer

Florida Blue
11.2018 - 04.2021
  • Contributed to the development of a map-based application for integral analysis purposes
  • Developed efficient data pipelines and minimized data redundancy by performing ETL processing in Jupyter Notebook and loading data into Hive
  • Used Spark SQL to load JSON data, create SchemaRDDs, load them into Hive tables, and handle structured data
  • Implemented PySpark scripts to perform data validation and cleaning to ensure data quality
  • Wrote unit test cases for the PySpark code to validate the implemented business requirements
  • Interacted with stakeholders to collect the requirements and build the BI dashboards
  • Designed well-structured Kimball style Facts and Dimensions for Data Warehouse
  • Designed denormalized data structures for MTM
  • Developed T-SQL fuzzy-matching algorithms that allow the MTM dashboard to identify non-exact matches of the targeted item.

Education

Master of Science -

University of the Cumberlands
12.2019

Skills

  • Python
  • R
  • SQL
  • Scala
  • HTML
  • CSS
  • Shell Scripting
  • Azure Data Factory
  • Airflow
  • dbt
  • Databricks
  • Dataproc
  • Dataflow
  • Hadoop
  • Apache Spark
  • Hive
  • Apache Beam
  • Kafka
  • MySQL
  • MS SQL Server
  • MongoDB
  • PostgreSQL
  • Azure SQL Database
  • Amazon Redshift
  • S3
  • BigQuery
  • Power BI
  • Tableau
  • Excel
  • Plotly
  • Matplotlib
  • Azure
  • GCP
  • AWS

Certification

  • Astronomer Certification For Apache Airflow Fundamentals
  • Databricks Certified Data Engineer Associate, 97970773