Krishna Sai Budaraju

Harrison, NJ

Summary

Data Engineer specializing in AWS, with over six years of experience designing and optimizing data pipelines, cloud architectures, and real-time analytics solutions. Proficient in leveraging AWS services such as S3, EMR, Glue, Redshift, Athena, and Lambda to deliver high-performance ETL/ELT processes at scale. Skilled in Python, PySpark, and SQL for data transformation, modeling, and automation, with expertise in ensuring compliance and governance standards (HIPAA, GDPR). Adept at building scalable data lake and lakehouse environments, driving cost optimization, and enabling advanced analytics and machine learning applications.

Overview

7 years of professional experience

Work History

Data Engineer

Freddie Mac
New York, New York
12.2023 - Current
  • Developed data pipelines for efficient data processing and integration.
  • Collaborated with teams to design and implement data models.
  • Analyzed large datasets to identify trends and support decision-making.
  • Utilized SQL for querying databases and extracting relevant information.
  • Assisted in optimizing ETL processes for improved performance.
  • Documented technical specifications for data engineering projects.
  • Analyzed user requirements, then designed and developed ETL processes to load enterprise data into the data warehouse.
  • Developed and implemented data models, database designs, and data access and table maintenance code.
  • Created stored procedures for automating periodic tasks in SQL Server.
  • Developed Python scripts for extracting data from web service APIs and loading it into databases.
  • Participated in code reviews to maintain best practices in data engineering.
  • Optimized existing queries to improve query performance by creating indexes on tables.
  • Managed version control and deployment of data applications using Git, Docker, and Jenkins.
  • Conducted data analysis using SQL and Python to derive insights and support decision-making processes.
  • Implemented data visualization tools like Tableau and Power BI to create dashboards and reports for business stakeholders.
  • Implemented and optimized big data storage solutions, including Hadoop and NoSQL databases, to improve data accessibility and efficiency.
  • Streamlined data flow from diverse sources using ETL tools such as Talend, Informatica, and Airflow.
  • Developed and deployed machine learning models for predictive analytics, utilizing Spark and TensorFlow.
  • Optimized SQL queries and database schemas for performance improvements in data retrieval operations.
  • Collaborated with data scientists and analysts to understand data needs and implement appropriate data models and structures.
  • Identified, protected, and leveraged existing data.
  • Planned and installed database management system software upgrades to enhance systemic performance.
  • Collected, outlined, and refined requirements, led design processes, and oversaw project progress.

Data Engineer

State Farm
Newark, New Jersey
06.2023 - 12.2023
  • Developed data pipelines to process large datasets efficiently.
  • Collaborated with teams to analyze data requirements and design solutions.
  • Utilized SQL for querying databases and extracting relevant information.
  • Assisted in optimizing existing ETL processes for improved performance.
  • Created data models to support analytics and reporting needs.
  • Participated in code reviews to ensure quality and best practices.
  • Analyzed complex datasets to identify trends and insights.
  • Created ETL scripts to automate manual processes for efficient data loading.
  • Designed and implemented data models to support business requirements.
  • Utilized SQL to query large datasets from multiple sources.
  • Developed automated tests to ensure the quality of the output produced by ETL jobs.
  • Provided technical guidance on database design, development, optimization, security, and scalability.
  • Cleaned and manipulated raw data.
  • Used statistical software to analyze and process large data sets.
  • Analyzed large datasets to identify trends and patterns for stakeholders.
  • Developed Python scripts for extracting data from web service APIs and loading it into databases.

Senior Data Analyst

Cognizant Technology Solutions
Hyderabad, Telangana, India
01.2021 - 07.2022
  • Led data engineering and analytics initiatives for Waymo’s autonomous vehicle program, handling large-scale LiDAR, radar, and camera datasets.
  • Designed ETL pipelines in PySpark and AWS EMR to process sensor data into structured formats for downstream AI/ML model training.
  • Developed real-time streaming pipelines using AWS Kinesis and Lambda to monitor autonomous vehicle events and safety metrics.
  • Partnered with engineering operations teams to establish data validation frameworks, ensuring high accuracy and integrity of autonomous driving datasets.
  • Built data quality dashboards in Power BI and Tableau to track fleet performance, route safety, and object detection efficiency.
  • Implemented data versioning, lineage, and compliance controls to ensure reproducibility and audit readiness for ML experiments.
  • Optimized S3 partitioning with Parquet/ORC formats, reducing storage and query costs by 35%.
  • Collaborated with Google’s ML and computer vision engineers to ensure that processed data seamlessly integrated into deep learning models for object classification.

Data Analyst

Cognizant Technology Solutions
Hyderabad, Telangana, India
06.2018 - 12.2020
  • Collected, cleansed, and prepared autonomous vehicle sensor data (LiDAR, radar, GPS, and camera feeds) for analytics and ML training.
  • Designed SQL-based pipelines to aggregate vehicle telemetry and road event data for the analysis of driving patterns.
  • Performed exploratory data analysis (EDA) to support anomaly detection in vehicle performance and road event logging.
  • Assisted in the creation of simulation datasets to test autonomous vehicle responses under varied driving conditions.
  • Supported engineering operations teams with data reporting, validation scripts, and issue tracking for field vehicle deployments.
  • Automated Python-based validation scripts to flag anomalies in real-time fleet datasets, reducing manual intervention.
  • Collaborated with cross-functional teams across Cognizant and Waymo to support scalable data pipeline enhancements.
  • Generated performance insights and reports for client stakeholders (Waymo) to assist in safety and operational decision-making.

Education

Master of Science - Data Science

New Jersey Institute of Technology
Newark, NJ
12.2023

Bachelor of Science - Mechanical Engineering

Andhra University
Visakhapatnam
05.2018

Skills

  • Data pipeline development
  • SQL optimization and programming
  • ETL design and process
  • Machine learning implementation
  • Big data technologies
  • Data modeling and visualization
  • Database management and structures
  • Query optimization strategies
  • Python programming skills
  • Analytical problem-solving
  • Cloud computing expertise
  • Excel functions proficiency
  • Tableau and Power BI dashboards
  • Google Cloud Platform knowledge
  • AWS and Azure proficiency

Timeline

Data Engineer

Freddie Mac
12.2023 - Current

Data Engineer

State Farm
06.2023 - 12.2023

Senior Data Analyst

Cognizant Technology Solutions
01.2021 - 07.2022

Data Analyst

Cognizant Technology Solutions
06.2018 - 12.2020

Master of Science - Data Science

New Jersey Institute of Technology

Bachelor of Science - Mechanical Engineering

Andhra University