Tarun Chitturi

Summary

Data Engineer with 9+ years of experience developing and optimizing data pipelines with Python, PySpark, AWS, and Google Cloud. Strong skills in machine learning model deployment and data governance, improving business outcomes through strategic data integration and automation. Known for collaborating effectively with teams to deliver robust, data-driven solutions that drive operational efficiency and growth.

Overview

9
years of professional experience

Work History

Data Engineer

Better Being
07.2024 - Current
  • Develop and maintain scalable ETL pipelines, optimizing database performance and enforcing data validation protocols to ensure data integrity
  • Design and deploy machine learning models for predictive analytics, with a focus on real-time data processing
  • Lead cross-functional data integration initiatives, resolving data quality issues and automating reporting to improve operational efficiency
  • Establish data governance frameworks and comprehensive process documentation in collaboration with teams
  • Modernize legacy data systems through strategic integration projects

Data Engineer

Tausight
01.2024 - 06.2024
  • Engineered a data pipeline on Google Cloud for the CrowdStrike Marketplace integration, contributing to a $700K increase in annual recurring revenue
  • Optimized ePHI data pipeline efficiency on GCP, halving processing time
  • Designed and implemented telemetry logging systems backed by NoSQL databases to strengthen healthcare data protection
  • Refined data monitoring architecture using Falcon LogScale, enabling precise threat detection and maintaining regulatory compliance

Data Engineer Intern

Tausight
08.2023 - 12.2023
  • Led real-time data transformation using Google Cloud Dataflow, improving healthcare data analysis and strengthening ePHI security protocols
  • Developed robust GCP-based monitoring of ePHI files, enabling proactive anomaly detection and supporting regulatory compliance
  • Partnered with cross-functional teams to integrate monitoring solutions and establish data governance practices across platforms

Data Engineer

AT&T
01.2023 - 08.2023
  • Led a team building real-time data solutions with PySpark and Kafka, optimizing ingestion and processing for large-scale analytics
  • Delivered scalable, cloud-native data engineering solutions on AWS, applying data governance and cost-optimization best practices across enterprise systems
  • Improved streaming architectures to reduce latency and enhance data accessibility
  • Implemented data quality frameworks and monitoring systems to maintain data integrity across distributed platforms
  • Streamlined stakeholder communication to align data engineering initiatives with business objectives

Data Engineer co-op

American Tire Distributors (ATD)
01.2022 - 08.2022
  • Led ETL development with Python, PySpark, and SQL, integrating data from 7 distinct sources into Snowflake
  • Enhanced SQL query performance by 25% during the Oracle and GCP to Snowflake migration
  • Devised an automated validation framework, reducing manual checks by 80% and accelerating the deployment process by 30+ hours per sprint
  • Optimized Snowflake data pipelines, reducing query costs and improving system efficiency
  • Partnered with cross-functional teams on pipeline architecture, data transformation protocols, and documentation standards

Data Engineer

Apollo TeleHealth
02.2016 - 12.2020
  • Led technical design and implementation of data solutions aligned with business objectives
  • Developed ETL pipelines handling large data volumes with AWS Glue and Lambda
  • Streamlined real-time data ingestion and processing using Kafka and AWS Kinesis, reducing latency and enabling real-time analytics for healthcare monitoring
  • Designed microservices-based data pipelines, modernizing legacy systems and improving data accessibility across healthcare platforms
  • Optimized batch and real-time processing jobs, query performance, and data warehouse operations for clinical applications
  • Implemented data governance protocols with cross-functional teams, ensuring compliance with healthcare data regulations

Education

Master of Science - Information Systems

Northeastern University
Boston, MA
12.2022

Bachelor of Technology - Computer Science and Engineering

Indian Institute of Technology
06.2014

Skills

  • Python
  • SQL
  • NoSQL
  • Unix
  • PySpark
  • Snowflake
  • Google Cloud
  • Airflow
  • AWS
  • Alteryx
  • Talend
  • Tableau
  • Power BI
  • MySQL
  • Kafka
  • Git
  • Jira
  • Docker
  • Data Analysis
  • Data Integration
  • Data Modeling
  • ETL Development
  • Data Warehousing
  • Data Pipeline Design
  • Data Migration
  • Big Data Processing
  • Performance Tuning

Timeline

Data Engineer

Better Being
07.2024 - Current

Data Engineer

Tausight
01.2024 - 06.2024

Data Engineer Intern

Tausight
08.2023 - 12.2023

Data Engineer

AT&T
01.2023 - 08.2023

Data Engineer co-op

American Tire Distributors (ATD)
01.2022 - 08.2022

Data Engineer

Apollo TeleHealth
02.2016 - 12.2020

Master of Science - Information Systems

Northeastern University
12.2022

Bachelor of Technology - Computer Science and Engineering

Indian Institute of Technology
06.2014