Yashaswini Gouru

Data Engineer
Bellevue, WA

Summary

Experienced data engineer with expertise in designing, developing, and deploying data solutions across the entire data lifecycle. Collaborative and skilled in delivering valuable data-driven insights for self-service reporting and improved business decision-making. Demonstrated success in leveraging data to drive organizational growth and efficiency, with 5 years of industry experience.

Overview

7 years of professional experience
6 years of post-secondary education

Work History

Senior Data Engineer

SID Global Solution
Bellevue, Washington
09.2021 - Current
  • Collaborated with research scientists to translate project requirements into technical solutions, optimizing R&D processes, and deploying machine learning models within automated CI/CD pipelines using Docker, AWS services, and CDK automation, reducing deployment time by 30% and improving performance by 70%.
  • Architected and implemented scalable, real-time data delivery systems with AWS Kinesis and Apache Flink for low-latency processing of video data and IoT device events, while developing end-to-end ETL pipelines with AWS Glue and Athena to support large-scale data analytics.
  • Led the design and deployment of cloud platforms using AWS EC2, Lambda, Redshift, and SageMaker, addressing security challenges to ensure data integrity, privacy, and secure access control.
  • Collaborated with clients to define project requirements, delivering tailored reporting solutions and optimizing high-performance databases (Aurora MySQL, DynamoDB, MongoDB) and ETL pipelines, reducing operational labor by 40% and improving data accessibility by 70%.
  • Architected scalable data warehouse solutions on AWS Redshift, increasing data processing capacity by 4x and reducing query execution time by 30%, while developing and maintaining robust data pipelines using Apache Airflow, Apache Flink, Glue, and Athena.
  • Implemented database replication strategies for high availability and designed multi-factor authentication to enhance platform security, ensuring 90% uptime with minimal disruptions.
  • Enhanced system performance by designing and implementing scalable data solutions for high-traffic applications.

Consultant

Capgemini
Jersey, NJ
06.2021 - 09.2021
  • Designed and optimized Spark data processing applications using Python and PySpark, improving data workflows and processing efficiency.
  • Orchestrated the construction of 5+ modular data pipelines using Airflow and Databricks, enabling real-time data ingestion for regulatory reporting and achieving a 99.99% data accuracy rate.
  • Acted as a troubleshooter for data-related issues, enhancing operational efficiency and resolving critical data challenges.

Data Engineer

Exafluence
New York, New York
03.2021 - 06.2021
  • Led a data migration project to Snowflake, optimizing ETL processes with Python and PySpark, improving data accuracy by 30% and reducing management time by 70%.
  • Enhanced data processing workflows within AWS EMR, improving performance and processing efficiency for large-scale datasets.
  • Migrated legacy systems to modern big-data technologies, improving performance and scalability while minimizing business disruption.
  • Evaluated various tools, technologies, and best practices for potential adoption in the company's data engineering processes.

Graduate Research Assistant

University of New Haven
West Haven, Connecticut
08.2019 - 12.2020
  • Co-authored an IEEE publication on 'Attribution Modeling for Deep Morphological Neural Networks using Saliency Maps.'
  • Engineered PyTorch scripts for morphological networks, enhancing feature extraction accuracy by 15% in image classification tasks and reducing model training time by 20% with optimized code.
  • Assisted in manuscript preparation, contributing to the publication of influential articles in peer-reviewed journals.

Associate Software Engineer

Accenture
Hyderabad, Telangana
11.2017 - 12.2018
  • Developed Python scripts for automating data operations, reducing manual effort by 20%.
  • Designed APIs and optimized Python code to improve data processing performance by 30%.
  • Collaborated with cross-functional teams to develop, test, and deploy high-quality software solutions for clients.

Education

Master of Science - Data Science

University of New Haven
West Haven, CT
01.2019 - 12.2020

Bachelor of Engineering - Electrical and Electronics Engineering

G. Narayanamma Institute of Technology And Sciences
India
09.2013 - 06.2017

Skills

  • Languages: Python, TypeScript, React, PySpark, Scala

  • Databases: SQL Databases (MySQL, Postgres), NoSQL Databases (DynamoDB, MongoDB, Graph), Data Warehouse (Redshift, Snowflake)

  • Cloud: AWS Cloud services, AWS cloud architecture, CDK

  • Data Science: Machine Learning, Deep Learning, Artificial Intelligence, AI Security, NLP

  • Big Data Tools: Hadoop, Apache Spark, Hive, Apache Flink

  • ETL: QuickSight, Informatica, Tableau

Publications

Attribution Modeling for Deep Morphological Neural Networks using Saliency Maps, Muhammad Aminul Islam, Yashaswini Gouru, Charlie Vea, Derek T. Anderson

Timeline

Senior Data Engineer

SID Global Solution
09.2021 - Current

Consultant

Capgemini
06.2021 - 09.2021

Data Engineer

Exafluence
03.2021 - 06.2021

Graduate Research Assistant

University of New Haven
08.2019 - 12.2020

Master of Science - Data Science

University of New Haven
01.2019 - 12.2020

Associate Software Engineer

Accenture
11.2017 - 12.2018

Bachelor of Engineering - Electrical and Electronics Engineering

G. Narayanamma Institute of Technology And Sciences
09.2013 - 06.2017