YAMINI

Farmington, MI

Summary

Results-driven Data Engineer with 5 years of experience in designing, developing, and optimizing ETL pipelines, data warehouses, and real-time data processing systems. Proficient in SQL, Python, Apache Spark, and Airflow, with hands-on experience in cloud platforms (AWS, Azure, GCP) and big data technologies. Skilled in data modeling, query optimization, and automation, ensuring high performance and efficiency in data workflows. Experienced in working with structured and unstructured data, integrating diverse data sources, and implementing data governance and security standards. Adept at collaborating with cross-functional teams, data scientists, and analysts to deliver scalable and reliable data solutions that drive business insights and decision-making. Passionate about leveraging modern data engineering practices to enhance data accessibility, scalability, and efficiency in enterprise environments.

Overview

7 years of professional experience

Work History

Data Engineer

SM tech
Atlanta, GA
05.2023 - Current
  • Designed and developed scalable data pipelines using Apache Spark, Airflow, Python, and SQL to process large datasets efficiently.
  • Built and maintained cloud-based data infrastructure on AWS (Redshift, S3, Glue, Lambda, and EC2) to support enterprise data needs.
  • Implemented real-time data streaming solutions using Apache Kafka, enabling faster insights and improving data availability.
  • Developed ETL frameworks to automate data ingestion, transformation, and storage, reducing manual data processing efforts by 50%.
  • Optimized SQL queries and data models, improving query execution time by 35% and enhancing database performance.
  • Integrated multiple data sources (APIs, databases, flat files) into a centralized data warehouse for business intelligence and analytics.
  • Enhanced data security and compliance by implementing encryption, access control, and data governance policies to meet regulatory standards.
  • Migrated on-premises data systems to cloud platforms (AWS, Snowflake), reducing infrastructure costs by 30% and improving scalability.
  • Developed CI/CD pipelines for data engineering workflows, ensuring seamless deployment and version control of ETL processes.
  • Created monitoring and logging solutions for real-time and batch data pipelines, reducing downtime and improving data reliability.
  • Worked with Data Scientists and Analysts to prepare data for AI/ML models, ensuring high-quality, well-structured datasets.
  • Implemented data partitioning and indexing strategies, leading to a 40% improvement in query performance.
  • Developed RESTful APIs for data access, enabling seamless integration with business applications and reporting tools.
  • Collaborated with cross-functional teams, including product managers and software engineers, to define data strategy and best practices.
  • Conducted root cause analysis on data failures, troubleshooting issues and implementing proactive solutions to minimize disruptions.
  • Automated data validation and quality checks, reducing data errors by 45% and improving reporting accuracy.
  • Built interactive dashboards using Tableau and Power BI, providing real-time insights for business stakeholders.
  • Led training sessions for junior engineers, helping them understand best practices in ETL, data modeling, and cloud technologies.
  • Continuously researched and adopted emerging data technologies, ensuring the organization stays ahead with modern data engineering practices.
  • Improved overall system efficiency and data processing speed, contributing to a faster decision-making process within the company.

Data Engineer

Def Techs
Hyderabad, Telangana
06.2020 - 07.2022
  • Developed & Optimized Data Pipelines: Designed and implemented ETL workflows using Apache Spark, Airflow, SQL, and Python to efficiently process large datasets.
  • Built & Maintained Data Warehouses: Developed data lake and warehouse solutions on AWS (Redshift, S3, Glue), ensuring scalable and cost-effective data storage.
  • Implemented Real-Time Data Streaming: Integrated Apache Kafka for real-time data ingestion and processing, reducing latency in business-critical applications.
  • Optimized SQL Queries & Data Models: Refactored complex SQL queries and improved database indexing and partitioning, reducing query execution time by 30%.
  • Ensured Data Quality & Governance: Implemented data validation, anomaly detection, and governance frameworks, enhancing data integrity and compliance.
  • Automated Data Processing: Developed Python scripts and automated workflows to eliminate manual data handling, reducing operational workload by 40%.
  • Collaborated with Cross-Functional Teams: Worked closely with data analysts, scientists, and business stakeholders to provide data solutions supporting BI dashboards and ML models.
  • Enhanced Operational Efficiency: Streamlined data workflows, leading to improved cost optimization and resource utilization across data infrastructure.

Junior Data Engineer

XYZ Technologies
Hyderabad, Telangana
05.2018 - 05.2020
  • ETL Development: Built and maintained ETL workflows using SQL, Python, and Apache NiFi to ingest data from multiple sources (databases, APIs, flat files).
    Assisted in automating data pipelines to streamline the extraction, transformation, and loading (ETL) processes.
  • Data Pipeline Maintenance & Troubleshooting: Monitored scheduled ETL jobs for failures and performance issues, ensuring minimal downtime.
    Debugged and resolved data pipeline errors by analyzing logs and alerts.
    Collaborated with senior engineers to improve pipeline reliability.
  • SQL & Database Optimization: Wrote and optimized complex SQL queries and stored procedures to transform raw data into structured formats.
    Assisted in database indexing, partitioning, and query tuning to improve query execution time.

Data Warehousing & Storage:

  • Helped in migrating legacy systems to a cloud-based data warehouse (AWS Redshift, Snowflake, or BigQuery).
  • Maintained data models and schemas to support analytics and reporting needs.

Data Visualization & Reporting:

  • Created interactive dashboards using Tableau and Power BI to present insights from processed data.
  • Assisted business teams by generating ad-hoc reports and performing data validation checks.

Collaboration & Learning:

  • Worked with senior data engineers, data analysts, and data scientists to understand business data needs.
  • Participated in code reviews and knowledge-sharing sessions to improve best practices.

Education

Master of Science - Computer And Information Systems

Indiana Institute of Technology
Fort Wayne, IN
07-2024

Bachelor of Science - Computer Science

GITAM UNIVERSITY
INDIA
04-2022

Skills

  • Programming Languages: SAS, R (dplyr, ggplot), Python (including libraries such as NumPy, Pandas, Scikit-learn, TensorFlow, Keras, PyTorch), SQL (MySQL, SQL Server, PostgreSQL)
  • Data Analysis & Visualization: Power BI, Tableau, Google Data Studio, Excel (Pivot Tables, VLOOKUP, VBA, Macros)
  • Database Management & Processing: MongoDB, Snowflake, ETL Processes, Data Warehousing, Data Wrangling, Apache Spark, Hadoop, Google BigQuery
  • Data Science & Analytics: Supervised & Unsupervised Learning (Regression, Classification, Clustering), Deep Learning (CNNs, RNNs), Time-Series Forecasting & Analysis, Natural Language Processing (BERT, Word2Vec, Sentiment Analysis), Feature Engineering, Predictive Modeling & Statistical Analysis, A/B Testing, Web Scraping for Data Extraction
  • Tools & Technologies: JIRA, Git, Docker, Kubernetes, Apache Spark, Visual Studio, SAP
  • Data Engineering: ETL Pipelines, Data Modeling, Data Warehousing
  • Big Data and Cloud: AWS Glue, Google BigQuery, Snowflake, Apache Spark, Hadoop

Timeline

Data Engineer

SM tech
05.2023 - Current

Data Engineer

Def Techs
06.2020 - 07.2022

Junior Data Engineer

XYZ Technologies
05.2018 - 05.2020

Master of Science - Computer And Information Systems

Indiana Institute of Technology

Bachelor of Science - Computer Science

GITAM UNIVERSITY