
SRIYA RAO AILINENI

Plano, TX

Summary

Data Engineer / Analyst with over 4 years of hands-on experience developing and deploying end-to-end data solutions across the banking, automotive, and industrial domains. Adept in Java, Python, Scala, and SQL, with proven expertise in building cloud-native applications on AWS, Azure, and GCP. Proficient in designing data pipelines, ETL/ELT workflows, real-time streaming solutions (Kafka, Spark Streaming), and distributed systems (Hadoop, Hive, MapReduce). Experienced with NoSQL databases (MongoDB, Cassandra) and cloud data warehouses (Snowflake, Redshift). Strong Agile team player with mentoring experience, solid unit-testing practices, and a drive to stay current with technology trends.

Overview

4 years of professional experience

Work History

Data Engineer

Citibank
TX
03.2023 - 05.2025
  • Collaborated with cross-functional Agile teams to deliver scalable full-stack data solutions.
  • Led initiatives to refactor monolithic data jobs into modular microservices architecture using Python and PySpark.
  • Built and optimized distributed data processing pipelines on Azure Databricks with Spark and Delta Lake.
  • Designed and deployed streaming applications using Kafka and Spark Streaming to process real-time events.
  • Managed data flow into Snowflake and Redshift for downstream reporting and analytics.
  • Integrated machine learning models into production pipelines using AWS SageMaker.
  • Developed unit/integration tests using PyTest and participated in rigorous peer code reviews.
  • Maintained KPI dashboards using Power BI for business insights on customer satisfaction (CSAT) and website metrics.
  • Supported cloud infrastructure using Azure services (Data Factory, Blob Storage, ADLS) and AWS S3.
  • Technologies: Python, Scala, PySpark, Kafka, Hive, Presto, Snowflake, Redshift, Azure, Databricks, ADF, S3, Power BI
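As an illustration of the unit-testing practice noted above, a minimal PyTest-style check might look like the following. The `normalize_event` helper and its field names are invented for this sketch, not taken from any actual codebase:

```python
# Minimal PyTest-style unit tests for a hypothetical pipeline helper.
# `normalize_event` and its fields are illustrative only.

def normalize_event(raw: dict) -> dict:
    """Lower-case keys and strip whitespace from string values."""
    return {
        k.lower(): v.strip() if isinstance(v, str) else v
        for k, v in raw.items()
    }

# PyTest discovers functions named test_*; a plain `assert` is the whole API.
def test_keys_are_lowercased():
    assert normalize_event({"UserId": 7}) == {"userid": 7}

def test_string_values_are_stripped():
    assert normalize_event({"name": "  ada "}) == {"name": "ada"}
```

Such tests run under `pytest` with no extra boilerplate, which keeps coverage cheap to add during code review.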

Data Engineer

Chase Bank
TX
02.2022 - 02.2023
  • Collaborated with Agile squads to build end-to-end data pipelines using Databricks, PySpark, and Scala, supporting enterprise-level reporting and advanced analytics use cases.
  • Designed and implemented real-time streaming applications using Apache Kafka to ingest and process vehicle telemetry data for predictive maintenance.
  • Refactored batch ETL into event-driven microservices using AWS Lambda and S3 triggers, enhancing processing efficiency by 50%.
  • Applied Spark performance tuning best practices such as broadcast joins, caching strategies, and partitioning to improve large dataset operations.
  • Integrated structured, semi-structured, and unstructured data from various sources into a centralized Snowflake warehouse, enabling business insights in real time.
  • Developed automated data validation frameworks with Python and PySpark, ensuring data quality and consistency in production pipelines.
  • Partnered with business stakeholders to gather requirements and build customized BI dashboards in Power BI and Tableau.
  • Utilized Azure Data Lake, Azure Blob Storage, and Azure SQL to build scalable cloud-based data repositories.
  • Implemented data lineage and metadata tracking using Collibra and integrated it with Azure Purview.
  • Supported and maintained CI/CD pipelines for code deployment using Azure DevOps and GitHub, enabling faster release cycles.
  • Created user guides and technical documentation to train new team members and business analysts.
  • Participated in cross-team architectural reviews, helping standardize reusable components for enterprise-wide data solutions.
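The automated data-validation framework mentioned above can be sketched in plain Python. The rule names, column names, and sample rows here are illustrative; in the pipelines described, the same pattern would run over PySpark DataFrames:

```python
# Sketch of a rule-based data-validation pass; rules and columns are
# invented for illustration. A production version would target PySpark.

def validate_rows(rows, rules):
    """Return (valid_rows, errors) after applying per-row rules.

    `errors` pairs each failing row index with the names of failed rules.
    """
    valid, errors = [], []
    for i, row in enumerate(rows):
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            errors.append((i, failed))
        else:
            valid.append(row)
    return valid, errors

rules = {
    "amount_positive": lambda r: r.get("amount", 0) > 0,
    "currency_known": lambda r: r.get("currency") in {"USD", "EUR"},
}

rows = [
    {"amount": 10.0, "currency": "USD"},
    {"amount": -3.0, "currency": "USD"},
    {"amount": 5.0, "currency": "JPY"},
]
valid, errors = validate_rows(rows, rules)
```

Keeping each rule as an independent named predicate makes failure reports readable and lets teams add checks without touching the framework itself.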

Data Engineer

Honeywell Corporation
01.2021 - 12.2021
  • Developed scalable ETL workflows using Hadoop, Hive, and MapReduce to process large volumes of sensor and manufacturing data.
  • Designed and implemented a data lake architecture using AWS S3 and Glue Catalog, enabling downstream analytics for product engineering teams.
  • Led the integration of machine learning models with real-time decision systems using SageMaker endpoints and Lambda functions.
  • Conducted hyperparameter tuning and used grid search and cross-validation techniques to improve model accuracy by 18%.
  • Conducted data pre-processing, feature selection, and feature extraction using Python (Pandas, Scikit-learn), resulting in improved ML model performance.
  • Built monitoring dashboards using Power BI to visualize the output of deployed ML models and detect data drifts.
  • Created REST APIs in Python (Flask) to expose ML prediction services for consumption by other applications.
  • Established automated alerts for pipeline failures using CloudWatch and SNS notifications, improving system reliability and response times.
  • Led weekly cross-team meetings to report project status, clarify requirements, and present analytical findings to senior management.
  • Created data cataloging and tagging policies to support the adoption of enterprise data governance frameworks.
  • Mentored junior team members on big data technologies, machine learning workflows, and data engineering best practices.
  • Implemented unit testing frameworks using PyTest and integrated test coverage tools to ensure robust code quality.
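The hyperparameter-tuning work above (grid search with cross-validation) can be sketched with a stdlib-only toy example. The ridge-style one-parameter model and the data are invented for illustration; the actual tuning described would have used Scikit-learn:

```python
# Stdlib-only sketch of hyperparameter grid search with k-fold
# cross-validation. The toy ridge-style model and data are illustrative.

def k_folds(n, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    size = n // k
    for i in range(k):
        test = list(range(i * size, (i + 1) * size))
        train = [j for j in range(n) if j not in test]
        yield train, test

def fit_slope(xs, ys, lam):
    """Closed-form ridge fit for y = slope * x with penalty lam."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_mse(xs, ys, lam, k=4):
    """Mean held-out squared error of the ridge fit across k folds."""
    scores = []
    for train, test in k_folds(len(xs), k):
        slope = fit_slope([xs[i] for i in train], [ys[i] for i in train], lam)
        err = sum((ys[i] - slope * xs[i]) ** 2 for i in test) / len(test)
        scores.append(err)
    return sum(scores) / len(scores)

def grid_search(xs, ys, grid):
    """Return the penalty value with the lowest cross-validated MSE."""
    return min(grid, key=lambda lam: cv_mse(xs, ys, lam))

xs = [float(i) for i in range(8)]
ys = [2.0 * x for x in xs]              # noiseless data: true slope is 2
best_lam = grid_search(xs, ys, grid=[0.0, 1.0, 10.0])
```

Scoring each candidate on held-out folds rather than the training data is what keeps grid search from simply rewarding overfitting.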

Education

Master's - Information Science

Trine University
01.2023

Bachelor's - Computer Engineering

SR University
01.2021

Skills

  • Programming languages: Python, Java, Scala, SQL, PySpark, Shell scripting, JavaScript (basics), VBA
  • Cloud platforms: AWS (S3, Lambda, EMR, Redshift, SageMaker, Step Functions), Azure (Data Factory, Blob, ADLS, Databricks), GCP (BigQuery, DataProc, Dataflow, Pub/Sub)
  • Big data technologies: Apache Spark, Kafka, Hadoop, Hive, MapReduce, EMR
  • Data warehousing solutions: Snowflake, Amazon Redshift, Azure Synapse, PostgreSQL
  • Database management: MongoDB, Cassandra, MySQL, Oracle
  • ETL tools: Apache Airflow, Azure Data Factory
  • Business intelligence tools: Power BI, Tableau
  • Machine learning techniques: Regression (linear and logistic), decision trees
  • DevOps practices: Git and GitHub Actions
  • Agile methodologies: Scrum and Kanban
  • Development environments: Jupyter Notebook and VS Code
  • Operating systems: Windows and Linux
  • Documentation tools: MS Office Suite and Microsoft Project
