Sumasri Chitturi

Dallas

Summary

Data Engineer with around 6 years of experience working across telecom, SaaS, and finance domains. Skilled in building ETL pipelines, handling large datasets, and developing real-time data streaming solutions using tools like Apache Spark, Kafka, Databricks, Azure Data Factory, and AWS Glue. Proficient in Python and SQL for data transformation, automation, and analysis, with hands-on experience in cloud platforms such as Azure and AWS.

Experienced in designing data models, creating dashboards with Power BI and Tableau, and supporting business teams with analytics-ready data. Recently involved in building Generative AI applications using LangChain, LangGraph, and vector databases like FAISS and Pinecone. Focused on delivering reliable, scalable, and high-quality data solutions in Agile environments.

Overview

5 years of professional experience
1 Certification

Work History

Data Engineer

Frontier Communications
09.2024 - Current
  • Architected and implemented robust, scalable ETL frameworks leveraging Azure Data Factory and Apache Airflow to process petabytes of telecom data daily, ensuring 99.9% pipeline reliability and minimal latency for downstream analytics.
  • Led the design and optimization of complex, high-throughput data models on Azure SQL and Synapse Analytics, achieving a 40% reduction in query runtimes and enabling real-time telecom usage insights.
  • Spearheaded the migration of legacy on-premises data workflows to a cloud-native data platform, driving a 50% improvement in scalability and reducing infrastructure costs by 30%.
  • Developed advanced data quality frameworks integrating automated anomaly detection and root cause analysis, significantly improving data trustworthiness and operational decision-making.
  • Collaborated cross-functionally with data science and business intelligence teams to architect and deliver production-ready datasets for predictive modeling, including churn analysis and network fault prediction, impacting revenue retention strategies.
  • Mentored junior engineers in best practices for data pipeline development, SQL performance tuning, and cloud data solutions, fostering a culture of excellence and continuous learning.
  • Applied advanced statistical methods and hypothesis testing to validate telecom KPIs, contributing to strategic improvements in network reliability and customer experience.

Data Engineer

Verzeo
05.2023 - 08.2024
  • Supported the design, development, and maintenance of end-to-end ETL pipelines using Azure Data Factory and Python to integrate multi-source customer engagement and transaction data.
  • Assisted in building and optimizing relational and star schema data models on Azure SQL Server, improving query efficiency and supporting scalable analytics solutions.
  • Developed and maintained automated data quality frameworks using Python scripts and Power BI to monitor data accuracy, reducing data errors by 15%.
  • Wrote complex SQL queries, stored procedures, and views to extract actionable insights from large datasets, reducing report generation time by 25%.
  • Created interactive Power BI dashboards for marketing, sales, and customer success teams to track KPIs and measure campaign effectiveness.
  • Collaborated with senior engineers to migrate legacy batch processing workflows to Azure Synapse Analytics, enhancing data processing speed and cloud readiness.
  • Assisted in implementing data lineage and metadata management to improve data governance and compliance.
  • Participated in root cause analysis for data discrepancies and implemented fixes to ensure consistent and reliable data flow.
  • Engaged in regular code reviews, documentation, and knowledge-sharing sessions to align with team best practices and standards.
  • Applied basic statistical analysis and hypothesis testing for validating customer segmentation strategies and marketing A/B experiments.
  • Contributed to performance tuning of SQL queries and ETL jobs, supporting growing data volumes and evolving business requirements.
  • Coordinated with cross-functional teams including data scientists, analysts, and product managers to understand data needs and deliver timely solutions.

Data Analyst

Mindtree
01.2020 - 12.2022
  • Played a key role in data analysis and migration for a large-scale AEM portal migration project, ensuring seamless transition of business-critical content and metadata with minimal downtime.
  • Performed in-depth data profiling, cleansing, and transformation using SQL and Azure Data Studio, supporting the migration of legacy systems to a centralized cloud-based infrastructure.
  • Designed and implemented Power BI dashboards to visualize content usage metrics, migration progress, and system performance, enabling real-time stakeholder reporting.
  • Utilized Azure SQL Database and Azure Blob Storage for managing structured and unstructured data involved in portal integration.
  • Developed and maintained shell scripts for automating data extraction, transformation, and validation tasks across staging and production environments.
  • Worked closely with cross-functional teams including developers, AEM architects, and DevOps teams to align data mapping, API integration, and user experience requirements.
  • Ensured data integrity and compliance by building rule-based validation scripts and performing reconciliation checks across old and new portal environments.
  • Conducted impact analysis and built comparative reports to verify content consistency post-migration, reducing manual QA efforts by 40%.
  • Participated in documentation of data pipelines, business rules, and transformation logic for future scalability and audit readiness.
  • Gained hands-on experience with AEM (Adobe Experience Manager) for content structure understanding, metadata modeling, and integration with backend systems.
  • Contributed to task planning, delivery timelines, and post-migration support cycles, ensuring stakeholder satisfaction and successful project closure.

Education

Master of Professional Studies - Data Science

University of Maryland Baltimore County
Baltimore, Maryland

Skills

  • Programming Languages: Python, JavaScript, Shell scripting
  • Tools: Apache Airflow, Apache Kafka, Apache Spark, Databricks, Power BI, Tableau
  • Databases: SQL, T-SQL, MongoDB, Elasticsearch, SSIS
  • Big Data: Hadoop, Zookeeper, Sqoop, Hive
  • Machine Learning & Statistics: Regression, Clustering, Time-Series Forecasting, Exploratory Data Analysis, GBDT
  • DevOps Tools: Jenkins, Docker, Kubernetes, GitHub, Azure DevOps, CloudBees, dbt (data build tool)
  • CI/CD: Azure DevOps, CloudBees, Jenkins
  • AWS Services: EC2, VPC, S3, RDS, Lambda, Redshift, Kinesis, QuickSight, ECS, EKS, Athena, Glue, Route 53, CloudFormation, SageMaker, Lake Formation
  • Azure Services: AKS, Synapse Analytics, Azure Data Lake, Azure Blob Storage, Azure Key Vault, Data Factory, Microsoft Entra, Event Hubs, Stream Analytics, Autoscale, VM, Container Instances, Data Studio, SSIS
  • Gen AI Tools: LangChain, LangGraph, LangServe, Groq
  • Vector Databases: FAISS, Chroma
  • LLM Models
  • Prompt Engineering

Certification

  • Introduction to Transformer Based NLP : NVIDIA
  • Building Real-Time Video AI Applications : NVIDIA
  • Accelerating End-to-End Data Science Workflows : NVIDIA
  • Project: Generative AI Applications with RAG and LangChain : IBM
  • IoT using Raspberry Pi : Embedded Solutions
  • Intermediate Python Certification : IBM
