Sumanth Marri

Data Engineer
Springfield, IL

Summary

Results-driven data engineering professional with a solid foundation in designing and maintaining scalable data systems. Expertise in developing efficient ETL processes and ensuring data accuracy, contributing to impactful business insights. Known for strong collaborative skills and the ability to adapt to dynamic project requirements, delivering reliable and timely solutions. Robust background in data architecture and pipeline development, with a proven ability to streamline data processes and enhance data integrity. Advanced proficiency in SQL and Python, applied to support cross-functional teams and data-driven decision-making.

Skilled in building high-performance ETL workflows using Apache Airflow, Databricks, and Informatica, with proven success in reducing data processing times by up to 45% in enterprise environments. Adept in cloud data engineering on AWS and Azure platforms, including Redshift, Glue, S3, Data Factory, and Synapse, enhancing scalability, cost efficiency, and data accessibility for mission-critical projects. Proficient in modern Big Data ecosystems leveraging Spark, Hadoop, Hive, and Snowflake, enabling the delivery of real-time insights, data warehousing solutions, and large-scale analytics initiatives. Experienced in designing CI/CD pipelines using GitHub Actions, Jenkins, Terraform, and Kubernetes to automate deployments, ensuring high availability and reliability of complex data infrastructures. Highly collaborative professional with strong Agile and DevOps experience, driving cross-functional initiatives, improving operational efficiency, and supporting strategic decision-making through advanced data visualizations and predictive modeling.



Overview

4 years of professional experience

Work History

Data Engineer

MetLife
09.2023 - Current
  • Developed automated ETL and ELT workflows using Apache Airflow, Databricks, and AWS Glue, decreasing insurance claim data processing time by 45% across 8M+ daily transactions.
  • Built event-driven ingestion pipelines leveraging AWS Kinesis and Redshift, increasing real-time data availability by 38% for risk scoring and underwriting teams.
  • Led Snowflake warehouse optimization initiative, reducing query runtimes by 52% and enabling sub-second financial reporting for executive dashboards.
  • Engineered CI/CD pipelines with GitHub Actions and Jenkins for ETL releases, achieving 99.9% deployment success rates and cutting manual interventions by 70%.
  • Migrated 12+ on-premises Oracle and SQL Server datasets to AWS S3 and PostgreSQL, achieving 40% faster data retrieval speeds and 60% storage scalability gains.
  • Collaborated with data scientists to operationalize ML models built with PyTorch and TensorFlow, reducing insurance fraud risk exposure by 22% year-over-year.
  • Developed Tableau dashboards to visualize investment performance, helping leadership increase timely investment decisions by 18% across diversified portfolios.
  • Implemented automated data quality validation with Talend and Kafka Streams, slashing ETL error rates by 48% and strengthening compliance with internal audit standards.

Data Engineer

CitiusTech
01.2020 - 07.2022
  • Designed and scaled ETL pipelines using Apache Spark and Informatica, reducing healthcare data aggregation cycle times by 33% across clinical and operational systems.
  • Integrated IBM Db2 and Azure SQL Data Warehouse for hybrid cloud analytics, accelerating claims reporting by 40% and improving SLA adherence to 99.5%.
  • Orchestrated real-time NoSQL database pipelines using Cassandra and DynamoDB, achieving 25% faster access to EMR and EHR systems for client hospitals.
  • Modernized ETL architecture through Azure Data Factory and Databricks, automating 90% of workflows and reducing pipeline maintenance overhead by 35%.
  • Developed AWS QuickSight dashboards to provide real-time infrastructure monitoring, enabling 28% faster remediation of critical system outages.
  • Migrated batch-driven Hadoop workloads to a streaming-first architecture using Kafka and Hive, reducing batch window execution times by 42%.
  • Deployed CI/CD pipelines with Terraform, Jenkins, and Kubernetes, achieving 97% automated deployment coverage for production and staging data environments.
  • Collaborated in Agile DevOps squads, increasing sprint velocity by 20% and improving stakeholder satisfaction with real-time delivery of healthcare analytics products.

Education

Master's - Management Information Systems

University of Illinois at Springfield
12.2023

Bachelor's - Mechanical Engineering

Guru Nanak Institute of Technical Campus
11.2020

Skills

  • Python
  • SQL
  • Scala
  • PySpark
  • PL/SQL
  • Spark SQL
  • HiveQL
  • DDL
  • AWS
  • Azure
  • GCP
  • Hadoop
  • Apache Spark
  • Hive
  • Kafka
  • HBase
  • Flink
  • Pig
  • Presto
  • Apache Airflow
  • dbt
  • Informatica
  • Talend
  • Databricks
  • AWS Glue
  • Azure Data Factory
  • Snowflake
  • Amazon Redshift
  • BigQuery
  • Azure SQL Data Warehouse
  • IBM Db2
  • Tableau
  • Power BI
  • AWS QuickSight
  • Looker
  • GitHub Actions
  • Jenkins
  • Terraform
  • Docker
  • Kubernetes