Shiva Reddy Anugu

Cameron Park, CA

Summary

AWS Data Engineer with 5+ years of experience in designing and implementing scalable cloud data solutions for healthcare, insurance, and retail sectors. Proficient in building end-to-end data pipelines using AWS Glue, EMR, Lambda, and Kinesis, ensuring efficient processing of large datasets. Demonstrated success in architecting data lakes on S3 integrated with Redshift and Snowflake, optimizing ETL/ELT workflows with Apache Spark and PySpark. AWS Certified Data Analytics Specialist with a strong focus on delivering data solutions that enhance business intelligence and support strategic decision-making.

Overview

6 years of professional experience

Work History

Data Engineer

Molina Healthcare
Long Beach, California
05.2024 - Current
  • Developed data warehouse for health maintenance organization, integrating IDX and MUMPS data with medical datasets.
  • Built and maintained AWS data pipelines using Glue, Kinesis, and S3 for healthcare data ingestion and processing.
  • Created real-time streaming solutions with Apache Spark and Kinesis to monitor patient data and operational metrics.
  • Collaborated with business experts to gather requirements for technical designs in data integration and reporting.
  • Implemented data quality checks and validated healthcare datasets through SQL queries.
  • Migrated large datasets from on-premises systems to AWS, ensuring integrity and compliance.
  • Applied machine learning algorithms in Python to predict user order quantities for automated suggestions using Kinesis Firehose and S3 data lake.

Data Engineer

Infosys Ltd.
Hyderabad, Telangana
02.2022 - 08.2023
  • Constructed scalable data pipelines with AWS Glue and Lambda, processing over 10TB of call records daily.
  • Engineered a data lake on S3 using Athena and Redshift to consolidate customer usage data.
  • Developed ETL workflows via Apache Spark on EMR, transforming raw telecom data into actionable insights.
  • Implemented real-time streaming solutions with AWS Kinesis to monitor network traffic and detect disruptions.
  • Established automated data quality checks and monitoring dashboards with CloudWatch and QuickSight, ensuring billing accuracy.

Data Analyst

Bitronics Pvt Ltd
Hyderabad, Telangana
05.2020 - 12.2021
  • Analyzed claims data from over 50,000 policies using SQL and Excel to identify fraudulent patterns, helping the fraud investigation team recover $3.2 million and reduce false claims by 28%.
  • Built Power BI dashboards tracking policy renewal rates, customer demographics, and risk profiles, providing insights that improved underwriting decisions and increased policy retention by 15%.
  • Conducted predictive analysis on customer behavior and claims history using Python to segment high-risk and low-risk policyholders, enabling more accurate premium pricing and reducing loss ratios by 12%.
  • Collaborated with actuarial and underwriting teams to analyze mortality trends and medical claims data, supporting the development of new health insurance products that generated $5M in first-year revenue.
  • Automated monthly reporting processes for regulatory compliance (NAIC, state filings) using Python scripts, reducing manual work by 25 hours per month and ensuring 100% on-time submissions.

Education

Master of Science - Computer Information Technology

Elmhurst University
Chicago, IL
05.2025

Skills

  • Big Data Technologies: Hadoop, MapReduce, HDFS, Hive, HBase, Kafka, ZooKeeper, YARN, Apache Spark
  • Databases: Oracle, MySQL, SQL Server, MongoDB, Cassandra, DynamoDB, PostgreSQL, Teradata, Cosmos DB
  • Programming: Python, PySpark, Scala, Java, C, Shell script, Perl script, SQL
  • Cloud Technologies: AWS, Microsoft Azure
  • Frameworks: Django REST framework
  • Tools: PyCharm, Eclipse, Visual Studio, SQL Developer, TOAD, SQL Query Analyzer, SQL Server Management Studio, SQL Assistant
  • Versioning tools: Git, GitHub
  • Operating systems: Windows 7, 8, XP, 2008, 2012, Ubuntu Linux, macOS
  • Database modeling: dimensional modeling, ER modeling, star schema modeling, snowflake schema modeling
  • Monitoring tool: Apache Airflow
  • Visualization/reporting: Tableau, ggplot2, matplotlib, SSRS, and Power BI
  • Machine learning techniques: linear and logistic regression, classification and regression trees, random forest, association rules, NLP, and clustering

Timeline

Data Engineer

Molina Healthcare
05.2024 - Current

Data Engineer

Infosys Ltd.
02.2022 - 08.2023

Data Analyst

Bitronics Pvt Ltd
05.2020 - 12.2021

Master of Science - Computer Information Technology

Elmhurst University