NARSA REDDY ALLURI

Summary

Results-driven Data Engineer with 3+ years’ experience building scalable, secure, and efficient data pipelines across healthcare and financial sectors. Adept at using Python, PySpark, advanced SQL (complex joins, window functions, query optimization), and big data ecosystems to deliver high-quality, analytics-ready datasets. Skilled in cloud-native architectures (AWS, Azure), modern orchestration tools (Airflow, dbt), and data quality automation. Known for optimizing costs, improving data reliability, and collaborating effectively with cross-functional teams to translate complex requirements into impactful solutions.

Overview

4 years of professional experience
1 Certification

Work History

Data Engineer

The Cigna Group
12.2024 - 07.2025
  • Engineered scalable, test-driven ingestion pipelines integrating AWS S3 with Apache Spark, storing healthcare claims in Parquet and reducing storage costs by 25%.
  • Automated data validation (null handling, schema checks, referential integrity) using PySpark/SQL, cutting load errors by 18%.
  • Developed incremental dbt models in Snowflake with version control and automated testing for reliable analytics workflows.
  • Orchestrated ETL processes with AWS Glue and event-driven AWS Lambda triggers.
  • Enforced HIPAA-aligned data security via IAM roles and KMS encryption.
  • Created optimized Athena external tables for direct S3 querying, lowering warehousing expenses by 15%.
  • Built real-time CDC ingestion from Oracle/PostgreSQL to AWS S3 and Snowflake, incorporating schema-change resilience.
  • Scheduled Spark jobs in Apache Airflow with retries, alerts, and quality gates for operational resilience.
  • Partnered with clinical analytics teams to design optimized SQL transformations, improving reporting delivery times by 20%.

Data Engineer

Citi Bank
05.2021 - 08.2023
  • Designed secure, validated financial data pipelines in Azure Data Factory, ingesting from Oracle into Azure Data Lake with encryption and audit-ready validation.
  • Delivered optimized ETL workflows in Azure Databricks using advanced SQL queries (window functions, nested CTEs, and performance tuning).
  • Built PySpark transformations converting raw data into curated datasets, integrating unit and integration testing for reliability.
  • Managed credentials via IAM and Azure Key Vault, strengthening platform security.
  • Produced optimized external tables in Azure Synapse powering high-performance Power BI dashboards.
  • Led ETL migration from Informatica to Azure Data Factory, reducing defects by 30% and increasing maintainability.
  • Worked closely with compliance teams to clearly document pipeline logic, ensuring audit compliance without delays.

Education

Master of Science - Technology Management

Lindsey Wilson University
KY, USA
05.2025

Bachelor of Science - Statistics & Computer Science

Keshav Memorial Institute of Sciences
Hyderabad, India
05.2022

Skills

  • Programming & Scripting: Python, PySpark, SQL (Advanced), Shell Scripting
  • Big Data & Processing: Apache Spark, Hadoop, Hive
  • Cloud Platforms: AWS (S3, EMR, Glue, Redshift), Azure (Data Factory, Databricks, Synapse)
  • Orchestration & Integration: Apache Airflow, dbt, AWS Glue, Azure Data Factory
  • Data Quality & Testing: Great Expectations, PyTest, custom SQL/Python unit tests
  • Databases & Storage: Oracle, PostgreSQL, Snowflake, Redshift
  • Streaming & Messaging: Apache Kafka
  • DevOps & Containers: Docker, Kubernetes
  • Reporting & Monitoring: Power BI, Tableau, Grafana, AWS CloudWatch

Certification

  • Google Cloud Professional Data Engineer
  • Python for Data Science - Learnbay
  • Machine Learning - Learnbay
