Vighnesh Kurra

Summary

Highly skilled Data Engineer with 5 years of experience in designing, implementing, and optimizing scalable data pipelines, ETL processes, and distributed data systems. Adept at developing data architectures using cloud platforms like AWS, GCP, and Azure, while implementing advanced real-time and batch data processing solutions. Extensive hands-on experience with data modeling, system design, schema optimization, and API integrations. Proficient in building and maintaining large-scale data pipelines using technologies such as Apache Spark, Kafka, Hadoop, and Elasticsearch. Strong foundation in database management, transactions, indexing, and concurrency control. A dedicated professional with excellent communication skills and a proven track record of supporting data science and machine learning workflows.

Overview

6 years of professional experience
1 Certification

Work History

Senior Data Engineer

Capital One
01.2024 - Current
  • Designed and developed large-scale, distributed data architectures using AWS (S3, Redshift, Glue), processing over 10TB of data daily to support real-time analytics and batch processing.
  • Implemented complex ETL pipelines with Apache Airflow, AWS Glue, and Redshift, automating data ingestion and transformation from multiple sources.
  • Built real-time streaming pipelines using Apache Kafka and AWS Kinesis, reducing data latency by 40% and enabling real-time decision-making across key business units.
  • Developed and optimized data models using dbt, creating highly efficient schemas and improving query performance by 30%.
  • Collaborated with machine learning teams to design and implement data pipelines for training models, utilizing PySpark and Spark MLlib for large-scale predictive analytics.
  • Applied advanced machine learning algorithms (e.g., random forest, logistic regression) for customer churn prediction, enhancing model accuracy by 20%.
  • Managed distributed storage systems such as Elasticsearch and Cassandra, optimizing large-scale, high-throughput applications that handled millions of records per day.
  • Led the creation and execution of the data engineering roadmap, focusing on scalability, performance optimization, and future infrastructure planning.
  • Implemented batch processing workflows using Spark, automating daily data updates and reducing manual intervention by 80%.
  • Designed API integrations to sync data across various CRM platforms, facilitating seamless data exchange between internal systems and external applications.

Data Engineer

Amazon
01.2020 - 07.2022
  • Architected and maintained a high-performance data lake on AWS S3 and Redshift, supporting analytics and reporting for globally distributed teams.
  • Designed and optimized ETL pipelines using AWS Glue, transforming and loading over 50 million records daily to support business intelligence and analytics initiatives.
  • Built real-time and batch data pipelines with Apache Kafka, Spark, and AWS Kinesis, improving data freshness and reducing processing times by 35%.
  • Developed advanced data models in Snowflake and Redshift, improving query performance by 45% and enabling efficient large-scale data retrieval.
  • Created interactive Tableau and Power BI dashboards to visualize business metrics, enhancing decision-making for product and marketing teams.
  • Collaborated on API development and integration for data synchronization across multiple systems, improving cross-team collaboration and efficiency.
  • Handled large-scale distributed storage systems, optimizing Elasticsearch and Cassandra clusters for improved indexing and query performance.

Data Engineer

UnitedHealth Group
09.2018 - 01.2020
  • Designed and developed batch and real-time data processing pipelines using Azure Data Factory, Apache Airflow, and Snowflake, reducing data latency and improving processing efficiency by 50%.
  • Built and optimized ETL processes for integrating data from multiple internal and external sources, ensuring consistency and accuracy across systems.
  • Developed distributed data systems using Hadoop, Spark, and Kafka, processing over 100 million records daily and improving system scalability.
  • Implemented advanced schema designs to support business intelligence and analytics requirements, improving query performance by 30%.
  • Created data ingestion frameworks for APIs and CRM platforms, enabling real-time data access for multiple business units.
  • Collaborated closely with cross-functional teams to build and maintain interactive dashboards using Power BI and Tableau, providing actionable insights for key stakeholders.
  • Designed and implemented a data governance framework, ensuring data quality, security, and regulatory compliance across the organization.

Education

Master of Science - Business Analytics

University of Maryland, College Park
College Park, MD

Bachelor of Science - Electronics and Communication Engineering

Vellore Institute of Technology
Vellore, India

Skills

    Programming Languages: Python, Scala, Java, SQL, Go
    Cloud Platforms: AWS (S3, Redshift, Glue, Lambda), GCP (BigQuery, Dataproc), Azure (Data Factory, Synapse)
    ETL & Orchestration: Apache Airflow, dbt, AWS Glue, SSIS
    Data Warehousing: Snowflake, Redshift, BigQuery, Azure Synapse
    Big Data Technologies: Apache Hadoop, Hive, Spark, HBase, Kafka, Presto, Beam
    CI/CD & DevOps: Jenkins, Docker, Kubernetes, Terraform
    Distributed Storage: Elasticsearch, Cassandra, HDFS, Parquet, Avro
    Database Systems: PostgreSQL, MySQL, SQL Server, Cassandra
    Data Visualization: Tableau, Power BI, Amazon QuickSight
    Data Modeling & Warehousing: Star Schema, Snowflake Schema, Data Lakes
    Streaming Technologies: Apache Kafka, AWS Kinesis
    API Integrations: RESTful APIs, CRM Platforms
    Data Science & ML: PySpark, Scikit-Learn, TensorFlow, Databricks

Certification

  • AWS Solutions Architect Associate
  • Salesforce Administrator
  • Tableau Desktop Specialist
  • Certified Snowflake Professional
