Kalyan K Soma

Data Engineer
Irving, TX

Summary

Skilled Data Engineer with 4.5+ years of experience designing and implementing scalable ETL pipelines using Vertex AI pipelines, AWS Glue, and Spark. Proven expertise in building and optimizing data models and data warehouses, and in ensuring data quality and governance. Proficient in Google Cloud Platform (GCP) and AWS services, with a strong background in managing data infrastructure and enabling collaboration between data teams. Positive, analytical problem-solver with a strong foundation in data systems and processes and a solid understanding of data modeling and database design, coupled with skills in SQL and Python. Capable of driving data-driven decision-making and improving data infrastructure.

Overview

5 years of professional experience
2 years of post-secondary education
3 Certifications

Work History

Data Engineer

Advithri Technologies
Frisco, TX
02.2023 - 12.2024
  • Company Overview: PNC Bank (Client)
  • Designed, developed, and maintained scalable ETL pipelines using Vertex AI pipelines, AWS Glue, and Spark for multi-terabyte datasets
  • Built and optimized data models, databases, and data warehouses, improving storage and retrieval by 30%
  • Collaborated with data analysts and data scientists to align pipeline outputs with analytical needs
  • Implemented data validation and governance processes, ensuring data quality and reliability
  • Deployed and managed data infrastructure on Google Cloud Platform (GCP), leveraging Dataflow for processing
  • Monitored and resolved performance issues using AWS CloudWatch and custom logging solutions
  • Maintained comprehensive documentation for all data systems, workflows, and processes
  • Optimized data processing by implementing efficient ETL pipelines and streamlining database design
  • Collaborated on ETL tasks, maintaining data integrity and verifying pipeline stability
  • Enhanced data quality by performing thorough cleaning, validation, and transformation tasks
  • Led end-to-end implementation of multiple high-impact projects, from requirements gathering through deployment and post-launch support

Software Engineer - Data

Tech Mahindra
Bangalore, Karnataka
06.2019 - 08.2021
  • Company Overview: Nestle (Client)
  • Built scalable ETL pipelines using Databricks, PySpark, and AWS Glue for high-volume datasets
  • Optimized data models and pipelines, reducing latency by 25%
  • Implemented robust data governance measures, ensuring compliance and security
  • Orchestrated workflows using Apache Airflow, ensuring seamless job execution
  • Collaborated with data teams to enable advanced analytics and machine learning workflows
  • Developed scalable and maintainable code, ensuring long-term stability of the software
  • Integrated new technologies into existing systems, increasing capabilities and improving overall performance
  • Implemented effective debugging strategies, resulting in fewer software defects and increased reliability

Junior Data Engineer

Souxe Technologies
Bangalore, Karnataka
06.2018 - 06.2019
  • Developed fault-tolerant ETL pipelines using Hadoop, Hive, and Spark
  • Managed data ingestion and streaming workflows using Kafka and SQS
  • Enhanced data validation and integrity checks, ensuring secure and accurate data processing
  • Participated in code reviews and provided constructive feedback to peers, fostering a culture of continuous improvement within the team
  • Accelerated reporting by automating routine tasks, reducing manual effort and improving overall efficiency

Education

Master of Science - Data Science

University of North Texas, Denton, Texas
01.2021 - 01.2022
GPA: 3.7/4.0

Bachelor of Technology - Electronics and Communication

Gitam University, Visakhapatnam, India
01.2014 - 01.2018
GPA: 3.7/4.0

Skills

ETL development

Certification

Google Cloud Professional Data Engineer Certification, 2024

Timeline

Data Engineer - Advithri Technologies
02.2023 - 12.2024
University of North Texas - Master of Science, Data Science
01.2021 - 01.2022
Software Engineer - Data - Tech Mahindra
06.2019 - 08.2021
Junior Data Engineer - Souxe Technologies
06.2018 - 06.2019
Gitam University - Bachelor of Technology, Electronics and Communication
01.2014 - 01.2018