
YASHWANTH CHOWDARY KUNTA

San Antonio, TX

Summary

  • 10+ years of experience designing, building, and maintaining scalable data pipelines to support data integration, ETL processes, and data warehousing.
  • Developed and optimized SQL queries to extract, transform, and load data from various sources into data warehouses.
  • Experience in Data engineering encompassing Requirements Analysis, Design Specification, and Testing in both Waterfall and Agile methodologies.
  • Experienced Data Engineer with expertise in designing and optimizing data pipelines using Microsoft Fabric. Proficient in integrating and managing large-scale datasets across hybrid environments. Skilled in building real-time data solutions, leveraging cloud-based architectures, and ensuring high-performance data processing with Fabric’s robust analytics and storage capabilities.
  • Fluent programming experience with Scala, Java, Python, SQL, and T-SQL; hands-on experience developing and deploying enterprise applications using major Hadoop ecosystem components such as MapReduce, YARN, Hive, and HBase.
  • Experience in extracting, transforming, and loading (ETL) data from various sources into data warehouses, as well as collecting, aggregating, and moving data from various sources using Apache Flume, Kafka, Power BI, and Microsoft SSIS.
  • Proficient in designing and managing REST APIs and integrations using the MuleSoft Anypoint Platform, with hands-on experience deploying applications across various environments.
  • Hands-on experience with Hadoop architecture and various components such as the Hadoop Distributed File System (HDFS), JobTracker, TaskTracker, NameNode, DataNode, and Hadoop MapReduce programming.
  • Experience working with the Azure Logic Apps integration tool and with data warehouses such as Oracle and SAP HANA.
  • Experience building ETL data pipelines in Azure Databricks leveraging PySpark and Spark SQL, and building orchestration in Azure Data Factory for scheduling and batch processing.
  • Hands-on experience in Azure Analytics Services - Azure Data Lake Store (ADLS), Azure Data Lake Analytics (ADLA), Azure SQL DW, Azure Data Factory (ADF), Azure Databricks (ADB), etc.
  • Orchestrated data integration and CI/CD pipelines in ADF using various activities such as Get Metadata, Lookup, ForEach, Wait, Execute Pipeline, Set Variable, and Filter, with programming experience in Python and Scala.
  • Experience working in cross-functional Agile Scrum teams, with good knowledge of PolyBase external tables in SQL DW; also involved in production support activities.
  • Hands-on experience with Amazon EC2, Amazon S3, Amazon RDS, VPC, IAM, Amazon Elastic Load Balancing, Auto Scaling, CloudWatch, SNS, SES, SQS, AWS Lambda, EMR and other services of the AWS family.
  • Installed, created, and maintained CI/CD (continuous integration and deployment) pipelines, applied automation to environments and applications, and worked with various automation tools such as Git, Terraform, and Ansible.
  • Developed web-based applications using Python, Django, Qt, C++, XML, CSS3, HTML5, DHTML, JavaScript, and jQuery.
  • Proficient at developing sophisticated MapReduce systems that operate on a variety of file types, including Text, Sequence, XML, and JSON. Designed, built, and managed ELT data pipelines leveraging Airflow, Python, and GCP solutions.
  • Experienced with JSON-based RESTful web services and XML/QML-based SOAP web services; also worked on various applications using Python-integrated IDEs such as Sublime Text and PyCharm.

Overview

11 years of professional experience

Work history

Azure Data Engineer

Christus Health
San Antonio, USA
01.2024 - 03.2025
  • Company Overview: Christus Health is a not-for-profit healthcare system based in Irving, Texas, providing high-quality medical services across the U.S., Mexico, and South America
  • Analyzed and developed a modern data solution with Azure PaaS service to enable data visualization
  • Understood the application's current Production state and the impact of new installation on existing business processes
  • Implemented Azure Data Factory (ADF) extensively for ingesting data from different source systems like relational and unstructured data to meet business functional requirements
  • Integrated structured and unstructured data from various sources into Microsoft Fabric, facilitating seamless data flow and real-time analytics
  • Designed and implemented data population processes for cloud-based databases, maintaining structured data models and ensuring the availability of clean, accurate data across AWS RDS, Google Cloud SQL, and Azure platforms
  • Used Kafka as a messaging system to implement real-time streaming solutions with Spark Streaming
  • Worked on Big Data integration and analytics based on Hadoop, SOLR, PySpark, Kafka, Storm, and webMethods
  • Involved in requirement gathering, business analysis, and technical design for Hadoop and Big Data projects
  • Developed Databricks ETL pipelines using notebooks, Spark DataFrames, Spark SQL, and Python scripting
  • Managed and optimized cloud databases (e.g., AWS RDS, Google Cloud SQL) to support scalable data pipelines, ensuring efficient and cost-effective data operations
  • Developed data pipeline programs with Spark Scala APIs, performed data aggregations with Hive, and formatted data (JSON) for visualization
  • Designed and implemented Infrastructure as code using Terraform, enabling automated provisioning and scaling of cloud resources on Azure
  • Involved in various phases of Software Development Lifecycle (SDLC) of the application, like gathering requirements, design, development, deployment, and analysis of the application
  • Managed large datasets using Pandas DataFrames and SQL
  • Documented data migration procedures, scripts, and data handling processes to provide a reference for troubleshooting and future developments
  • Implemented Synapse integration with Azure Databricks notebooks, which reduced development work by about half, and achieved a performance improvement on Synapse loading by implementing a dynamic partition switch
  • Implemented Continuous Integration Continuous Delivery (CI/CD) for end-to-end automation of release pipeline using DevOps tools like Jenkins
  • Environment: Jenkins, CI/CD, DevOps, Azure Databricks, Synapse Integration, T-SQL scripting, Pandas, Terraform, Spark Scala APIs, Hive, Tableau, Spark SQL, Python scripting, Hadoop, Big Data, Snowflake

AWS Data Engineer

Goldman Sachs
Bengaluru, INDIA
08.2020 - 08.2023
  • Company Overview: Goldman Sachs is a leading global investment banking, securities, and asset management firm. With expertise in financial advisory, trading, and wealth management, the firm serves corporations, governments, institutions, and individuals worldwide
  • Developed, deployed, and managed scalable data pipelines using AWS services such as AWS Glue, AWS Lambda, and Amazon Kinesis to handle data ingestion, transformation, and loading
  • Utilized Amazon S3 to build and manage data lakes for storing structured and unstructured data, providing scalable and secure data storage
  • Integrated AWS DynamoDB using AWS Lambda to store the values of items and backup the DynamoDB streams
  • Responsible for loading data from the BDW Oracle database and Teradata into HDFS using Sqoop
  • Implemented AJAX, JSON, and JavaScript to create interactive web screens
  • Responsible for creating Hive tables, loading data, and writing hive queries
  • Configured Spark Streaming to receive ongoing data from Kafka and store the streamed data in DBFS
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts
  • Good Knowledge on architecture and components of Spark, and efficient in working with Spark Core, Spark SQL, Spark streaming and expertise in building PySpark and Spark-Scala applications for interactive analysis, batch processing and stream processing
  • Provisioned high availability of AWS EC2 instances, migrated legacy systems to AWS, and developed Terraform plugins, modules, and templates for automating AWS infrastructure
  • Worked on SQL and PL/SQL for backend data transactions and validations
  • Set up the CI/CD pipelines using Jenkins, Maven, GitHub, Chef, Terraform, and AWS
  • Created datasets from S3 using AWS Athena, created visual insights using Amazon QuickSight, monitored data quality and integrity through end-to-end testing and reverse engineering, and documented existing programs and code
  • Environment: AWS Athena, S3, SQL, PL/SQL, CI/CD, Jenkins, Maven, GitHub, Chef, Terraform, AWS, AWS EC2, PySpark, Kafka, Hive, Java, Teradata, Sqoop, DynamoDB, AWS Glue, AWS Lambda, Amazon Kinesis

Data Engineer

Lowe’s
Bengaluru, India
01.2017 - 07.2020
  • Company Overview: Lowe’s is a leading home improvement retailer, providing a wide range of products, tools, and services for homeowners, builders, and contractors
  • Part of the data and reporting team, creating insights and visualizations for the business to make decisions on
  • Designed and deployed a Kubernetes-based containerized infrastructure for data processing and analytics, leading to a 20% increase in data processing capacity
  • Wrote queries in MySQL and native SQL
  • Well versed with various aspects of ETL processes used in loading and updating Oracle data warehouse
  • Presented the project to faculty and industry experts, showcasing the pipeline's effectiveness in providing real-time insights for marketing and brand management
  • Used Python-based GUI components for front-end functionality such as selection criteria
  • Used Azure Data Factory to ingest data from log files and custom business applications, processed data on Databricks per day-to-day requirements, and loaded it into Azure Data Lake
  • Configured Spark Streaming to receive ongoing data from Kafka and store the streamed data in DBFS
  • Deployed models as a Python package, as an API for backend integration, and as services in a microservices architecture with a Kubernetes orchestration layer for the Docker containers
  • Used Python to write data into JSON files for testing Django websites, and created scripts for data modeling and data import and export
  • Led requirement gathering, business analysis, and technical design for Hadoop and Big Data projects
  • Managed relational database services in which the Azure SQL handles reliability, scaling, and maintenance
  • Integrated data storage solutions
  • Built Jenkins jobs for CI/CD infrastructure for GitHub repos
  • Created Session Beans and controller Servlets for handling HTTP requests from Talend
  • Performed data visualization and designed dashboards with Tableau, generating complex reports including charts, summaries, and graphs to interpret the findings for the team and stakeholders
  • Used Apache Airflow in a GCP Cloud Composer environment to build data pipelines, using various Airflow operators such as bash operators, Hadoop operators, Python callables, and branching operators
  • Environment: Azure, Oracle, Kafka, Python, Informatica, SQL Server, Erwin, RDS, NoSQL, Snowflake Schema, MySQL, Bash, DynamoDB, PostgreSQL, Tableau, GitHub, Linux/Unix

Data Engineer

GE Aerospace
Bengaluru, India
06.2014 - 12.2016
  • Company Overview: GE Aerospace is a leading global provider of jet engines, components, and integrated systems for commercial and military aircraft. The company specializes in advanced propulsion technologies, digital solutions, and sustainable aviation innovations
  • Hands-on experience building data pipelines in Python/PySpark/Hive SQL/Presto and monitoring data engines to define data requirements and data acquisition from both relational and non-relational databases, including Cassandra and HDFS
  • Created an ETL pipeline using Spark and Hive to ingest data from multiple sources
  • Carried out data transformation and cleansing using SQL queries, Python, and PySpark
  • Expert knowledge of Hive SQL, Presto SQL, and Spark SQL for ETL jobs, using the right technology to get the job done
  • Worked on building dashboards in Tableau with ODBC connections to different sources such as the BigQuery and Presto SQL engines, and developed stored procedures in MS SQL to fetch data from different servers using FTP and process the files to update the tables
  • Involved in using SAP, with transactions done in the SAP SD module, for handling the client's customers and generating sales reports
  • Designed and configured databases, back-end applications, and programs
  • Managed large datasets using Pandas DataFrames and SQL
  • Worked with AWS Terraform templates in maintaining the infrastructure as code
  • Built and maintained Docker container clusters managed by Kubernetes on Linux using Bash, Git, and Docker
  • Implemented a continuous delivery (CI/CD) pipeline with Docker for custom application images in the cloud using Jenkins
  • Environment: Python, PySpark, Hive SQL, Presto SQL, Spark, Hive, Cassandra, HDFS, Kubernetes, Linux, Git, Docker, SAP, Tableau, AWS, Terraform, Pandas

Education

Master's - Information Technology Management

Webster University
San Antonio, Texas

Skills

  • Cloud Technologies: AWS, Azure, MuleSoft, Salesforce, Mule ESB, Design Center, Anypoint Exchange, Runtime Manager, Anypoint Studio, API Manager, Anypoint Monitoring, Amazon S3, EMR, Redshift, Lambda, Athena, Composer, BigQuery
  • Script Languages: Python, Shell Script (bash, shell)
  • Programming Languages: Java, Python, Hibernate, JDBC, JSON, HTML, CSS, RAML
  • Databases: Oracle, MySQL, SQL Server, PostgreSQL, HBase, Snowflake, Cassandra, MongoDB
  • Version controls and Tools: Git, Maven, SBT, CBT
  • Web/Application server: Apache Tomcat, WebLogic, WebSphere
  • AWS Ecosystem: S3 Bucket, Athena, Glue, EMR, Redshift, Data Lake, AWS Lambda, Kinesis
  • Azure Ecosystem: Azure Data Lake, ADF, Databricks, Azure SQL, Azure Functions
  • Operating Systems: Windows, Unix, Linux
  • IDE Methodologies: Eclipse, Dreamweaver
  • Hadoop Components / Big Data: HDFS, Hue, MapReduce, Pig, Hive, HCatalog, HBase, Sqoop, Impala, Zookeeper, Flume, Kafka, YARN, Cloudera Manager, Kerberos, PySpark, Airflow, Snowflake, Spark components, Batch Processing
  • Visualization & ETL tools: Tableau, Power BI, Informatica, Talend
  • Tools: TOAD, SQL Developer, Azure Data Studio, SoapUI, SSMS, GitHub, SharePoint, Visual Studio, Teradata SQL Assistant
  • ETL/Middleware Tools: Talend, SSIS, Azure Data Factory, Azure Databricks, MuleSoft, Microsoft Fabric, data lake management

OBJECTIVE:

A results-driven Data Engineer with 10+ years of experience in data integration, designing, implementing, and maintaining data pipelines and data warehousing solutions. Proven expertise in optimizing data processes, working with large datasets, and leveraging advanced data technologies.
