VAMSI KRISHNA BARIGELA MAHESH

New Jersey, NJ

Summary

Experienced Data Engineer with over 3 years of expertise in designing and implementing scalable data solutions. Proficient in data integration, ETL pipeline development, and data modeling, with a strong background in leveraging big data technologies and cloud platforms for processing and analyzing large-scale datasets. Adept at collaborating with cross-functional teams to deliver high-quality data solutions that align with business goals.

Overview

7 years of professional experience
1 Certification

Work History

Data Engineer

Discover Financial Services
04.2024 - Current
  • Engaged in all phases of the Software Development Life Cycle (SDLC), including analysis, design, and development, while collaborating with the team using Agile methodologies
  • Used PySpark to process data from diverse RDBMS and streaming sources, with Snowflake for data warehousing solutions
  • Designed and deployed end-to-end data pipelines and analytics solutions using AWS services, including EMR, EC2, S3, RDS, Lambda, Glue, SQS, and Redshift
  • Developed efficient Spark SQL scripts for data processing and executed complex HiveQL queries on Hive tables
  • Created and managed Hive tables, implementing partitioning, dynamic partitions, and bucketing for optimized data analysis
  • Built data pipelines to extract, transform, and load (ETL) data from multiple sources into Snowflake tables to meet business requirements
  • Configured CI/CD pipelines using Git and Jenkins to streamline the deployment and management of big data architecture on AWS
  • Orchestrated workflows for large-scale data transformations using Apache Airflow and Apache Oozie to schedule and automate Hadoop jobs

Data Engineer

Discover Financial Services
01.2022 - 02.2023
  • Developed ETL pipelines to load, transform, and analyze large structured, semi-structured, and unstructured datasets using Azure Data Factory, Spark SQL, and Hive
  • Ingested data into Azure services, including Azure Data Lake, Blob Storage, and Azure SQL Data Warehouse, and processed data in Azure Databricks for analytics
  • Created pipelines in Azure Data Factory using Linked Services, Datasets, and Pipelines to extract, transform, and load data from diverse sources like Azure SQL and Blob Storage
  • Partnered with data scientists to integrate machine learning models within Spark pipelines, utilizing Spark MLlib for predictive analytics and real-time decision-making
  • Designed and implemented batch and streaming workflows in Spark for high availability and reliability of mission-critical systems
  • Enhanced the company’s big data ecosystem using Hadoop and Spark, enabling efficient processing of petabyte-scale datasets
  • Created and managed RDDs and leveraged DataFrames for efficient manipulation and analysis of structured data
  • Implemented complex Spark SQL queries for data aggregation and analysis, integrating Hive and HBase for effective data storage and retrieval
  • Configured CI/CD pipelines in Azure DevOps for automated build, test, and deployment of applications
  • Automated workflows with Apache Oozie, reducing manual data handling efforts and improving operational efficiency

Data Engineer

Amazon Development Centre
12.2017 - 12.2019
  • Designed and implemented scalable ETL pipelines for both cloud and on-premises environments, focusing on data modeling and data migration
  • Applied expertise in Apache Spark, including Spark Core, Spark SQL, and Spark Streaming, developing applications for data validation, cleansing, transformation, and aggregation
  • Configured Spark Streaming to process real-time data from Apache Kafka and store it in HDFS using Scala and PySpark
  • Built partitioned and bucketed Hive tables in Parquet file formats with Snappy compression for optimized storage and query performance
  • Automated data workflows using Apache Airflow, reducing manual intervention by 40% and ensuring timely data availability for analytics
  • Migrated legacy systems to cloud platforms (AWS and Azure) using AWS CloudFormation and IAM, achieving a 20% reduction in operational costs while improving scalability
  • Created real-time data streaming solutions with Kafka, enabling instant access to critical data for analytics and decision-making
  • Implemented data lakes on AWS S3, leveraging partitioning and optimization techniques to enhance query performance and reduce costs
  • Collaborated with cross-functional teams using Jira, Confluence, and Git, ensuring seamless project execution
  • Designed and managed CI/CD pipelines with Jenkins, AWS CodeBuild, and Azure DevOps, reducing deployment cycles by 50%
  • Migrated complex SSIS packages to Databricks, improving data processing speeds by over 50% while lowering infrastructure costs

Education

Master of Science - Business Analytics

Sacred Heart University
Fairfield, CT
06.2024

Master of Science - Business with International Management

Northumbria University
Newcastle Upon Tyne, United Kingdom
01.2022

Bachelor of Science - Mechanical Engineering

Vardhaman College Of Engineering
Hyderabad, Telangana
04.2017

Skills

  • Programming Languages: Scala, Python, Java, R, SQL, JavaScript
  • Big Data Technologies: Hadoop, Apache Spark, Kafka, HDFS, MapReduce, Sqoop, Hive, Pig, Flume, NiFi, Impala, Zookeeper, Yarn, Cassandra, Snowflake, Apache Flink, Airflow, Cloudera Manager
  • Cloud Platforms: AWS (S3, Lambda, Athena, EMR, Kinesis, Redshift, RDS, Step Functions, CloudWatch, ECS, Elasticsearch, SNS, Route 53, IAM, Glue, CodePipeline, CodeDeploy, SageMaker, QuickSight); Azure (Databricks, Blob Storage, Azure Functions, HDInsight, Stream Analytics, Event Hubs, Logic Apps, Virtual Machines, Azure Service Bus, Synapse Analytics); GCP (comprehensive understanding of data and cloud services)
  • ETL & Data Storage: SSIS, SSAS, PostgreSQL, MySQL, MongoDB, Cassandra, DynamoDB, Redshift
  • Data Visualization: Tableau, Power BI, Amazon QuickSight, Grafana
  • Data Analytics & Processing: Data Manipulation, Data Cleaning, Data Integration, Data Transformation, Data Streaming, Data Pipelining
  • Machine Learning & MLOps: TensorFlow, PyTorch, Scikit-learn
  • DevOps Tools: Docker, Kubernetes, Jenkins, Git, AWS CodeBuild, AWS CodeDeploy
  • Methodologies: Agile, Waterfall, SDLC
  • Security & Networking: VPC configurations, IAM roles, Security Groups, Network Protocols

Certification

  • AWS Certified Data Engineer Associate
  • Microsoft Certified Azure Data Engineer Associate
