SIVA PAVAN PHANI KUMAR JASTI

USA

Summary

Senior Data Engineer with 5+ years of experience designing and implementing robust data infrastructures. Proficient in developing scalable data pipelines, optimizing data storage solutions, and integrating data across cloud platforms. Demonstrated expertise in leveraging big data technologies to drive analytics and support strategic decision-making. Recognized for enhancing system performance and ensuring data integrity in high-demand environments.

Overview

6 years of professional experience
1 Certification

Work History

Data Engineer

First Citizens Bank
North Carolina, USA
01.2024 - Current
  • Architected real-time data streaming with Apache Airflow and Kafka, ensuring seamless data flow
  • Optimized Azure SQL Database & Data Lake Storage for high availability, security, and efficient warehousing
  • Developed scalable data pipelines using StreamSets & Azure Databricks, enabling real-time analytics
  • Secured data transmission with Azure Active Directory & Key Vault, ensuring regulatory compliance
  • Containerized applications with Docker, enhancing deployment consistency across environments
  • Designed complex SQL queries for large-scale data manipulation in Azure SQL Database
  • Automated data transformation with Python, minimizing errors & manual workload
  • Ensured compliance by maintaining data lineage & metadata tracking in Azure Data Factory
  • Orchestrated ETL workflows with Azure Data Factory, improving operational efficiency
  • Implemented real-time financial data streaming via Azure Stream Analytics
  • Managed event-driven architectures with Azure Event Hubs & StreamSets for optimal data flow
  • Deployed Azure Machine Learning models for predictive analytics & financial forecasting
  • Accelerated deployments by implementing CI/CD pipelines using Azure DevOps
  • Optimized SQL database & pipeline performance, enhancing system efficiency
  • Established and enforced data governance & security policies, collaborating cross-functionally
  • Monitored and troubleshot data infrastructure to prevent data loss & ensure stability
  • Mentored junior engineers on Azure data services & best practices, fostering skill development
  • Provided strategic input in stakeholder meetings, influencing data architecture decisions
  • Enhanced data extraction by integrating Azure Data Factory with financial systems & APIs
  • Researched & recommended emerging data technologies for continuous process improvements
  • Environment: Azure Data Factory, Apache Kafka, Azure SQL Database, Azure Data Lake Storage (ADLS), StreamSets, Azure Databricks, Azure Active Directory, Azure Key Vault, Docker, Python, Azure Stream Analytics, Azure Event Hubs, Azure Machine Learning, Azure DevOps, SQL

Data Engineer

Best Buy
Minnesota, USA
01.2020 - 12.2022
  • Architected scalable data processing with Apache Spark on AWS EMR, optimizing analytics and efficiency
  • Automated ETL workflows with AWS Glue, Informatica, and Python, streamlining data integration and transformation
  • Optimized data warehousing in Amazon Redshift and PostgreSQL, improving query performance and BI analytics
  • Administered AWS S3, ensuring secure, scalable, and efficient data storage and retrieval
  • Implemented real-time data streaming using Apache Kafka and AWS Step Functions, enabling low-latency processing
  • Orchestrated workflows with Apache Airflow, enhancing automation and operational efficiency
  • Established secure data lakes via AWS Lake Formation, ensuring governance and compliance
  • Engineered high-performance NoSQL solutions with AWS DynamoDB, supporting scalable applications
  • Integrated Databricks for machine learning, advanced analytics, and data science workflows
  • Applied Agile & DevOps practices, optimizing CI/CD pipelines, security, and project delivery
  • Managed AWS Lambda, EC2, and VPC configurations for scalable cloud computing
  • Enhanced cluster management using Hadoop YARN and OpenShift for distributed data processing
  • Strengthened security with AWS IAM, role-based access, and compliance policies
  • Led cloud migration initiatives, transitioning legacy infrastructures to AWS for improved scalability
  • Deployed AWS CloudWatch for real-time monitoring and proactive incident management
  • Optimized data pipelines using AWS Airflow, EMR, Athena, Talend, and Snowflake, ensuring seamless data operations
  • Environment: Apache Spark, AWS EMR, AWS Glue, Amazon Redshift, AWS S3, Apache Kafka, AWS Step Functions, AWS Lake Formation, Apache Airflow, AWS DynamoDB, Databricks, Python, SQL, Hadoop YARN, AWS IAM

Data Engineer

United Health
Minnesota, USA
01.2019 - 01.2020
  • Architected a multi-terabyte Data Warehouse on Amazon Redshift, handling millions of records for large-scale processing
  • Optimized Redshift performance, achieving 100x query speed improvements for Tableau and SAS Visual Analytics
  • Developed and migrated large datasets to Delta Lake on AWS using Databricks, Apache Spark, and AWS Glue
  • Led migration of legacy systems to Databricks, improving scalability and reducing batch processing times
  • Managed AWS SQL Database and optimized multi-cloud infrastructure (AWS, GCP, Kubernetes) for ML Ops workloads
  • Engineered large-scale analytics workflows using AWS EMR, enhancing distributed data processing
  • Implemented Terraform-based multi-cloud deployments across AWS, Azure, and Google Cloud
  • Strengthened data security in ETL pipelines with encryption techniques, IAM policies, and compliance measures
  • Designed advanced ETL transformations, aggregations, and UDFs to improve processing efficiency
  • Integrated AWS Directory Service with on-prem Active Directory for seamless authentication
  • Implemented Workload Management (WLM) in Redshift, enhancing real-time reporting performance
  • Designed and published Tableau and SAS Visual Analytics dashboards for business intelligence
  • Optimized Azure data pipelines and storage, ensuring scalability, reliability, and cost efficiency
  • Configured Elastic Load Balancers (ELB) for high availability and traffic distribution across EC2 instances
  • Deployed Azure Cosmos DB for scalable NoSQL solutions, improving real-time global data access
  • Automated CI/CD pipelines for data deployments, ensuring performance optimization and system updates
  • Mentored teams, reviewed engineering practices, and collaborated with DevOps for CI/CD implementation
  • Designed prototypes for performance optimization and business intelligence insights
  • Environment: Amazon Redshift, R, Azure Synapse Analytics, AWS Data Pipeline, Databricks, S3, SQL Server Integration Services, SQL Server 2014, AWS SQL Database, AWS Data Migration Services, DQS, SAS Visual Analytics, SAS Forecast Server, Tableau, Power BI

Education

Master’s - Computer Science

University of Central Missouri
USA
05.2024

Skills

  • R
  • Python
  • SQL
  • Tableau
  • Power BI
  • Excel
  • Apache Spark
  • Hadoop YARN
  • Apache Kafka
  • Databricks
  • AWS
  • Azure
  • Google Cloud Platform
  • SSIS
  • AWS Data Pipeline
  • Azure Data Factory
  • Talend
  • AWS Redshift
  • Azure SQL Database
  • Snowflake
  • Google BigQuery
  • Apache NiFi
  • AWS Analytics
  • Data Streaming
  • Docker
  • Kubernetes
  • MLflow
  • Azure Active Directory
  • Azure Key Vault
  • Google Cloud IAM
  • Agile Practices
  • StreamSets
  • Azure DevOps

Certification

  • Microsoft Certified: Azure Data Engineer Associate
  • AWS Certified Data Engineer - Associate
  • Microsoft Certified: Azure Fundamentals

Personal Information

Title: Senior Data Engineer

Timeline

Data Engineer

First Citizens Bank
01.2024 - Current

Data Engineer

Best Buy
01.2020 - 12.2022

Data Engineer

United Health
01.2019 - 01.2020

Master’s - Computer Science

University of Central Missouri