
Shiva Sai Krishna Dhulipala

Hillsboro, OR

Summary

Practical Microsoft Azure Certified Data Engineer with 11+ years of overall industry experience, including 7 years dedicated to data engineering. Specializes in Spark, PySpark, Azure Data Factory, Azure DevOps, Azure Databricks, PowerShell, Azure ARM templates, Snowflake, and AWS cloud services. Possesses a strong understanding of data processing, ETL pipelines, and data warehousing, with a proven track record of delivering scalable and reliable solutions to complex business challenges.

Demonstrates expertise in Spark and PySpark for efficient data processing and transformation. Proficient in using Azure Data Factory to orchestrate and manage data pipelines, and experienced with Azure DevOps for streamlined development and deployment processes. Extensive experience with Azure Databricks for advanced analytics and collaborative data engineering. Also brings a solid understanding of AWS cloud services, including Apache Airflow for workflow management, S3 for scalable object storage, Athena for interactive querying, and AWS Glue for data cataloging and ETL. This knowledge supports work in multi-cloud environments and provides a broader skill set for data engineering projects.

Continuously stays abreast of the latest industry trends and technologies, including advancements in Azure and AWS cloud services, to optimize data workflows and enhance overall data infrastructure. Combines strong analytical skills with a keen attention to detail to ensure the accuracy and integrity of data. Committed to delivering high-quality results in a fast-paced and dynamic environment.

Overview

  • 12 years of professional experience
  • 4 Certifications

Work History

Data Engineer

Lululemon
08.2023 - Current
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Collaborated with Engineers in developing Spark applications using PySpark and Spark-SQL in Databricks for data processing, transformation, and aggregation.
  • Developed efficient data ingestion pipelines to load diverse datasets into Snowflake tables using technologies like Python, SQL, and Snowflake's native utilities.
  • Performed gap analysis on Snowflake database tables to migrate to a new source system as part of the ELA decommissioning project.
  • Utilized Azure Data Factory, Databricks, and Azure Key Vault to create pipelines and schedule runbooks with scheduled triggers.

Senior Data Engineer

TransAmerica
05.2023 - 07.2023
  • Developed and maintained end-to-end operations of ETL data pipelines and worked with large data sets in Azure Data Factory and Azure Synapse environments.
  • Leveraged existing Apache Airflow DAGs to facilitate loading of data into target systems, streamlining data ingestion and transformation processes.
  • Debugged and troubleshot issues within Apache Airflow DAGs to ensure smooth data loading operations, identifying and resolving errors and inconsistencies in the process.
  • Utilized AWS S3 as a scalable object storage solution, implementing data lake architectures and optimizing data storage and retrieval for enhanced performance.
  • Implemented AWS Athena for interactive querying of data stored in S3, enabling ad-hoc analysis and exploration of data with SQL-like syntax and reducing the need for data preprocessing.
  • Designed and optimized ETL processes using AWS Glue and Spark, enabling efficient extraction, transformation, and loading of data from diverse sources into target data warehouses.
  • Demonstrated expertise in utilizing Jenkins as a key tool for deploying code to various environments, including development, testing, staging, and production.
  • Utilized Bitbucket as a version control system for managing source code repositories, including creating branches, merging code, and resolving conflicts.

Senior Data Engineer - Azure

Brinks Home Security
12.2021 - 04.2023
  • Developed and maintained end-to-end operations of ETL data pipelines and worked with large data sets in Azure Data Factory.
  • Researched and implemented various components such as pipelines, activities, mapping data flows, datasets, linked services, triggers, and control flow.
  • Performed extensive debugging, data validation, error handling, transformation-type analysis, and data cleanup within large datasets.
  • Collaborated with DevOps engineers to develop automated CI/CD and test-driven development pipelines in Azure per client standards.
  • Developed Spark applications using PySpark and Spark-SQL in Databricks for data extraction, transformation, and aggregation.
  • Used REST APIs to retrieve analytics data from different data feeds to meet business requirements.
  • Built Python notebooks in Azure Databricks to implement transformations per business needs.
  • Worked on Tableau extracts to send data to dashboards connected to Databricks and Azure Synapse Analytics (DW).

Data Engineer III

The Standard
04.2019 - 12.2021
  • Designed and built ETL pipelines to automate ingestion of structured and unstructured data using Azure Data Factory.
  • Ingested data from various sources into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed data in Azure Databricks.
  • Designed and automated DataLab spaces (sandbox environments) in Azure using PowerShell and Python scripts for business teams.
  • Deployed and managed CI/CD pipelines to build code using Azure DevOps; used Kusto (Azure Data Explorer) queries to analyze ADF monitoring data through Log Analytics workspaces.
  • Used PowerShell, Azure runbooks, and webhooks to automate development and deployments.
  • Created YAML pipeline templates (.yml) for pre-production and production environments using Azure DevOps.
  • Developed Spark code using Python (PySpark) and Spark-SQL for faster testing and processing of data, and implemented PySpark applications to ingest data from different legacy sources into the central data lake repository.
  • Worked with the Data Governance team (cross-functional) to install and configure Collibra infrastructure in the Azure environment and supported the migration from on-premises to cloud; managed and implemented the sandbox environment with a team of 10 members.

Big Data Engineer (Consultant)

Microsoft
04.2017 - 03.2019
  • Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back scenarios.
  • Developed Spark applications using Scala and Spark-SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
  • Responsible for estimating cluster size, monitoring, and troubleshooting of HDInsight clusters.
  • Used Zeppelin, Jupyter notebooks, and Spark-Shell to develop, test, and analyze Spark jobs before scheduling customized Spark jobs.
  • Deployed and tested developed code (CI/CD) using Visual Studio Team Services (VSTS).
  • Created HDInsight clusters and storage accounts with an end-to-end environment for running jobs.
  • Developed PowerShell scripts for automation purposes.
  • Created Build and Release for multiple projects (modules) in production environment using Visual Studio Team Services (VSTS).
  • Used the ScalaTest FunSuite framework to develop unit test cases and perform integration testing.
  • Used Kusto Explorer (Azure Data Explorer) for log analytics and faster query response.
  • Migrated existing MapReduce programs and Hive queries into Spark applications using Scala.

BI/Junior Data Analyst

TekCommands Software Services Private Ltd.
08.2012 - 08.2015

Education

Master of Science - Information Technology

University of The Cumberlands
Williamsburg, KY
2019

Master of Science - Computer Science

Northwestern Polytechnic University
Fremont, CA
12.2016

Bachelor of Science - Information Technology

Jawaharlal Nehru Technological University
India
05.2012

Skills

  • Languages: SQL, Python, PowerShell, Linux (Hadoop commands), PySpark, Spark
  • Azure Cloud Platform: Azure Data Factory, Azure SQL Database, Azure Synapse SQL Data Warehouse, Azure Functions, ADLS Gen2/Azure Data Lake, Azure Key Vault, Azure HDInsight, Azure Storage
  • AWS Cloud Platform: Apache Airflow, Athena, S3, AWS Glue, EMR
  • Data formats: structured, unstructured, JSON, YAML, and XML
  • BI Stack: T-SQL, SSIS, Oracle 11g R2 Enterprise Edition, Microsoft SQL Server 2016/2014
  • Version Control: GitHub, Bitbucket, Azure DevOps Git
  • Visualization: Power BI, Tableau

Certifications

  • Microsoft Certified: Azure Fundamentals (AZ-900)
  • Microsoft Certified: Azure Data Fundamentals (DP-900)
  • Microsoft Certified: Azure Administrator Associate (AZ-104)
  • Microsoft Certified: DevOps Engineer Expert (AZ-400)
