Kalyan K

Collierville, TN

Summary

Results-focused Azure Data Engineer with over 4 years of experience in ETL development, data pipeline development (ADF, Spark), cloud platforms (Azure), and data visualization (Power BI, Tableau). Proficient in migrating SQL databases to Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse, and in controlling and granting database access. Skilled in optimizing dbt (data build tool) projects in a Snowflake environment by implementing incremental models, leveraging partitioning, and tuning queries to reduce runtime and cloud data warehousing costs. Experienced in building Spark applications with Spark SQL in Databricks to extract, transform, and aggregate data from multiple file formats and uncover insights into customer usage patterns. Known for designing, building, and optimizing complex data pipelines and ETL processes using SQL, Python, and cloud platforms to ensure seamless data integration and robust data solutions. Recognized for excelling in collaborative environments, adapting swiftly to evolving needs, and driving team success.

Overview

4+ years of professional experience

Work History

Azure Data Engineer

Allstate

Tennessee

01.2024 - Current

Designed and implemented Azure SQL databases to support critical business applications, achieving a 20% improvement in query performance through database optimization.
Configured and managed Azure Data Lake on Azure Data Lake Storage Gen2 to store and manage large volumes of structured and unstructured data.
Built and ran data pipelines in Python and PySpark (with libraries such as NumPy and Pandas) to ingest, clean, and transform large-scale structured, semi-structured, and unstructured datasets from various sources.
Built and optimized Apache Spark clusters on Azure Databricks to process 500 TB of data, accelerating data processing.
Developed and deployed serverless data pipelines on Azure Functions and Logic Apps to handle real-time data processing, increasing data processing speed by 40%.
Improved query performance and resource utilization by optimizing data processing pipelines with Spark SQL.
Analyzed data and collaborated with the team to create comprehensive reports and presentations that conveyed key metrics and trends through clear Power BI visualizations.
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
Fine-tuned query performance and optimized database structures for faster, more accurate data retrieval and reporting.

Azure Data Engineer

Trion IT Solutions, India

India

01.2020 - 07.2022

Used Azure Data Factory to extract data from Azure SQL Database, transform it in Azure Databricks notebooks, and load the results into Azure Data Lake Storage.
Designed and implemented a scalable data lake architecture to store and manage 100 petabytes of structured and unstructured data, achieving a 50% reduction in storage costs through efficient data tiering.
Configured and managed Azure DevOps for continuous integration and continuous delivery (CI/CD) of data pipelines, ensuring efficient deployment and version control.
Built ETL/ELT pipelines using Databricks notebooks and SQL to extract, transform, and load data from multiple sources into data warehouse and lakehouse environments.
Developed and deployed Apache Flink applications for processing high-volume streaming data, achieving a 25% improvement in processing speed compared to traditional batch processing methods.
Used Flink state management features to perform transformations on streaming data, enabling 20% faster insight generation.
Wrote HiveQL queries that identified a 30% increase in customer churn rate, enabling targeted customer retention campaigns.
Enhanced data quality by performing thorough cleaning, validation, and transformation tasks.
Streamlined complex workflows by breaking them down into manageable components for easier implementation and maintenance.
Optimized data processing by implementing efficient ETL pipelines and streamlining database design.

Education

Master of Science - Data Science

University of Memphis

Memphis, TN

05.2024

Bachelor of Technology

GITAM University

Andhra Pradesh

06.2021

Skills

Programming Languages: Python, R, SQL
Big Data Ecosystem: Apache Spark, Apache Kafka, Apache Nessie, Hadoop, Hive, HDFS
Cloud: Azure (Azure Data Lake Storage, Azure Data Factory, Azure SQL Database, Azure Databricks, Azure DevOps, Azure Stream Analytics, Azure Synapse, Azure Blob Storage)
Visualizations: Tableau, Power BI, Excel
Packages: NumPy, Pandas, Matplotlib, Seaborn, PySpark
ETL and Data Processing: Fivetran, SSIS, Data Pipelines, dbt (data build tool), Apache Airflow
Version Control & Databases: GitHub, Git, SQL Server, PostgreSQL, Cassandra, MySQL, Snowflake
Certification: Python Data Structures
ETL development
Data warehousing

Data governance
Data integration
SQL and databases
Data analysis
RDBMS
Relational databases
Data pipeline design
Data modeling
Big data processing

Languages

English

Professional Working

Telugu

Native or Bilingual
