Sudhakar Bojja

Summary

I am a dedicated, organized, and methodical individual. I have good interpersonal skills, am an excellent team worker, and am keen and very willing to learn and develop new skills.

Experience in implementing complete ETL data solutions, including data acquisition, data validation, data profiling, storage, transformation, analysis, and integration with other frameworks to meet business needs, using Azure Data Factory, Azure Data Lake Storage Gen2, Python, Spark, and Databricks.
Skilled in managing data on Microsoft's Azure Cloud Platform, including Azure SQL DB, Azure Synapse, and Azure Data Factory, for efficient data processing.
Capable of initiating and maintaining databases, ensuring architectural integrity and data reliability.
Proficient in extracting, transforming, and loading data using Azure Data Factory, T-SQL, and Spark SQL.
Skilled in developing Spark applications with PySpark and Spark SQL for data extraction, transformation, and aggregation.
Experienced in ingesting data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processing the data in Azure Databricks.
Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
Experienced in deploying pipelines in Azure Data Factory and conducting code reviews to ensure quality and consistency.
Proficient in utilizing Snowflake for building and managing data warehouses, enabling efficient storage and retrieval of large volumes of structured and semi-structured data.
Proficient in using the Linux command line for system administration tasks, file management, user management, process management, and system monitoring.
Strong documentation skills and a commitment to continuous learning and professional growth in the field of data engineering and analysis.

Overview

1 year of professional experience

Work History

Azure Cloud Data Engineer

Key DataOps

07.2023 - 01.2024

Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, and Spark SQL
Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks
Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and Snowflake
Transformed semi-structured JSON events from EventHub using PySpark/Python to align with business requirements before loading into Snowflake tables
Expertise in parsing JSON-structured messages, applying necessary transformations using PySpark/Python, and storing data in Snowflake for reporting purposes
Developed a versatile Azure Data Factory (ADF) pipeline for efficiently loading incremental or full datasets from SQL Server into Snowflake through ADF Copy activity
Created Spark Structured Streaming Applications to consume data from EventHub and load into Delta tables after transformations
Hands-on experience developing SQL scripts for automation
Created build and release pipelines for multiple projects (modules) in the production environment using Azure DevOps
Conducted code reviews for team members to ensure proper test coverage and consistent code standards
Responsible for documenting the process and cleanup of unwanted data
Hands-on experience working with Spark SQL queries and DataFrames: importing data from data sources, performing transformations and read/write operations, and saving results to an output directory in HDFS
Worked in an Agile development environment with two-week sprint cycles, dividing and organizing tasks
Created Databricks Delta tables to store various data formats coming from different applications
Created and loaded collections with JSON documents in Azure Cosmos DB for MongoDB API using ADF and Databricks
Experience in working with RESTful APIs
Responsible for modifying the code, debugging, and testing the code before deploying on the production cluster

Student Teaching Assistant

UNT, Texas

10.2022 - 03.2023

Assisted the professor with setting up the individual case report forms in his research
Collected confidential information from research subjects
Assisted professor in drafting presentations on research findings
Summarized research data into tables, graphs, charts, and narratives
Wrote reports and gave oral presentations summarizing research activities

Education

Master’s in Information Science

University of North Texas

05.2023

Skills

Azure
Docker
Jenkins
Git
Python
PySpark
SQL
Visual Studio Code
SQL Developer
GitHub
MySQL
PostgreSQL
Snowflake
Windows
Linux
Tableau
MS Excel
MS Word
MS Outlook
Maven

Projects

1. Impact of Diabetes on Length of Stay in Hospitals:

Built a web-based application for analyzing the impact of diabetes on hospitalization using a SQL-based database system. Python packages (e.g., Pandas, NumPy) were used for data analysis and visualization.

2. Implementation of Database Design by using Azure Database: Extracted, transformed, and loaded data from different sources to Azure Data Lake using a combination of Azure Data Factory, Spark SQL, and Azure Data Lake Analytics, and processed the data in Azure Database. Designed and implemented a data pipeline to process data using Python and Apache Spark. Extracted data from various sources, transformed it into a structured format, and loaded it into a MySQL database for further analysis. Implemented data quality checks and error handling mechanisms to ensure the reliability of the pipeline.

3. Project Description: Jan 2022 – May 2022

The website is developed using HTML, CSS, and JavaScript to create a visually captivating and interactive user interface.

1) The website covers the importance of marketing, brand value, and marketing strategies.

2) The website displays a video embedded using an iframe.

3) It has a Contact Us form that visitors with queries can fill in and submit.

4) It also has a registration form. After a visitor registers, the team reviews the details and sends a confirmation email outlining the next steps.

Key Features:

· Easy Navigation: The website is designed to provide a seamless and intuitive user experience, with navigation links that let users quickly locate and access the files.

· File Organization: Each assignment has its own dedicated file, ensuring users can easily find and review the correct files.

· Responsive Design: The website is optimized for various devices, including desktop computers, tablets, and smartphones. Users can easily access and review the files regardless of the device they are using.

· User-Friendly Interface: The interface is designed to be intuitive and straightforward, so users should have no trouble navigating the site and accessing the files they need.
