Sudhakar Bojja

Summary

I am a dedicated, organized, and methodical individual. I have good interpersonal skills, am an excellent team worker, and am keen and very willing to learn and develop new skills.

Experienced in implementing complete ETL data solutions, including data acquisition, validation, profiling, storage, transformation, analysis, and integration with other frameworks to meet business needs, using Azure Data Factory, Azure Data Lake Storage Gen2, Python, Spark, and Databricks. Skilled in managing data on Microsoft's Azure cloud platform, including Azure SQL DB, Azure Synapse, and Azure Data Factory, for efficient data processing. Capable of initiating and maintaining databases, ensuring architectural integrity and data reliability. Proficient in extracting, transforming, and loading data using Azure Data Factory, T-SQL, and Spark SQL. Skilled in developing Spark applications with PySpark and Spark SQL for data extraction, transformation, and aggregation. Experienced in ingesting data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processing it in Azure Databricks. Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse. Experienced in deploying pipelines in Azure Data Factory and conducting code reviews to ensure quality and consistency. Proficient in using Snowflake to build and manage data warehouses, enabling efficient storage and retrieval of large volumes of structured and semi-structured data. Proficient in using the Linux command line for system administration, file management, user management, process management, and system monitoring. Strong documentation skills and a commitment to continuous learning and professional growth in data engineering and analysis.

Overview

1 year of professional experience

Work History

Azure Cloud Data Engineer

Key DataOps
07.2023 - 01.2024
  • Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, and Spark SQL
  • Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks
  • Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and Snowflake
  • Transformed semi-structured JSON events from EventHub using PySpark/Python to align with business requirements before loading them into Snowflake tables
  • Parsed JSON-structured messages, applied the necessary transformations using PySpark/Python, and stored the data in Snowflake for reporting purposes
  • Developed a reusable Azure Data Factory (ADF) pipeline for loading incremental or full datasets from SQL Server into Snowflake through the ADF Copy activity
  • Created Spark Structured Streaming applications to consume data from EventHub and load it into Delta tables after transformation
  • Developed SQL scripts for automation
  • Created build and release pipelines for multiple projects (modules) in the production environment using Azure DevOps
  • Conducted code reviews for team members to ensure proper test coverage and consistent code standards
  • Documented processes and cleaned up unwanted data
  • Worked with Spark SQL queries and DataFrames: imported data from data sources, performed transformations and read/write operations, and saved the results to an output directory in HDFS
  • Worked in an Agile development environment with two-week sprint cycles, dividing and organizing tasks
  • Created Databricks Delta tables to store various data formats coming from different applications
  • Created and loaded collections of JSON documents in Azure Cosmos DB for MongoDB API using ADF and Databricks
  • Worked with RESTful APIs
  • Modified, debugged, and tested code before deploying it to the production cluster

Student Teaching Assistant

UNT, Texas
10.2022 - 03.2023
  • Assisted the professor with setting up individual case report forms for his research
  • Collected confidential information from research subjects
  • Assisted the professor in drafting presentations on research findings
  • Summarized research data into tables, graphs, charts, and narratives
  • Wrote reports and gave oral presentations summarizing research activities

Education

Master’s in Information Science

University of North Texas
05.2023

Skills

  • Azure
  • Docker
  • Jenkins
  • Git
  • Python
  • PySpark
  • SQL
  • Visual Studio Code
  • SQL Developer
  • GitHub
  • MySQL
  • PostgreSQL
  • Snowflake
  • Windows
  • Linux
  • Tableau
  • MS Excel
  • MS Word
  • MS Outlook
  • Maven

Projects

1. Impact of Diabetes on Length of Stay in Hospitals:

Built a web-based application for analyzing the impact of diabetes on hospitalization, backed by a SQL-based database system. Python packages (e.g., Pandas, NumPy) were used for data analysis and visualization.

2. Implementation of Database Design Using Azure Database: Extracted, transformed, and loaded data from different sources to Azure Data Lake using a combination of Azure Data Factory, Spark SQL, and Azure Data Lake Analytics, and processed the data in Azure Database. Designed and implemented a data pipeline to process data using Python and Apache Spark. Extracted data from various sources, transformed it into a structured format, and loaded it into a MySQL database for further analysis. Implemented data quality checks and error handling mechanisms to ensure the reliability of the pipeline.

3. Project Description: Jan 2022 – May 2022

The website is developed using HTML, CSS, and JavaScript to create a visually captivating and interactive user interface.

1) The website covers the importance of marketing, brand value, and marketing strategies.

2) The website displays a video using an iframe.

3) It has a Contact Us form where visitors with queries can fill in their details and submit them.

4) It also has a registration form. After a visitor registers, the team reviews the details and sends a confirmation email outlining the next steps.

Key Features:

· Easy Navigation: The website is designed to provide a seamless and intuitive user experience. Users can quickly locate the relevant navigation links to access the files.

· File Organization: Each assignment has its own dedicated file, so users can easily find and review the correct files.

· Responsive Design: The website is optimized for various devices, including desktop computers, tablets, and smartphones. Users can easily access and review the files regardless of the device they are using.

· User-Friendly Interface: The interface is designed to be intuitive and straightforward, so users should have no trouble navigating the site and accessing the files they need.
