
Sudhakar B

St. Louis, MO

Summary

Data Engineer with 9+ years of IT experience designing, implementing, and managing data solutions using Microsoft Azure's cloud-based services. The primary role involves leveraging Azure's data services to build robust, scalable data pipelines, data warehouses, and analytics solutions. Experienced with Microsoft Azure and AWS Cloud, including migrating SQL databases to Azure Data Lake Storage, Azure Data Lake Analytics, Azure SQL Database, BigQuery, Databricks, and Azure SQL Data Warehouse; granting database access; and migrating on-premises databases to Azure Data Lake Store and Google Cloud Storage using Azure Data Factory.

Overview

10 years of professional experience

Work History

Data Engineer

Walmart
St. Louis, United States
11.2022 - Current
  • Extract, transform, and load data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics)
  • Ingest data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure SQL DW) and process it in Azure Databricks
  • Create pipelines in ADF using linked services, datasets, and activities to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back scenarios
  • Develop Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation across multiple file formats, uncovering insights into customer usage patterns (see the PySpark sketch after this list)
  • Implement data pipelines to collect data from various sources
  • Proficient in designing and implementing database schemas and data models to ensure data integrity, scalability, and maintainability in a large-scale SQL Server environment
  • Demonstrated ability to identify and resolve performance bottlenecks in SQL Server databases, including query optimization, index tuning, and database configuration adjustments
  • Experienced in implementing partitioning strategies to handle large datasets efficiently and optimize query performance in distributed environments
  • Knowledgeable in designing extract, transform, load (ETL) processes and data integration workflows using SQL Server Integration Services (SSIS) to consolidate data from various sources
  • Design and implement data storage solutions using Azure services such as Azure SQL Database, Azure Cosmos DB, and Azure Data Lake Storage, as well as Google Cloud Storage
  • Develop and maintain data pipelines using Azure Data Factory and Azure Databricks
  • Create and manage data processing jobs using Azure HDInsight and Azure Stream Analytics
  • Develop JSON definitions for deploying pipelines in Azure Data Factory (ADF) that process data using the SQL activity
  • Optimize data processing and storage for performance and cost efficiency
  • Collaborate with data scientists and analysts to provide data insights and support data-driven decision making
  • Troubleshoot and resolve data processing and storage issues
  • Develop and maintain documentation for data storage and processing solutions
  • Stay up to date with new Azure services and technologies and evaluate their potential for improving data storage and processing solutions.
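
A minimal sketch of the kind of PySpark job described above, assuming hypothetical storage account, container, and column names: raw events landed by ADF are read from ADLS Gen2, aggregated per customer per day, and written back as date-partitioned Parquet.

```python
# Minimal PySpark sketch of the ingest-transform-write pattern described above.
# Storage account, container, paths, and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("usage-patterns").getOrCreate()

# Read raw JSON events landed by ADF into the data lake.
raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/events/")

# Aggregate usage per customer per day.
daily_usage = (
    raw.withColumn("event_date", F.to_date("event_ts"))
       .groupBy("customer_id", "event_date")
       .agg(
           F.count("*").alias("events"),
           F.sum("duration_sec").alias("total_duration_sec"),
       )
)

# Write date-partitioned Parquet so downstream queries can prune partitions.
(
    daily_usage.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("abfss://curated@examplelake.dfs.core.windows.net/daily_usage/")
)
```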

Azure Cloud Engineer

SVB Capital
St. Louis, United States
02.2020 - 10.2022
  • Worked on design, development of complex applications using various technologies
  • Worked on Azure networking and security principles
  • Worked on Azure Bastion and related networking, including Private Link services and private endpoint setup
  • Designed and delivered solutions using Terraform and Azure DevOps agents
  • Worked on building data pipelines in Databricks and scheduling Databricks jobs
  • Created mount points for ADLS Gen2 storage in DBFS to implement RBAC for end users (see the mount sketch after this list)
  • Developed VNet peering and Private Link connectivity for resources in remote subscriptions
  • Worked with structured and unstructured data, including imaging and geospatial data
  • Worked on Azure Cloud services and Snowflake development
  • Experienced in data loading and managing cloud databases; worked on DevOps for migrating servers from MySQL to Oracle and Teradata
  • Experienced in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory configuration
  • Worked with the application team to design and develop effective Hadoop solutions
  • Actively engaged in and responsible for the development process
  • Developed and tested workflow scheduler job scripts
  • Working knowledge of continuous integration and deployment processes, along with tools like Code Cloud
  • Developed new segments for New Relic transactions for error handling
  • Always followed standards and procedures for documentation.
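
A minimal sketch of the ADLS Gen2 mount described above, assuming a hypothetical secret scope, service principal, and storage account; `dbutils` is available only inside a Databricks runtime.

```python
# Mount ADLS Gen2 into DBFS via a service principal (Databricks notebook code).
# Secret scope, key names, tenant ID, and storage account are hypothetical.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id":
        dbutils.secrets.get(scope="example-scope", key="sp-client-id"),
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="example-scope", key="sp-client-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://data@examplelake.dfs.core.windows.net/",
    mount_point="/mnt/data",
    extra_configs=configs,
)
```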

Data Engineer

Amgen Healthcare
Greater St. Louis, United States
09.2016 - 01.2020
  • Worked on design, development of complex applications using various technologies
  • Developed data pipelines using big data technologies (Hadoop, Spark, Hive, Pig, Sqoop, NoSQL databases) that deliver value to the customer; understood customer use cases and workflows and translated them into engineering deliverables
  • Implemented a serverless architecture using API Gateway, Lambda, and DynamoDB, deploying AWS code from an Amazon S3 bucket
  • Actively participated in scrum calls, story pointing, and estimates, and owned the development work
  • Analyzed user stories to understand requirements and developed code per the design
  • Developed, created, and modified general computer applications software and specialized utility programs
  • Designed and implemented ETL processes in Hadoop; processed large internal pipelines in StreamSets to import data from Oracle to Hive/Impala; hands-on experience with HBase
  • Strong knowledge of data modeling methodologies such as star schema and snowflake schema, and tools such as the Ab Initio Data Profiler
  • Developed ETL pipelines into and out of the data warehouse using a combination of Python and Snowflake's SnowSQL (see the sketch after this list)
  • Generated log files containing informational events, errors, and warnings
  • Always followed standards and procedures for documentation.
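
A minimal sketch of a Python-driven Snowflake load of the kind described above, using the snowflake-connector-python package; account, credentials, and table names are hypothetical.

```python
# Stage a local extract and bulk-load it into Snowflake with COPY INTO.
# Account, credentials, warehouse, and table names are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account",
    user="etl_user",
    password="...",  # in practice, pulled from a secrets manager
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # Upload the file to the table's internal stage.
    cur.execute("PUT file:///tmp/orders.csv @%ORDERS_RAW")
    # Bulk-load the staged file into the table.
    cur.execute(
        "COPY INTO ORDERS_RAW FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )
finally:
    conn.close()
```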

ETL BI Developer

Accenture
Hyderabad, India
08.2014 - 07.2015
  • Developed SSIS packages to extract, transform, and load (ETL) data into the data warehouse from heterogeneous databases and data sources
  • Identified dimension and fact tables and designed the data warehouse using a star schema
  • Created technical design for the back-end PL/SQL based on business requirement documents and the functional system design
  • Created database objects such as tables, views, synonyms, materialized views, stored procedures, and packages using Oracle tools like PL/SQL Developer
  • Coordinated with the front-end design team to provide them with the necessary stored procedures and packages and the necessary insight into the data
  • Involved in updating procedures, functions, triggers, and packages based on the change request
  • Built complex queries using SQL and wrote stored procedures using PL/SQL
  • Used ref cursors and collections to access complex data resulting from joins across a large number of tables
  • Involved in moving data from flat files to staging tables using SQL*Loader
  • Extensively used FORALL and BULK COLLECT to fetch and process large volumes of data efficiently (see the sketch after this list).
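
The FORALL/BULK COLLECT pattern above is PL/SQL; to keep the examples in a single language, here is a hedged Python analogue using python-oracledb, whose batched fetches and `executemany` play similar roles. DSN, credentials, and table names are hypothetical.

```python
# Python analogue of the PL/SQL bulk patterns above, using python-oracledb.
# DSN, credentials, and table names are hypothetical.
import oracledb

conn = oracledb.connect(
    user="etl_user", password="...", dsn="db.example.com/ORCLPDB1"
)
cur = conn.cursor()

# Large fetch batches play the role of BULK COLLECT ... LIMIT.
cur.arraysize = 10_000
cur.execute("SELECT id, amount FROM staging_orders")
rows = cur.fetchall()

# One executemany call binds all rows at once, much like FORALL.
cur.executemany("INSERT INTO orders_fact (id, amount) VALUES (:1, :2)", rows)
conn.commit()
conn.close()
```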

Education

Master of Science - Computer and Information Systems Security

University of the Cumberlands
Williamsburg, Kentucky
12-2018

Master of Science - Information Technology

Northwestern University
California
12-2016

Bachelor of Science - Computer Engineering

JNTU
Hyderabad
08-2014

Skills

  • Programming languages: Python, SQL, Java
  • Apache Spark: Spark Core, DataFrame, Spark SQL, Spark Streaming, Scala
  • Azure Data Lake, Azure Data Factory
  • Azure Databricks, Azure SQL Database
  • Azure SQL Data Warehouse
  • AWS, S3 buckets
  • GCS buckets
  • BigQuery
  • Data warehousing
  • SQL and databases
  • Big data technologies
  • Data mining
  • Management tools: Jira, Rally
  • Git, Jenkins CI/CD

Languages

English
Professional
