SUDHEER VUGGAM

Vijayawada, Andhra Pradesh

Summary

Around 9 years of experience in software application development, including analysis, design, development, integration, testing, and maintenance of database applications using SQL, T-SQL, PL/SQL, and Python. Experienced in developing database applications on cloud platforms such as Azure and GCP and on on-premises platforms such as Oracle and SQL Server. Currently seeking a Data Engineer role in an esteemed organization where I can improve the company’s data reliability and architecture.

Overview

10 years of professional experience

Work History

Data Engineer

UTMB
04.2022 - Current
  • Design and implement database solutions in Azure SQL Data Warehouse and Azure SQL
  • Architect and implement data solutions on Azure using Azure Data Platform services (Azure Data Lake, Data Factory, Synapse, Azure SQL DW, Serverless SQL pool, Logic Apps, and Databricks)
  • Pulled data from on-premises Oracle into the Azure cloud and stored it in Data Lake Storage Gen2
  • Designed the data warehouse architecture on the Serverless SQL pool and stored the data in Delta tables
  • Used Synapse pipelines extensively to bring data from on-premises sources to the cloud
  • Designed triggers and email alerts so that the client is notified whenever a pipeline fails
  • Pulled data from a variety of sources, such as SQL Server, CSV files, Oracle, and Azure SQL
  • Worked on the initial design of the architecture and successfully completed multiple projects, including HCM and FMS
  • Delivered Power BI reports using both Import and DirectQuery modes
  • Helped create Azure DevOps pipelines for CI/CD from one environment to another
  • Developed Spark applications using Spark and Spark SQL for data extraction, transformation, and aggregation across multiple file formats to uncover insights into customer usage patterns (see the sketch after this list).
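A minimal, illustrative PySpark sketch of the kind of multi-format extraction and aggregation job described above, assuming hypothetical lake paths and column names; it is not the production job itself.

```python
# Illustrative PySpark job: read multiple file formats, aggregate usage data,
# and write the result as a Delta table. Paths and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("usage_aggregation").getOrCreate()

# Extract: the same job can ingest CSV extracts and Parquet files landed in the lake.
csv_df = spark.read.option("header", True).csv("/mnt/datalake/raw/usage_csv/")
parquet_df = spark.read.parquet("/mnt/datalake/raw/usage_parquet/")

# Transform: align both sources on a common set of columns and union them.
cols = ["user_id", "event_date", "event_count"]
events = (
    csv_df.select(*cols)
          .withColumn("event_count", F.col("event_count").cast("long"))
          .unionByName(parquet_df.select(*cols))
)

# Aggregate: daily usage per user, the shape typically consumed by reports.
daily_usage = (
    events.groupBy("user_id", "event_date")
          .agg(F.sum("event_count").alias("total_events"))
)

# Load: persist as a Delta table for the Serverless SQL pool / reporting layer.
daily_usage.write.format("delta").mode("overwrite").save("/mnt/datalake/curated/daily_usage")
```

On Databricks, the Delta output can then be exposed through the Serverless SQL pool or read directly by Power BI.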

Data Engineer

Gilead Sciences
10.2019 - 09.2020
  • Migrated existing data from an Oracle database to BigQuery as part of the platform migration to GCP
  • Built data pipelines in Airflow on GCP for ETL jobs using different Airflow operators
  • Built and architected multiple data pipelines covering end-to-end ETL and ELT processes for data ingestion and transformation in GCP, and coordinated tasks among the team
  • Experience with GCP Dataproc, GCS, Cloud Functions, and BigQuery
  • Used the Cloud Shell SDK in GCP to configure Dataproc, Storage, and BigQuery services; coordinated with the team and developed a framework to generate daily ad hoc reports and extracts from enterprise data in BigQuery
  • Wrote a Python program to maintain raw file archival in GCS buckets (see the sketch after this list)
  • Loaded transactional data every 15 minutes on an incremental basis into BigQuery raw and staging layers using Dataproc, Pub/Sub, GCS buckets, Hive, Spark, Scala, Python, gsutil, and shell scripts
  • Designed and coordinated with the Data Science team in implementing advanced analytical models on a Hadoop cluster over large datasets
  • Wrote Hive SQL scripts to create complex tables with performance features such as partitioning, clustering, and skewing
  • Downloaded BigQuery data into pandas and Spark data frames for advanced ETL capabilities
  • Worked with Google Data Catalog and other Google Cloud APIs for monitoring, query, and billing analysis of BigQuery usage
  • Created a POC utilizing ML models and Cloud ML for table quality analysis in the batch process
  • Created BigQuery authorized views for row-level security and for exposing data to other teams.
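A minimal sketch of the raw-file archival step mentioned above, using the google-cloud-storage client; the bucket name and prefix layout are hypothetical stand-ins.

```python
# Illustrative raw-file archival in GCS: move processed raw files into a dated
# archive prefix. Bucket names and the prefix layout are hypothetical.
from datetime import date
from google.cloud import storage

def archive_raw_files(bucket_name: str, raw_prefix: str = "raw/", archive_prefix: str = "archive/") -> None:
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    stamp = date.today().isoformat()

    for blob in client.list_blobs(bucket_name, prefix=raw_prefix):
        target = blob.name.replace(raw_prefix, f"{archive_prefix}{stamp}/", 1)
        # Copy the object under the archive prefix, then delete the original.
        bucket.copy_blob(blob, bucket, target)
        blob.delete()

if __name__ == "__main__":
    archive_raw_files("my-landing-bucket")  # hypothetical bucket name
```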

Data Engineer

J B Hunt
09.2020 - 03.2022
  • Analyzed, designed, and built modern data solutions using Azure PaaS services to support data visualization
  • Assessed the current production state of the application and determined the impact of new implementations on existing business processes
  • Extracted, transformed, and loaded data from source systems into Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics)
  • Ingested data into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks
  • Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from different sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back scenarios
  • Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation across multiple file formats to uncover insights into customer usage patterns
  • Responsible for estimating cluster size and for monitoring and troubleshooting the Spark Databricks cluster
  • Experienced in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory usage
  • Developed UDFs in Scala and PySpark to support specific business requirements (see the sketch after this list)
  • Developed JSON scripts for deploying ADF pipelines that process data using the SQL activity
  • Hands-on experience developing SQL scripts for automation
  • Created build and release definitions for multiple projects (modules) in the production environment using Visual Studio Team Services (VSTS).
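An illustrative PySpark UDF of the kind referenced above; the business rule and column names are hypothetical and shown only to make the pattern concrete.

```python
# Illustrative PySpark UDF encoding a business rule that is awkward to express
# in plain Spark SQL; the banding rule and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf_example").getOrCreate()

@F.udf(returnType=StringType())
def shipment_band(weight_lbs: float) -> str:
    """Bucket shipment weights into reporting bands."""
    if weight_lbs is None:
        return "unknown"
    if weight_lbs < 150:
        return "parcel"
    if weight_lbs < 10000:
        return "ltl"
    return "truckload"

shipments = spark.createDataFrame(
    [("S1", 90.0), ("S2", 4200.0), ("S3", 26000.0)],
    ["shipment_id", "weight_lbs"],
)
shipments.withColumn("band", shipment_band("weight_lbs")).show()
```

In practice, built-in Spark SQL functions are preferred where they exist, since Python UDFs add serialization overhead.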

Data Engineer

CSS
05.2017 - 10.2019
  • Responsible for Business Analysis and Requirements Collection
  • Worked on Informatica Power Center tools- Designer, Repository Manager, Workflow Manager, and Workflow Monitor
  • Parsed high-level design specification to simple ETL coding and mapping standards
  • Designed and customized data models for Data warehouse supporting data from multiple sources
  • Involved in building the ETL architecture and Source to Target mapping to load data into Data warehouse
  • Created mapping documents to outline data flow from sources to targets
  • Involved in Dimensional modeling (Star Schema) of the Data warehouse and used Erwin to design the business process, dimensions and measured facts
  • Extracted data from flat files and other RDBMS databases into the staging area and populated the data warehouse
  • Maintained stored definitions, transformation rules, and target definitions
  • Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings in the Informatica Designer
  • Developed mapping parameters and variables to support SQL override
  • Created mapplets to use them in different mappings
  • Developed mappings to load into staging tables and then to Dimensions and Facts
  • Used existing ETL standards to develop these mappings
  • Created sessions and configured workflows to extract data from various sources, transform it, and load it into the data warehouse
  • Used Type 1 and Type 2 SCD mappings to update Slowly Changing Dimension tables (the Type 2 logic is sketched conceptually after this list)
  • Extensively used SQL*Loader to load data from flat files into Oracle database tables
  • Modified existing mappings for enhancements of new business requirements
  • Used Debugger to test the mappings and fixed the bugs
  • Wrote UNIX shell scripts and pmcmd commands for FTP of files from remote servers and for backup of the repository and folders
  • Involved in Performance tuning at source, target, mappings, sessions, and system levels
  • Prepared migration document to move the mappings from development to testing and then to production repositories.
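The Type 2 SCD work above was implemented in Informatica mappings; the following pandas sketch is only a conceptual illustration of the same expire-and-insert logic, with hypothetical column names.

```python
# Conceptual Type 2 SCD maintenance: expire the current version of changed keys
# and append new current rows. Columns (customer_id, address, etc.) are hypothetical.
import pandas as pd

def apply_scd2(dim: pd.DataFrame, incoming: pd.DataFrame, load_date: str) -> pd.DataFrame:
    """Return the dimension with Type 2 history applied for the incoming records."""
    current = dim[dim["is_current"]][["customer_id", "address"]]
    joined = incoming.merge(current, on="customer_id", how="left",
                            suffixes=("", "_curr"), indicator=True)

    changed = joined.loc[(joined["_merge"] == "both") &
                         (joined["address"] != joined["address_curr"]), "customer_id"]
    brand_new = joined.loc[joined["_merge"] == "left_only", "customer_id"]

    # Type 2: close out the old version of every changed key...
    expire_mask = dim["customer_id"].isin(changed) & dim["is_current"]
    dim.loc[expire_mask, "is_current"] = False
    dim.loc[expire_mask, "end_date"] = load_date

    # ...and insert a fresh current row for changed and brand-new keys.
    inserts = incoming[incoming["customer_id"].isin(pd.concat([changed, brand_new]))].copy()
    inserts["start_date"] = load_date
    inserts["end_date"] = None
    inserts["is_current"] = True
    return pd.concat([dim, inserts], ignore_index=True)
```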

Data Engineer

FedEx
02.2014 - 11.2015
  • Performed logical and physical data modeling with Erwin for the data warehouse database in a star schema
  • Using Informatica PowerCenter Designer, analyzed the source data and extracted and transformed it from various source systems (Oracle 10g, DB2, SQL Server, and flat files), incorporating business rules with the objects and functions the tool supports
  • Using Informatica PowerCenter created mappings to transform the data according to the business rules
  • Used various transformations like Source Qualifier, Joiner, Lookup, SQL, router, Filter, Expression and Update Strategy
  • Implemented slowly changing dimensions (SCD) for some of the Tables as per user requirement
  • Developed stored procedures and used them in the Stored Procedure transformation for data processing, and used data migration tools
  • Documented Informatica mappings in an Excel spreadsheet
  • Tuned the Informatica mappings for optimal load performance
  • Used the Teradata utilities BTEQ, FastExport (FEXP), FastLoad (FLOAD), and MultiLoad (MLOAD) to export and load data to and from flat files
  • Created and Configured Workflows and Sessions to transport the data to target warehouse Oracle tables using Informatica Workflow Manager
  • Generated reports using OBIEE 10.1.3 for future business use
  • Worked with the UNIX team to write shell scripts that customize server job scheduling
  • Constantly interacted with business users to discuss requirements.

Education

Master’s Degree - MSIS

Wilmington University
01.2017

Bachelor’s Degree - Information Technology

Vignan University, A.P, India
01.2014

Skills

  • Windows (all versions), UNIX, Linux, macOS, Sun Solaris
  • SQL, Python, Java, Spark
  • Microsoft SQL Server 2008/2010/2012, MySQL 4.x/5.x, Oracle 11g/12c, DB2, Teradata
  • Jenkins, Azure DevOps
  • Microsoft SQL Studio, IntelliJ, Eclipse, NetBeans
