Sushil Baddhan

Gurgaon, Haryana

Summary

Results-oriented Data Engineer with 12+ years of diverse IT experience, specializing in Data Engineering, Data Analysis, and Application Support & Deployment. Proficient in Azure Databricks, Azure Data Factory, Azure Data Lake, Azure DevOps, PySpark, Spark SQL, Python, ETL, and Data Warehousing. Skilled in data governance, ETL/ELT processes, and data modeling for actionable insights. Committed to delivering high-quality solutions that drive business success.

Overview

13 years of professional experience
1 Certification

Work History

Senior Data Engineer

DXC
12.2022 - Current
  • Built end-to-end data ingestion, transformation, and loading processes using Azure Databricks, Azure Data Factory, PySpark, Python, Spark SQL, and ADLS to combine data from multiple sources
  • Analyzed computationally intensive PL/I programs and converted them into optimized Python code that serves as the foundation for the raw layer
  • Applied complex change data capture (CDC) logic to compare daily data from the raw layer against the target table's active records, loaded the differences (Insert, Update, Delete, Same) into a staging table, and upserted the target table from staging to keep it current
  • Orchestrated data pipelines in Databricks Workflows with dependencies on upstream and downstream pipelines
  • Engineered robust CI/CD pipelines on GitHub and Azure DevOps to streamline data solution development, ensuring quality, efficiency, and auditability through automated testing, deployment, and version control
  • Engineered a Python-based data validation solution to compare source and target table structures and content, automating the correction of inconsistencies

Azure Data Engineer

WNS Global Services
12.2019 - 12.2022
  • Used a star schema to structure data warehouses into three layers: Data Staging for initial data processing, Data Warehouse for integrated data storage, and Data Marts for business-specific subsets
  • Developed efficient Python scripts and optimized SQL queries to clean, transform, and refine data within the Databricks environment
  • Employed MERGE statements to efficiently load and update data in DWH tables, adhering to SCD2 principles
  • Applied advanced data engineering techniques such as partitioning, coalescing, joining, and repartitioning to significantly improve query performance
  • Engineered a sophisticated ADF notebook scheduling system utilizing inter-notebook dependencies and parameterized execution
  • Implemented automated CI/CD pipelines using GitHub and Azure DevOps

Data Analyst - Credit Score Rating

TCS
06.2016 - 11.2019
  • Explored data to examine variable types, data structure, and missing values using PROC MEANS and PROC CONTENTS
  • Validated data to ensure all values were correct and matched the data dictionary, and automated the existing data extraction using SAS macros and PROC SQL
  • Computed descriptive statistics for each variable, using PROC UNIVARIATE for continuous variables and PROC FREQ for discrete variables, and analyzed the variables for missing values and outliers
  • Based on the percentage of missing values, replaced them with values from related variables, mean values, or dummy variables (dropping the original)

Application Development and Support

TCS
06.2012 - 06.2016
  • Captured customer needs, translated them into code, and ensured quality through unit testing
  • Made code changes in Java classes using the MasterCraft tool
  • Made the corresponding DML and DDL changes in Oracle SQL
  • Prepared and executed test cases by analyzing use cases and business rules; performed unit testing and version-controlled code using Git before promoting it to higher environments
  • Validated changes through the front end and SQL queries in the test environment

Education

B.Tech

National Institute of Technology
01.2012

Skills

  • Azure Data Factory
  • Azure Databricks
  • Azure SQL
  • Python
  • PySpark
  • Spark SQL
  • Oracle
  • SAS
  • Data Modeling
  • Data Warehousing
  • Git
  • Azure DevOps

Certification

  • Databricks Advanced Data Engineering training
  • Base SAS Programmer

Interests

  • Playing tennis and squash
  • Marathon runner
