Data Engineer with 6+ years of experience across the data pipeline, from acquiring and validating large structured and unstructured datasets to building data models, developing reports, and using visualization tools to deliver actionable insights.
Proficient in SQL, database design, and ETL pipeline implementation. Proven ability to leverage a wide range of cloud services, including Azure (Azure DevOps, Azure Data Lake, Azure Data Factory, Azure Databricks) and AWS (EC2, S3, RDS, Lambda, Glue, Athena, AWS Pipeline, Redshift) to design, build, and manage data solutions.
Experience in developing enterprise solutions using batch processing with Databricks and streaming and orchestration frameworks, including Spark Streaming, Apache Kafka, Apache Airflow, and Apache Flink, ensuring high throughput and low latency in data processing.
Experience in developing and deploying continuous integration/continuous delivery (CI/CD) pipelines for Snowflake objects, ensuring streamlined deployments and minimizing the risk of errors.
Actively involved in the migration of SQL databases to various Azure services, including Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse.
Responsible for transferring on-premises databases to Azure Data Lake Store utilizing Azure Data Factory.
Proficient in data extraction, transformation, and loading (ETL) processes, as well as data modeling and visualization.
Proficient in creating visually appealing and interactive dashboards in Power BI and Tableau, incorporating features like drill-down and dropdown menus.
Experience in database design and development using business intelligence tools and techniques: SQL Server Integration Services (SSIS), DTS Packages, SQL Server Analysis Services (SSAS), DAX, OLAP Cubes, Star Schema, and Snowflake Schema.
Utilized various ETL and BI tools, including Informatica PowerCenter, Microsoft SQL Server Integration Services (SSIS), and Tableau, to deliver robust and scalable data solutions.
Strong skills in visualization tools, including Power BI, Tableau, and advanced Excel (complex formulas, Pivot Tables, Charts), as well as DAX.
Strong experience in creating advanced chart types, visualizations, and complex calculations in Power BI and Tableau to shape data to business needs.
Expertise in developing Power BI and Tableau visualization solutions across analysis, design, development, testing, and implementation in Data Warehouse/BI environments.
Demonstrated ability to engage with clients, comprehend business applications, understand data flow, and identify data relationships.
Strong expertise in database technologies, such as SQL, Oracle, and MySQL.
Experience in data analysis, data cleansing, and data verification using SQL queries and MS Excel.
Overview
7 years of professional experience
1 Certification
Work History
Azure Data Engineer (contract)
PNC Bank
02.2024 - Current
Install and configure Azure services like virtual machines, app services, ADF, and Databricks
Collaborate on extracting, transforming, and loading data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics)
Ingest data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and process the data in Azure Databricks
Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between sources such as Azure SQL, Blob storage, Azure SQL Data Warehouse, and the write-back tool, in both directions
Developed Azure Databricks notebooks to load data from Databricks into Azure SQL
Implement medium- to large-scale BI solutions on Azure using Azure Data Platform Services (Azure Data Lake, Data Factory, Data Lake Analytics, Azure SQL DW)
Develop data pipelines in ADF to load data from on-premises Oracle databases to Azure SQL
Design and develop event-driven architectures using blob triggers and Data Factory
Develop batch processing solutions with Azure Databricks and Azure Event
Design and develop data models in Power BI, ensuring accurate data representation and efficient performance
Create sub-reports, drill-down reports, summary reports, and ad-hoc reports in Power BI, and develop complex stored procedures and functions to generate dashboard reports
Participate in database architecture and data modeling design
Implemented advanced analytics features in Power BI, such as DAX calculations, time intelligence, and forecasting, to provide valuable insights
Integrate Tableau with other BI tools, databases, and third-party applications
Employed Azure Data Lake Storage, leading to a 30% improvement in data storage and retrieval efficiency for large-scale datasets
Developed a Power BI dashboard to visualize key business KPIs, saving 2 hours per week on manual reporting
Implement and deploy Spark applications on Databricks clusters for data processing, achieving a 10% improvement in data processing speed
Solution Environment: Power BI Reports, SQL Server, Shell Scripting, JIRA, ADF, Snowflake, Tableau, Azure Data Lake, Airflow, PySpark, Python, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse
Data Engineer
T-Mobile
09.2022 - 12.2023
Developed and executed notebooks in Azure Databricks for interactive data exploration, visualization, and collaborative data analysis with data scientists
Employed Snowflake materialized views, data masking, and optimization to deliver efficient and secure data solutions for seamless data management and advanced big data analytics
Crafted and modified complex SQL queries, leveraging indexing and query optimization techniques to enhance database efficiency, leading to a 30% improvement in application response time
Achieved a 10% improvement in data processing efficiency by optimizing data flows within Azure Data Factory
Leveraged Databricks as a cloud-based platform for deploying and managing Apache Spark workloads, enabling efficient resource allocation and scalability for big data processing
Designed AWS Data Pipeline workflows for an efficient ETL process, reducing data processing time by 20%
Monitored and maintained the entire data pipeline infrastructure (Kafka, Snowflake, MongoDB) to ensure high availability and real-time data processing for fraud detection
Data Analyst / BI Consultant
Invitrogen Bio-Services India Pvt Ltd
08.2018 - 06.2021
Company Overview: India
Applied statistical methods, data visualization tools, and data mining techniques to extract meaningful insights and identify trends, using SQL, Python, and R for data querying
Created interactive Power BI reports with drill-throughs, slicers, and filters to enable users to explore data and answer ad-hoc questions
Collaborated with data engineers to ensure the availability and reliability of data sources for Power BI reporting
Implemented performance tuning by identifying and resolving bottlenecks in sources, targets, transformations, mappings, and sessions
Analyzed and understood the functional requirements
Identified records missing at different stages from source to target and resolved the underlying issues
Collaborated with business analysts and data stakeholders to understand reporting requirements and translate them into effective Power BI solutions
Designed and developed interactive and visually appealing Power BI reports, dashboards, and scorecards to visualize complex data sets
Integrated Power BI with various data sources, including SQL databases, Excel files, and APIs, to create comprehensive and dynamic reports
Utilized advanced DAX calculations, measures, and KPIs to provide meaningful insights and actionable business intelligence
Implemented data transformations and modeling techniques in Power BI to ensure data accuracy, consistency, and optimal performance
Conducted data analysis and identified trends, patterns, and anomalies to support decision-making processes and improve business performance
Education
Master's - Big Data Analytics
University of Central Missouri
Missouri
05.2024
Skills
Python
R
SQL
PyCharm
Jupyter Notebook
Hadoop
Hive
Apache Airflow
Apache Kafka
Apache Spark
Apache Flink
Databricks
Azure
Azure DevOps
Azure Data Lake
Azure Data Factory
Azure Databricks
AWS
EC2
S3
RDS
Lambda
Glue
Athena
AWS Pipeline
Redshift
Tableau
Power BI
Excel
NumPy
Pandas
Matplotlib
Seaborn
TensorFlow
PySpark
Data Pipelines
Jenkins
GitHub
Git
SQL Server
PostgreSQL
MongoDB
DynamoDB
MySQL
Snowflake
Windows
macOS
Certification
Microsoft Certified: Azure Data Engineer Associate (DP-203)
International Trade Specialist at PNC Bank, Pittsburgh National Corporation Bank