Harish A

Southgate, MI

Summary

  • Data Engineer with 6+ years of experience across the data pipeline, from acquiring and validating large structured and unstructured datasets to building data models, developing reports, and using visualization tools to deliver insights.
  • Proficient in SQL, database design, and ETL pipeline implementation. Proven ability to leverage a wide range of cloud services, including Azure (Azure DevOps, Azure Data Lake, Azure Data Factory, Azure Databricks) and AWS (EC2, S3, RDS, Lambda, Glue, Athena, AWS Pipeline, Redshift) to design, build, and manage data solutions.
  • Experience in developing enterprise solutions utilizing batch processing with Databricks and streaming frameworks, including Spark Streaming, Apache Kafka, Apache Airflow, and Apache Flink, ensuring high throughput and low latency in data processing.
  • Experience in developing and deploying continuous integration/continuous delivery (CI/CD) pipelines for Snowflake objects, ensuring streamlined deployments and minimizing the risk of errors.
  • Actively involved in the migration of SQL databases to various Azure services, including Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse.
  • Responsible for transferring on-premises databases to Azure Data Lake Store utilizing Azure Data Factory.
  • Proficient in data extraction, transformation, and loading (ETL) processes, as well as data modeling and visualization.
  • Proficient in creating visually appealing and interactive dashboards in Power BI and Tableau, incorporating features like drill-down and dropdown menus.
  • Experience in database design and development using SQL Server business intelligence tools: SQL Server Integration Services (SSIS), DTS Packages, SQL Server Analysis Services (SSAS), DAX, OLAP Cubes, Star Schema, and Snowflake Schema.
  • Utilized various ETL and BI tools, including Informatica PowerCenter, Microsoft SQL Server Integration Services (SSIS), and Tableau, to deliver robust and scalable data solutions.
  • Strong skills in visualization tools, including Power BI, Tableau, and advanced Excel (formulas, Pivot Tables, and charts), as well as DAX.
  • Strong experience in creating advanced chart types, visualizations, and complex calculations in Power BI and Tableau to shape data to business needs.
  • Expertise in developing visualization solutions with Power BI and Tableau, covering analysis, design, development, testing, and implementation in Data Warehouse/BI environments.
  • Demonstrated ability to engage with clients, comprehend business applications, understand data flow, and identify data relationships.
  • Strong expertise in database technologies, such as SQL, Oracle, and MySQL.
  • Experience in data analysis, data cleansing, and data verification using SQL queries and MS Excel.

Overview

7 years of professional experience
1 Certification

Work History

Azure Data Engineer (contract)

PNC Bank
02.2024 - Current
  • Install and configure Azure services like virtual machines, app services, ADF, and Databricks
  • Collaborate on extracting, transforming, and loading data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics)
  • Ingest data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and process it in Azure Databricks
  • Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data to and from sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and a write-back tool
  • Developed Azure Databricks notebooks to load data from Databricks into Azure SQL
  • Involved in implementing medium to large-scale BI solutions on Azure using Azure Data Platform Services (Azure Data Lake, Data Factory, Data Lake Analytics, Azure SQL DW)
  • Worked on developing the data pipelines in ADF to load data from on-premises Oracle to Azure SQL
  • Design and develop event-driven architectures using blob triggers and Data Factory
  • Develop batch processing solutions with Azure Databricks and Azure Event
  • Design and develop data models in Power BI, ensuring accurate data representation and efficient performance
  • Create sub-reports, drill-down reports, summary reports, and ad hoc reports using Power BI, and develop complex stored procedures and functions to generate dashboard reports
  • Participate in database architecture and data modeling design
  • Implemented advanced analytics features in Power BI, such as DAX calculations, time intelligence, and forecasting, to provide valuable insights
  • Integrate Tableau with other BI tools, databases, and third-party applications
  • Employed Azure Data Lake Storage, leading to a 30% improvement in data storage and retrieval efficiency for large-scale datasets
  • Developed a Power BI dashboard to visualize key business KPIs, saving 2 hours per week of manual reporting
  • Implemented and deployed Spark applications on Databricks clusters for data processing, achieving a 10% improvement in data processing speed
  • Solution Environment: Power BI Reports, SQL Server, Shell Scripting, JIRA, ADF, Snowflake, Tableau, Azure Data Lake, Airflow, PySpark, Python, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse

Data Engineer

T-Mobile
09.2022 - 12.2023
  • Developed and executed notebooks in Azure Databricks for interactive data exploration, visualization, and collaborative data analysis with data scientists
  • Employed Snowflake materialized views, data masking, and optimization to deliver efficient and secure data solutions for seamless data management and advanced big data analytics
  • Crafted and modified complex SQL queries, leveraging indexing and query optimization techniques to enhance database efficiency, leading to a 30% improvement in application response time
  • Achieved a 10% improvement in data processing efficiency by optimizing data flows within Azure Data Factory
  • Leveraged Databricks as a cloud-based platform for deploying and managing Apache Spark workloads, enabling efficient resource allocation and scalability for big data processing
  • Designed AWS Data Pipeline workflows for efficient ETL processing, which reduced data processing time by 20%
  • Monitored and maintained the entire data pipeline infrastructure (Kafka, Snowflake, MongoDB) to ensure high availability and real-time data processing for fraud detection

Data Analyst / BI Consultant

Invitrogen Bio-Services India Pvt Ltd
08.2018 - 06.2021
  • Company Overview: India
  • Applied statistical methods, data visualization tools, and data mining techniques to extract meaningful insights and identify trends, using SQL, Python, and R for data querying
  • Created interactive Power BI reports with drill-throughs, slicers, and filters to enable users to explore data and answer ad hoc questions
  • Collaborated with data engineers to ensure the availability and reliability of data sources for Power BI reporting
  • Implemented performance tuning techniques by identifying and resolving bottlenecks in sources, targets, transformations, mappings, and sessions
  • Understood the functional requirements
  • Responsible for identifying the missed records in different stages from source to target and resolving the issue
  • Collaborated with business analysts and data stakeholders to understand reporting requirements and translate them into effective Power BI solutions
  • Designed and developed interactive and visually appealing Power BI reports, dashboards, and scorecards to visualize complex data sets
  • Integrated Power BI with various data sources, including SQL databases, Excel files, and APIs, to create comprehensive and dynamic reports
  • Utilized advanced DAX calculations, measures, and KPIs to provide meaningful insights and actionable business intelligence
  • Implemented data transformations and modeling techniques in Power BI to ensure data accuracy, consistency, and optimal performance
  • Conducted data analysis and identified trends, patterns, and anomalies to support decision-making processes and improve business performance

Education

Master's - Big Data Analytics

University of Central Missouri
Missouri
05.2024

Skills

  • Python
  • R
  • SQL
  • PyCharm
  • Jupyter Notebook
  • Hadoop
  • Hive
  • Apache Airflow
  • Apache Kafka
  • Apache Spark
  • Apache Flink
  • Databricks
  • Azure
  • Azure DevOps
  • Azure Data Lake
  • Azure Data Factory
  • Azure Databricks
  • AWS
  • EC2
  • S3
  • RDS
  • Lambda
  • Glue
  • Athena
  • AWS Pipeline
  • Redshift
  • Tableau
  • Power BI
  • Excel
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn
  • TensorFlow
  • PySpark
  • Data Pipelines
  • Jenkins
  • GitHub
  • Git
  • SQL Server
  • PostgreSQL
  • MongoDB
  • DynamoDB
  • MySQL
  • Snowflake
  • Windows
  • macOS

Certification

  • Certified Microsoft Azure Data Engineer Associate (DP-203)
  • Certified Microsoft Azure Fundamentals (AZ)
