Around 24 months of professional experience across the full Software Development Life Cycle (SDLC), Agile methodology, and maintenance, working with Azure, Azure Databricks, Data Factory, Azure Synapse Analytics, Azure Migrate, CloudFormation, CloudWatch, data warehousing, Hive, Power BI, PySpark, and Python.
Exceptional understanding of relational databases, including MySQL and SQL Server.
Strong knowledge of core Spark components, including RDDs, the DataFrame and Dataset APIs, data streaming, in-memory capabilities, DAG scheduling, data partitioning, and tuning.
Experienced in working in an Agile Scrum environment and familiar with Agile methodologies.
Good knowledge of data ingestion using Kafka and Flume.
Overview
3 years of professional experience
Work History
Azure Data Engineer
Digicloud LLC
05.2023 - Current
Designed and implemented data pipelines in Azure Synapse Analytics to support business intelligence and reporting requirements.
Implemented the ETL process for loading data from various sources into Databricks tables and Azure Synapse tables.
Used Data Factory to develop pipelines and performed batch processing using Azure Batch.
Designed and developed pipelines to move data from Azure Blob Storage and file shares to SQL Data Warehouse.
Developed Spark applications using Spark SQL in Databricks to extract, transform, and aggregate data from multiple file formats, uncovering insights into customer usage patterns.
Responsible for estimating cluster size and for monitoring and troubleshooting the Databricks cluster.
Worked with Microsoft technologies including Azure Data Lake, Azure Databricks, Azure Data Factory, and Azure SQL Data Warehouse.
Azure Data Engineer
Quiddity Infotech
05.2022 - 04.2023
Developed and maintained ETL processes to ingest, transform, and load data into Azure Synapse Analytics using Azure Data Factory and Azure Logic Apps.
Developed and maintained scalable data pipelines and built out new data source integrations to support continuing increases in data volume and complexity.
Built processes supporting data transformation, data structures, metadata, dependency and workload management.
Connected multiple data sources in Power BI to implement working reports.
Implemented medium- to large-scale BI solutions on Azure using Azure Data Platform services (Azure Data Lake, Data Factory, Data Lake Analytics, Stream Analytics, Azure SQL DW, HDInsight/Databricks).
Developed advanced analytics solutions in collaboration with data scientists and analysts, providing business decision-makers with data-driven insights.
Implemented data security and access controls in accordance with industry best practices and company policies to protect sensitive data.
Regularly optimized and tuned SQL queries and stored procedures to reduce cost and resource usage.
Conducted regular monitoring and troubleshooting of data pipelines, addressing issues promptly to minimize disruptions.
Documented data architecture, data flows, and technical specifications to ensure knowledge transfer and maintain data lineage.
Data Engineer
Scotline Tech
12.2020 - 07.2021
Developed Spark applications using Spark SQL in Databricks to extract, transform, and aggregate data from multiple file formats, uncovering insights into customer usage patterns.
Developed PySpark applications to extract data from multiple TPAs using Python and shell scripting.
Developed Python APIs to dump the array structures in the processor at the failure point for debugging.
Developed scripts to create Hive table DDL and run ANALYZE TABLE from PySpark jobs.
Worked on SQL databases and wrote SQL scripts to create stored procedures, views, tables, etc.
Generated and documented metadata while designing OLTP and OLAP system environments.
Education
Master of Engineering -
University of Cincinnati
Bachelor of Technology -
CVR College of Engineering
Skills
Technical Summary:
Big Data Technologies:
Azure, Data Factory, Databricks, Spark, PySpark, Hive, Hadoop