Divya Arulprakash

Germantown, USA

Summary

Results-driven Azure Data Engineer with 4 years of experience designing, developing, and optimizing data solutions on the Microsoft Azure platform. Expertise in building end-to-end data pipelines with Azure Data Factory, Azure Databricks, and Azure Synapse Analytics to integrate and transform data from diverse sources. Proficient in ETL/ELT processes, data modeling, and performance optimization that enable data-driven decision-making. Skilled in SQL, Python, and PySpark, and well-versed in data governance, security, and compliance practices. An additional year of experience as a SQL DBA supports a holistic approach to data management. Certified in Azure Data Engineering and Azure Data Fundamentals, with a strong commitment to continuous innovation and operational excellence.

Overview

5 years of professional experience

Work History

Data Engineer (Contract)

Larish Consulting Services
01.2021 - Current
  • Designed and developed robust data pipelines in Azure Data Factory (ADF) and Databricks using PySpark, implementing Change Data Capture (CDC) and business logic transformations with Delta Lake to ensure data accuracy and integrity.
  • Built and optimized ETL/ELT workflows to extract, load, and transform data from diverse source systems into Azure Data Lake Storage (ADLS Gen2) using ADF, Databricks, and Synapse Analytics with PySpark and Spark SQL.
  • Created both batch and streaming data pipelines in ADF, integrating data from relational and streaming sources like Azure Event Hub and Apache Kafka into ADLS Gen2.
  • Optimized PySpark jobs by tuning configurations, partitioning strategies, caching mechanisms, and handling skewed data, ensuring high-performance processing.
  • Developed reusable ADF pipelines for incremental and full data ingestion, supporting truncate-and-reload and end-to-end workflows based on client requirements.
  • Designed and implemented star schemas in Azure Synapse Analytics for sales trend analysis, leveraging partitioning and columnstore indexing to improve query efficiency.
  • Directed migration of 2 TB of historical sales data from on-premises OLAP systems to Azure Synapse Analytics, achieving seamless integration and reducing query response times.
  • Collaborated with data scientists to apply Azure Machine Learning for real-time predictive analytics, enhancing business forecasting and proactive sales strategies.
  • Transformed semi-structured JSON events from Event Hub into business-aligned formats using PySpark/Python before loading them into Snowflake and other data stores.
  • Implemented security policies using Azure RBAC and encryption mechanisms, ensuring compliance with data protection regulations and safeguarding sensitive data.
  • Monitored and improved pipeline performance via Azure Monitor and Log Analytics, addressing issues proactively to ensure consistent system reliability.
  • Established data governance frameworks with Azure Purview, enhancing data lineage, visibility, and compliance throughout the data lifecycle.
  • Migrated and managed ADF pipelines using ARM templates, integrating Azure Key Vault, triggers, and Integration Runtimes for enhanced flexibility and security.
  • Documented best practices and created knowledge-sharing materials to improve team collaboration and consistency.
  • Conducted proofs of concept to optimize the architecture of cloud data lakes, resulting in improved performance and cost-efficiency.

Data Engineer

Larish Consulting Solutions Inc (Volunteer Work)
01.2020 - 01.2021
  • Optimized data processing by implementing efficient ETL pipelines and streamlining database design.
  • Designed data models for complex analysis needs.
  • Reviewed project requests describing database user needs to estimate time and cost required to accomplish projects.
  • Fine-tuned query performance and optimized database structures for faster, more accurate data retrieval and reporting.

Education

MBA - Human Resources Management

Sathyabama Institute of Science And Technology
Chennai, TN, India
05-2013

Bachelor of Science - Visual Communication

Sathyabama Institute of Science And Technology
Chennai, TN, India
05-2011

Skills

  • Cloud Platforms: Azure Data Lake Storage Gen2, Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Azure Stream Analytics, Azure Event Hub, Azure Logic Apps, AWS S3, AWS Glue, AWS Lambda, AWS EMR, Google Cloud Storage
  • Data Technologies: SQL, T-SQL, PL/SQL, Data Warehousing, Data Modeling (Star/Snowflake), Delta Lake, Change Data Capture (CDC), Data Governance (Azure Purview), Data Catalogs, ETL/ELT
  • Programming Languages: Python, PySpark, Scala, R, SQL, JavaScript, Shell Scripting
  • Big Data Frameworks & Processing: Apache Spark, Hive, Kafka, Hadoop, Snowflake, Apache Flink
  • Data Visualization Tools: Power BI, Tableau, Google Data Studio, Apache Superset, QlikView
  • Databases: SQL Server, Oracle Database, Cosmos DB, MySQL, PostgreSQL, Snowflake, AWS Redshift, DynamoDB, MongoDB, Cassandra
  • DevOps & Automation Tools: Azure DevOps (CI/CD), Jenkins, Git, GitHub, Bitbucket, Terraform, Ansible, ARM Templates, Kubernetes, Docker
  • Monitoring & Logging: Azure Monitor, Log Analytics, AWS CloudWatch, Splunk
  • Methodologies: Agile, DevOps
  • Tools & Technologies: Visual Studio Code, JIRA, Confluence, Postman, Fiddler, SSMS, Storage Explorer

Certifications and awards

  • Microsoft Azure Data Fundamentals (DP-900T00-A), 01/01/23
  • Data Engineering on Microsoft Azure (DP-203T00-A), 01/01/24
  • Databricks Generative AI Accreditation, 01/01/24
