Rahul M

Data Engineer/Data Analyst

Summary

Analytical and solution-oriented Data Analyst/Data Engineer with over 5 years of experience designing, developing, and optimizing scalable data pipelines and analytical solutions across cloud and distributed environments, primarily on Microsoft Azure and Databricks. Proven ability to transform structured and unstructured data into actionable business insights through ETL/ELT processes, advanced data modeling, and real-time processing frameworks using PySpark, Scala, and SQL. Strong expertise in applying statistical analysis, data mining, and predictive modeling to uncover trends and support data-driven decision-making. Adept at leveraging tools like Azure Synapse, Delta Lake, Power BI, and Azure Data Factory to deliver measurable business value. Skilled in collaborating across multidisciplinary teams, with excellent communication and problem-solving abilities. Committed to continuous learning, automation, and innovation to improve performance and enable data maturity across organizations.

Overview

5 years of professional experience

Work History

Sr. Data Engineer

Client: Geico Tech, USA
06.2023 - Current
  • Improved data ingestion processes by 30% for insurance claims automation using Scala Spark and Spark SQL.
  • Developed scalable ETL pipelines on Azure, enhancing claims processing and underwriting automation.
  • Designed real-time and batch data pipelines to integrate diverse insurance data for actuarial risk modeling.
  • Managed a Snowflake-based data warehouse, significantly improving retrieval speeds for policyholder information.
  • Implemented Medallion Architecture in Databricks Lakehouse, optimizing data ingestion and governance
    procedures.
  • Leveraged SQL performance tuning techniques to optimize large-scale insurance dataset handling in Snowflake.
  • Established interactive Power BI dashboards to deliver insights into claims performance and customer trends.
  • Integrated Azure Synapse Analytics with Power BI, enhancing actuarial risk analysis capabilities.

Data Engineer

Client: RxAdvance, USA
08.2022 - 04.2023
  • Established scalable ETL processes on AWS employing EMR, Glue, S3, Redshift, Python, Scala, Spark, and SQL to
    manage extensive datasets.
  • Implemented advanced data warehousing in Snowflake that improved retrieval times by 40%.
  • Created efficient data lakes using Hadoop and Hive for robust big data management.
  • Incorporated NoSQL databases including MongoDB and DynamoDB to effectively store unstructured healthcare
    information.
  • Enhanced query performance by 25% through optimization of SQL across various platforms including SQL Server
    and PostgreSQL.
  • Devised real-time streaming solutions with Apache Kafka on AWS for immediate processing of urgent datasets.
  • Automated intricate workflows via Apache Airflow to improve operational efficiency with a 20% reduction in
    manual tasks.
  • Facilitated the integration of Kubernetes with Docker and AWS to optimize container workload deployments.

Data Engineer

Client: (Ikea) TCS, India
06.2021 - 12.2021
  • Optimized ETL pipelines for seamless data extraction from ERP systems into Azure-hosted warehouses.
  • Designed effective ETL/ELT workflows via PySpark and Azure Data Factory to process extensive datasets from
    multiple retail sources.
  • Built scalable data pipelines using Azure Databricks, improving efficiency by 30% in data processing workflows.
  • Implemented Delta Lake solutions to enhance storage optimization and query performance within ADLS Gen2
    environments.
  • Created a large-scale data lake with ADLS Gen2 and Databricks for efficient ingestion processes, resulting in a 30%
    boost in query performance.
  • Developed an analytics platform leveraging Azure Synapse for real-time insights into customer behavior and sales
    metrics.
  • Oversaw SCRUM sprints and backlog management through JIRA to drive Agile project delivery.
  • Migrated legacy on-premises systems to Azure Synapse and ADLS Gen2 while ensuring minimal downtime.

Data Engineer/Data Analyst

Client: IntraLearn Software Corporation, India
03.2019 - 05.2021
  • Created a business-requirements gathering framework in accordance with project scope and SDLC methodology.
  • Applied T-SQL for MS SQL Server and ANSI SQL extensively across multiple disparate databases.
  • Worked alongside senior data engineers to deploy Big Data solutions leveraging Hadoop and Hive.
  • Performed detailed data analysis and modeling to enhance decision-making processes using SQL Server, T-SQL, and
    Oracle.
  • Constructed end-to-end data solutions on Azure for ingestion, storage, processing, and visualization purposes.
  • Managed ETL pipelines using SSIS along with custom Python scripts to streamline workflows effectively.
  • Implemented advanced analytics and data visualizations within Power BI for real-time stakeholder insights.
  • Automated data entry tasks through custom Power Apps solutions, resulting in increased engagement.

Education

Master of Science - Data Science

University of Memphis
Memphis, TN

Skills

Data Visualization & Reporting: Power BI (DAX, Power Query/M, Semantic Model Design, Dashboard Development, Power BI Admin), SSRS, Azure Analysis Services, Tableau

Querying & Scripting: T-SQL, PL/SQL, PostgreSQL, MySQL, Oracle SQL, Power Query (M), DAX, Python, SQL for Data Warehousing

Programming Languages: Python, Java, Scala

Big Data & Distributed Processing: Apache Spark, Hadoop, Kafka, Azure Databricks

Data Integration & ETL: Azure Data Factory (ADF), SSIS, Azure Databricks, SQL Server Agent, Power BI Dataflows

Databases & Warehousing: SQL Server, Azure SQL Database, PostgreSQL, MySQL, Oracle, Snowflake, Azure Synapse Analytics

Cloud & Data Platforms: Microsoft Azure (ADLS Gen2, Azure SQL, Synapse Analytics, Azure Logic Apps, Azure Monitor, Azure Key Vault, Azure Databricks), AWS (basic exposure)

Data Modeling & Architecture: Dimensional Modeling, Star and Snowflake Schema Design, Semantic Layer Design, Stored Procedures, Views, Triggers, Data Quality Assurance, Data Mining & Segmentation Techniques

Monitoring & Support: Data Load Monitoring (including after-hours and on-call), Data Pipeline Troubleshooting, Performance Tuning, Hybrid Cloud Environment Adaptability

Productivity & Collaboration: Microsoft Excel, PowerPoint, Outlook, Teams, Zoom, Agile Work Environments

Certification

Microsoft Certified: DP-203 Azure Data Engineer Associate, DP-900 Azure Data Fundamentals.

Oracle Certification: Oracle Cloud Infrastructure 2024 Generative AI Certified Professional.

Snowflake Certification: Hands On Essentials – Data Warehouse & Data Engineering Badges.

Databricks Certification: Academy Accreditation – Generative AI Fundamentals, Azure Databricks Platform Architect Badge, Databricks Lakehouse Fundamentals Badge.
