
SANTOSH KARNA

Indianapolis, IN

Summary

Senior Data Engineer with over 6 years of hands-on experience designing and deploying robust data solutions on Azure, AWS, and GCP. Adept at building scalable data pipelines and analytical platforms for real-time and batch processing. Strong foundation in Python, R, SQL, and PySpark for transforming clinical, biological, and enterprise data. Proven track record of collaborating with cross-functional teams to deliver compliant, scalable data products using modern cloud-native tools. Holds a Master’s in Data Science (NJIT) and a Bachelor’s in Computer Science. Specialized in clinical data engineering and biological datasets, with a deep understanding of data QC workflows and feature selection techniques. Proficient in Git-based version control and in CI/CD using Azure DevOps and AWS CodePipeline. Experienced in delivering data visualizations using Tableau and Power BI (with knowledge of Spotfire equivalents), and in creating machine learning and deep learning workflows using Python, Spark MLlib, and Vertex AI. Open to up to 15% domestic and international travel.

Overview

7 years of professional experience

Work History

Azure Senior Data Engineer

Marriott International
04.2023 - Current
  • Built scalable ADF pipelines and PySpark ETL workflows to handle high-volume hotel and loyalty data across Azure Data Lake and Databricks.
  • Deployed Delta Lake and Synapse solutions to support real-time analytics and historical reporting for 20+ enterprise dashboards.
  • Reduced batch processing time by 60% through optimized job parallelization.
  • Collaborated with data governance teams to implement lineage tracking across 100+ data assets using Azure Purview.
  • Implemented SCD Type 2 logic and collaborated with Power BI developers for curated reporting datasets.
  • Automated deployments using Azure DevOps; governed data assets with Purview.
  • Developed CI/CD workflows, custom Python functions via Azure Functions, and dynamic ETL augmentation.

AWS Cloud/Data Engineer

American Express
10.2021 - 04.2023
  • Developed modular ETL pipelines in AWS Glue and Redshift for 5+ business domains.
  • Enabled real-time ingestion using Kinesis and Lambda, improving fraud detection response time.
  • Reduced monthly reporting latency by 50% through Redshift optimization and cost-efficient storage layers.
  • Delivered 25+ CI/CD deployments with minimal rollback incidents.

Data Analyst

Bank of America
09.2020 - 09.2021
  • Conducted in-depth analysis on financial and risk data using SQL and Python.
  • Built and maintained 15+ interactive Power BI dashboards to support compliance and performance metrics.
  • Automated daily reports, saving over 25 hours of manual effort per week.

Data Analyst

Croissance Clinical Research
01.2018 - 08.2019
  • Processed clinical trial datasets and reconciled lab data with eCRF records to ensure data integrity.
  • Built Tableau dashboards to visualize patient enrollment, adverse events, and protocol deviations.
  • Applied machine learning to predict patient dropout with 85% accuracy, improving retention planning.

Education

Master of Science - Data Science

New Jersey Institute of Technology
Newark, NJ
05-2021

Bachelor of Technology - Computer Science

Mahatma Gandhi Institute of Technology
Hyderabad, India
05-2018

Skills

Databases - Oracle, MySQL, Hive, SQL Server, HBase, Cassandra, MongoDB

Big Data Technologies - HDFS, Hive, PySpark, MapReduce, Pig, YARN, Sqoop, Oozie, ZooKeeper, Flume

Programming Languages - Python, Java, SQL, R, PL/SQL, Scala, JSON, XML, C#

Cloud Services - Azure, Cosmos DB, Blob Storage, Kubernetes, Azure Synapse Analytics (DW), Azure Data Lake, Databricks, DWH, Data Factory

Techniques - Data Mining, Clustering, Data Visualization, Data Analytics

Methodologies - Agile/Scrum, UML, Design Patterns, Waterfall

Container Platform - Docker, Kubernetes, CI/CD, Jenkins

Tools & Utilities - JIRA, GitHub, Tableau 9.1, Power BI, Control-M, PowerShell
