Dhruv Tyagi

South Windsor, CT

Summary

Data Science and Machine Learning student with internship experience at Travelers Insurance, specializing in NLP and large language models for automating unit test generation. Developed an Isolation Forest-based anomaly detection system for KPI time-series data. Proficient in Python, SQL, Databricks, PySpark, and MLflow, with expertise in building scalable pipelines and deploying ML models. Certified AWS Cloud Practitioner with a focus on predictive modeling and cloud-based analytics, dedicated to leveraging AI/ML for data-driven decision making.

Overview

1 year of professional experience
1 Certification

Work History

Engineering Development Program (EDP) Intern

Travelers Insurance
Hartford, CT
06.2025 - 08.2025
  • Spearheaded the redesign of a legacy reporting system by developing an unsupervised anomaly detection pipeline using Isolation Forest on KPI time series from insurance policy data.
  • Built an end-to-end PySpark pipeline in Databricks (Spark 3.5) integrating scikit-learn, with robust feature engineering (e.g., z-scores, percent change) and statistical rule-based comparisons.
  • Integrated MLflow for experiment tracking and model versioning; maintained full reproducibility across iterations and promoted top-performing models to the model registry.
  • Resolved compatibility issues across distributed Spark clusters and implemented data pre-processing pipelines (e.g., median imputation, column pruning) to ensure model integrity.
  • Automated anomaly filtering logic to output only high-impact records, enhancing report signal-to-noise ratio and surfacing explainable metrics (iforest_score).
  • Reduced false positive alerts by 60% and cut investigation lead time by 40%, significantly improving operational decision-making across teams.

Engineering Development Program (EDP) Intern

Travelers Insurance
Hartford, CT
06.2024 - 08.2024
  • Designed an AI-powered pipeline to generate Python unit test code from natural language user stories, reducing manual test creation time by over 60%.
  • Applied NLP techniques with spaCy for requirement parsing and condition extraction for test cases.
  • Utilized GPT-based models (Hugging Face Transformers) to create Pytest-compatible test function templates.
  • Integrated generated test code into CI/CD pipelines (Jenkins, GitHub Actions) for automated execution during builds.
  • Established feedback loop to log and review failing test cases, refining model prompts to enhance accuracy.
  • Developed reporting dashboards in AWS QuickSight and SQL to monitor coverage, pass/fail rates, and efficiency improvements.

Education

Bachelor of Science - Computer Science

University of Connecticut
Storrs
05.2026

Skills

Programming & Languages: Python, R, Java, C, SQL
Machine Learning & AI: Scikit-learn, TensorFlow, Keras, XGBoost, MLflow, Hugging Face, spaCy
Data Engineering & Tools: Pandas, NumPy, PySpark, Databricks, Delta Lake
Cloud & DevOps: AWS (Certified Cloud Practitioner), Google Cloud Platform, Git, CI/CD (Jenkins, GitHub Actions)
Databases: MongoDB, Oracle, SQL
Visualization & Reporting: Matplotlib, Seaborn, AWS QuickSight

Projects

Detecting Parkinson’s Disease with XGBoost

  • Built a classification model to detect Parkinson’s disease from medical voice recordings.
  • Applied signal preprocessing and trained with XGBoost, tuning hyperparameters to optimize performance.
  • Achieved 93% accuracy, demonstrating potential for early clinical screening applications.

Handwritten Digit Recognition (CNN on MNIST)

  • Designed and implemented a Convolutional Neural Network (CNN) in TensorFlow/Keras for digit recognition.
  • Applied data augmentation to reduce overfitting and improve generalization.
  • Achieved 89% accuracy, validating the effectiveness of deep learning for image classification.

Historical Weather Data Analysis for Air Quality Prediction

  • Developed a predictive model using weather, urban activity, and traffic datasets to forecast air quality.
  • Conducted feature engineering and used ensemble models to improve accuracy by 78% over baseline.
  • Showcased ability to integrate multiple real-world datasets for environmental health analytics.

Music Genre Classification

  • Built an ML model to classify songs into genres using Librosa for feature extraction and Scikit-learn for classification.
  • Trained on audio datasets and achieved 75% accuracy, demonstrating skills in signal processing and supervised learning.

Certifications

  • AWS Certified Cloud Practitioner
  • AWS Certified AI Practitioner
  • CS50’s Introduction to Artificial Intelligence with Python — edX
