Dhruv Shetty

San Jose, CA

Overview

3 years of professional experience

Work History

Research Assistant

Research Foundation, San Jose State University
07.2024 - Current
  • Harvested and merged 1TB of Midlife in the United States (M.I.D.U.S) datasets using Python, ensuring integrity for+ records
  • Applied data modeling, indexing techniques, and SQL to optimize database performance, reducing query run times by 55% across large datasets
  • Automated data cleaning and transformation using Python, Pandas, and data wrangling methodologies, ensuring high data quality and reducing manual tasks by 75% for merged M.I.D.U.S datasets
  • Enhanced the workflow with Python and MySQL, using ETL methodologies and batch processing to handle 15M+ data points and increasing processing speed by 33%

Data Engineer

Quantiphi Analytics Solutions Pvt Ltd
08.2022 - 07.2022
  • Deployed 15+ scalable data pipelines on AWS using PySpark and Snowflake, processing over 20TB of structured and unstructured data for key business insights
  • Streamlined data processing by implementing ETL pipelines with Apache Airflow and automated data validations using Python, ensuring 93% data accuracy and consistency across multiple workflows
  • Improved performance of data ingestion pipelines by 50% using MapReduce for parallel processing, enabling faster data ingestion and transformation across distributed systems
  • Optimized data retrieval and reduced storage costs by 45% using Hadoop and Snowflake for data compression and partitioning
  • Created bulk-operation APIs on top of existing APIs, significantly reducing the number of client API calls

Data Engineer Intern | Tech

Mahindra
06.2021 - 08.2021
  • Processed and prepared over 100K data entries using Python and SQL, supporting predictive modeling and improving data accuracy through data cleaning and ETL processes
  • Developed and deployed data visualizations using Power BI, Docker, and Kubernetes, automating reporting pipelines and reducing manual effort by 70%
  • Employed Apache Kafka for real-time data streaming, enabling efficient message processing between microservices and improving data consistency across distributed systems
  • Projects
  • Trayambakam

Education

Master of Science - Data Analytics

San Jose State University
Dec 2025

Bachelor of Engineering - Information Technology

D.J. Sanghvi College of Engineering
May 2022

Skills

  • Technical Skills
  • Programming Languages: Python, SQL, NoSQL, R, Java, JavaScript, Scala
  • Databases & Cloud: MySQL, PostgreSQL, Amazon Web Services (AWS), Microsoft Azure, GCP, MongoDB, Hadoop
  • Tools: Tableau, Power BI, REST APIs, Git, Docker, Kubernetes, Apache Airflow, Postman, Swagger, Apache Kafka
  • Frameworks: pandas, scikit-learn, matplotlib, seaborn, TensorFlow, YOLO, Keras, OpenCV, PyTorch, PySpark
  • Certification: Google Cloud Certified Associate Cloud Engineer

Teamwork and Collaboration

Microsoft Office

Problem-Solving

Data Collection

Analytical Thinking

Data Analysis

Research and analysis

Complex Problem-Solving

Documentation skills

SPSS Proficiency

Work Preference

Work Type

Internship, Full Time, Part Time, Gig Work, Contract Work

Work Location

On-Site, Remote, Hybrid

Timeline

Research Assistant

Research Foundation, San Jose State University
07.2024 - Current

Data Engineer

Quantiphi Analytics Solutions Pvt Ltd
08.2022 - 07.2022

Data Engineer Intern | Tech

Mahindra
06.2021 - 08.2021

Master of Science - Data Analytics

San Jose State University

Bachelor of Engineering - Information Technology

D.J. Sanghvi College of Engineering