Over 4 years of hands-on experience in data engineering, focusing on designing and optimizing complex data pipelines.
Proficient in ETL processes, data integration, and real-time data processing using tools like Apache Spark, Kafka, and AWS Glue.
Expertise in cloud platforms including AWS and Azure, with a strong background in deploying scalable data solutions.
Skilled in working with Big Data technologies such as Hadoop, Hive, and MapReduce to manage and process large datasets.
Extensive experience in SQL and Python for data manipulation, analysis, and building data-driven solutions.
Strong knowledge of data warehousing concepts, with hands-on experience in Redshift, Snowflake, and similar technologies.
Adept at building CI/CD pipelines using Jenkins, Docker, and Kubernetes, ensuring efficient and reliable deployment processes.
Proficient in using Control-M for batch processing and job scheduling, optimizing workflow automation.
Experienced in Unix/Linux environments, leveraging shell scripting for automation and system management.
Familiar with machine learning frameworks like Keras and TensorFlow, integrating ML models into data pipelines.
Strong problem-solving skills with a focus on improving data processing efficiency and reducing operational costs.
Excellent collaboration and communication skills, working effectively with cross-functional teams to deliver data-driven insights.
Overview
4 years of professional experience
Work History
Data Engineer
Meditab Software Inc.
03.2023 - 07.2024
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability
Collaborated with cross-functional teams to implement CI/CD pipelines using Jenkins and Terraform, streamlining deployment process by 30%
Built and optimized Snowflake data models, enhancing data warehousing efficiency and improving query performance by 15%
Developed Spark applications using PySpark and optimized processing efficiency, resulting in a 20% reduction in processing time for large datasets
Orchestrated end-to-end AWS-based data pipelines, integrating data from various sources into Amazon S3, and performed data transformations using AWS Glue
Research Assistant
University of Pittsburgh, Joseph M. Katz Graduate School of Business
09.2022 - 12.2023
Organized research materials, maintaining a well-ordered workspace conducive to productivity
Designed ML algorithms (TensorFlow, PyTorch, Bayesian Networks) for breast cancer severity assessment, achieving 95% accuracy and aiding early diagnosis
Streamlined ETL processes with Hive and designed scalable data pipelines using Apache Spark, boosting data processing efficiency by 30% and real-time analytics capabilities by 25%, enabling more informed business decisions
Integrated Linear Regression for sales forecasting and K-Means clustering for customer segmentation into marketing, sales, and customer service workflows, boosting departmental efficiency and overall operational effectiveness by 15%
Computer Vision Engineer
Human Engineering Research Laboratories
05.2023 - 09.2023
Enhanced YOLOv3 for real-time object detection to 99.93% accuracy, reducing navigation errors by 50%; integrated ORB-SLAM for mapping with A* and RRT path planning algorithms, reducing task completion time by 20%
Refined Random Forest models to boost precision in predicting user behavior, achieving a 30% accuracy gain and 25% efficiency improvement over prior algorithms and enhancing insights into user preferences and engagement patterns
Developed and integrated collaborative filtering and matrix factorization algorithms, increasing customer engagement by 18% and boosting sales by 15% through tailored recommendations
AI Developer
Dosepacker
04.2020 - 07.2022
Configured and worked with big data technologies (Hadoop, Sqoop, Spark, Hive, and HBase) on AWS EMR clusters, achieving a 20% improvement in performance
Assisted in the migration of legacy data systems to AWS Redshift, reducing infrastructure costs by 15%
Analyzed large datasets to identify trends and patterns in customer behavior
Implemented data pipelines with Apache Airflow for scheduling and monitoring ETL workflows, ensuring data accuracy and timeliness
Applied Snowflake features such as Snowpipe and stages to process 100 GB of data daily in real time and ingest it into Snowflake tables
Employed Apache Spark with Random Forest and Gradient Boosting to improve model accuracy, and optimized data processing with AWS Glue; improved sales forecasting, inventory management, and customer behavior analysis, boosting supply chain productivity
Developed targeted marketing campaigns using K-means clustering for advanced customer segmentation, optimizing campaign effectiveness and personalizing customer interactions to achieve a 20% boost in ROI