SARTHAK KAR

San Diego, CA

Summary

Dedicated professional skilled in sample collection, data recording, and instrument calibration. Resourceful and adaptable, with extensive experience designing and conducting experiments and documenting their results for studies. Comfortable handling complex issues, meeting strict deadlines, and adjusting to rapidly changing conditions.

Overview

5 years of professional experience
6 years of post-secondary education

Work History

Research Assistant

San Diego State University
11.2023 - Current
  • Established automated quality control processes for Brain Imaging Data Structure (BIDS) datasets focused on child development research, leveraging Python scripting and incorporating advanced statistical methods to ensure data integrity and reliability
  • Implemented rigorous data cleaning and feature engineering techniques to ensure dataset accuracy and reliability
  • Leveraged machine learning models for classification and anomaly detection, collaborating with domain experts to define tailored quality metrics and driving continuous improvement in data quality management to optimize organizational performance.

Data Engineer

Bank of America
07.2020 - 06.2023
  • Streamlined application migration from CDH to CDP Cloudera using Cloudera Manager, successfully migrating six applications
  • Enhanced security measures by upgrading from Sentry to Ranger, resulting in an 83% reduction in accidental database changes through Kerberos and policy configurations
  • Crafted a Python script utilizing REST API for automated CSV file generation, reducing database setup time by 60%
  • Spearheaded the implementation of an ETL data pipeline using Python and HBase, resulting in a 30% increase in the application team's productivity
  • Successfully resolved critical issue tickets related to Hive, Kerberos, and Spark, addressing 15% of all issue tickets
  • Contributed to software development efforts, decreasing new database creation time from 3 days to 1 day
  • Managed PPP loan forgiveness process within an Agile development environment
  • Supported consumer applications team in schema creation and resolving query issues in Hadoop and Spark cluster
  • Implemented CI/CD methodologies for enhanced development workflows.

ML Intern

National University of Singapore
06.2019 - 07.2019
  • Developed an image filtering and contour-drawing application in Python using scikit-image and NumPy, leveraging a dataset of 14,000 sample images
  • Trained a ResNetV2 model on the image data for enhanced application functionality
  • Collaborated on a project aimed at revolutionizing restaurant food ordering for individuals with hearing and speech impairments, resulting in a 70% reduction in order times.

Education

MSc in Big Data Analytics

San Diego State University
San Diego, US
08.2023 - 05.2025

B. Tech in Computer Science

Vellore Institute of Technology
Vellore, India
07.2016 - 05.2020

Projects

Food Vision
01/2024 - 03/2024
  • Pioneered a Computer Vision project for food category classification in images, employing TensorFlow and the EfficientNet B0 model.
  • Orchestrated data preprocessing using pandas, ensuring uniform image dimensions and scaling within the range of 0 to 1.
  • Enhanced the EfficientNet B0 model's performance by fine-tuning its last five layers, achieving a validation accuracy of 60% with a fraction of the training data. Developed multiple prototypes with various layers of fine-tuning.
  • Currently leading efforts to containerize the model using Docker for seamless deployment.

SkimLit
02/2024 - 04/2024
  • Developed SkimLit, an NLP model for abstract sentence classification, leveraging deep learning techniques and various embeddings to enhance accuracy.
  • Created a multimodal architecture capable of handling diverse data types simultaneously, resulting in improved model performance and versatility.
  • Successfully applied SkimLit to make predictions on PubMed abstracts, demonstrating its practical utility for researchers in literature review tasks.
  • Currently working on implementing the model in AWS SageMaker for scalable deployment.

Movie Prediction
01/2024 - 05/2024
  • Engineered a movie recommendation system using PyTorch and a dataset of one million movies.
  • Implemented logic to predict movie genres and utilized unexplored data for user recommendations.
  • Transformed user-movie rating data into array format, employing encoding techniques to generate lower-dimensional representations, and decoded them to reproduce the original input vectors.
  • Evaluated data reconstruction fidelity by computing the difference between replicated and original input vectors, ensuring decoding accuracy.

Heart Disease Analytics
09/2023 - 12/2023
  • Developed and implemented advanced AI models for heart disease prediction, working closely with data science and engineering teams.
  • Utilized Python programming skills to manipulate, analyze, and visualize heart disease datasets, ensuring data quality and accuracy.
  • Integrated heart disease datasets into a common database for centralized data management and analysis, leveraging its powerful data exploration tools.
  • Optimized heart disease analytics pipelines within the Snowflake data warehouse, ensuring efficient data processing and query performance for real-time insight.

Comparison of Diabetes Detection Techniques in Females
09/2019 - 12/2019
  • Evaluated diabetes detection methods in females with an 80% accuracy rate using machine learning algorithms, including decision trees, Naive Bayes, Logistic Regression, and Neural Network.
  • Adopted Logistic Regression, which achieved approximately 80% accuracy, as the preferred method.
  • Research Paper link given here.

Timeline

Research Assistant

San Diego State University
11.2023 - Current

MSc in Big Data Analytics

San Diego State University
08.2023 - 05.2025

Data Engineer

Bank of America
07.2020 - 06.2023

ML Intern

National University of Singapore
06.2019 - 07.2019

B. Tech in Computer Science

Vellore Institute of Technology
07.2016 - 05.2020