Jung-A Kim

Seattle, WA

Summary

Results-driven problem solver with a keen eye for identifying and addressing challenges, always seeking opportunities for improvement. Engages in direct, collaborative conversations with team members and approaches proof-of-concept (POC) tasks methodically to help data science solutions evolve.

Holds a solid foundation in Statistics from San Jose State University and tackles data with a rigorous mindset.

Demonstrates a proven ability to quickly grasp new programming concepts, evidenced by mastering APIs at a previous company, leading to a well-deserved promotion.

Emphasizes clear communication with business partners to ensure alignment and minimize wasted time and effort. Proactively seeks clarification and assistance to expedite task completion.

Overview

4 years of professional experience

Work History

Senior Data Scientist

Thermo Fisher Scientific
10.2024 - Current
  • Clinical Research Group (CRG): part of a global biotechnology company providing life science services and clinical research solutions
  • Predictive modeling for the clinical trial process using Databricks
  • Enhancing solutions with LLM prompts and helpful explanations to improve the client experience
  • AutoML for prototyping predictive analytics models, with fine-tuning for explanation
  • Tools: Databricks, Snowflake, Azure DevOps, AWS S3

Senior Data Scientist

State Farm
10.2023 - 05.2024
  • Company Overview: Largest US auto insurer in the property and casualty sector
  • Generative AI prompt engineering for summarization and extraction
  • Used Mistral AI 7B Instruct 2.0 to extract and parse OCR text, populating fields with important deadlines and figures from attorney letters in higher-tier cases.
  • Refined prompts through active communication with claim handlers.
  • Optimized memory use with distributed data parallel code.

Data Scientist

State Farm
02.2021 - 05.2024
  • Worked on AI solutions in auto injury, computer vision, and NLP to enhance prioritization in the claim handling process.
  • Deployed a computer vision model with an average F1 above 92% across 11 categories of fire claims, the highest performance of any vision model developed so far given the noise in image data from customer/agent uploads. Significantly shortened labeling time and cost for the millions of records in auto claims.
  • Document classifier for multiclass classification, built on Google's ELECTRA, a transformer-based discriminator model.
  • Data Validation: Used the Voxel51 FiftyOne app to validate labeled data with business partners.
  • AWS Training Job: Explored and leveraged AWS Training Jobs to optimize instance usage time when training multiple models in parallel. In coordination with MLEs, ran transformer models in script-mode training jobs efficiently in terms of both time and cost.
  • AutoGluon: Baseline modeling using AutoMM from AutoGluon.
  • PyTorch/PyTorch Lightning/Hugging Face: For training large models, explored and developed distributed data processing in all three APIs. TensorFlow loggers were used to monitor the distributed training in all cases. ResNet50, DenseNet161, and Swin Transformers were the main models developed and compared.
  • XGBoost for predicting time lags between claim processes to expedite claim settlements, reducing the cost of delays. Bayesian optimization for faster hyperparameter tuning with improved performance. Object-oriented programming in Python.
  • Injury severity scale prediction from injury text using BertModel.
  • Proactive communication with business partners during error analysis to identify labeling inconsistencies. Used matplotlib visual aids in exploratory data analysis to demonstrate model performance convincingly.

Education

Master's degree - Statistics

San José State University
01.2020

Computer Science (Coursework only)

De Anza College
01.2018

Bootcamp - Programming in Java

Timo Academy
01.2014

Bachelor of Arts - BA - English Interpretation and Translation (English for International Conferences and Communication)

Hankuk University of Foreign Studies
01.2014
