
Thomas Bui

San Diego, CA

Overview

3 years of professional experience
1 Certification

Work History

Graduate Researcher

Lab Informatics, San Diego State University
2022.08 - Current
  • Piloted a new visualization tool (XSLICE) for 4-dimensional space-time data in Python
  • Quantified strength of climate variability over time series through Singular Value Decomposition (SVD) of large datasets (~3 TB) to rank atmospheric climate phenomena
  • Reconstructed oceanic and atmospheric data (~3 TB) through Principal Component Regression
  • Led weekly meetings of a 4-person team to identify issues and keep undergraduate students' projects aligned with study goals
  • Conducted thorough literature reviews to better understand research topics and prepare for studies

Data Science and Insights Intern

St. Jude Children's Research Hospital - ALSAC
2023.06 - 2023.08
  • Slashed data collection time by over 90% by optimizing Large Language Model prompts
  • Streamlined C-suite decisions by aggregating 1,000+ Excel sheets into a single Tableau dashboard
  • Fixed ~5,000 rows of company data by comparing them against Dun & Bradstreet data in Python, SQL, and Excel
  • Presented group project to CEO about expansion of company web app for increased user engagement
  • Created data visualization graphics, translating complex data sets into comprehensive visual representations

Undergraduate Fellow

National Oceanic and Atmospheric Administration
2021.02 - 2022.05
  • Preprocessed series of data files (~3 TB) in Python using Pandas and NumPy
  • Visualized 4-dimensional space-time climate data in Python and R, including interactive 3D HTML dashboards in Plotly and GIFs built with Pillow and Matplotlib
  • Built and maintained professional relationships with students and faculty members

Education

Master of Science - Big Data Analytics

San Diego State University
San Diego, California
05.2024

Bachelor of Science - Applied Mathematics, Physics

San Diego State University
San Diego, California
05.2022

Skills

Proficient in

  • Python
  • Machine Learning
  • Tableau
  • SQL
  • R
  • GitHub
  • Statistical Analysis

Experience in

  • JIRA
  • Confluence
  • Docker
  • Microsoft Office (Word, Excel, PowerPoint, Teams, and Outlook)

Familiar with

  • Neural Networks
  • Computer Vision
  • Natural Language Processing
  • AWS

Projects

Machine Learning Engineer - Baseball Wins Prediction
  • Analyzed over 1 GB of historical baseball data containing tables with over 7 million rows
  • Developed feature engineering software for feature selection and machine learning optimization for outcome prediction in scikit-learn, using PySpark to connect a MariaDB SQL database to Python
  • Created an automated Docker pipeline for reproducibility using a Dockerfile and docker-compose


Certification

Udemy - Python for Computer Vision with OpenCV and Deep Learning


https://www.udemy.com/certificate/UC-87b8ea56-d787-48ae-8180-2ba22177f710/


Udemy - NLP Natural Language Processing with Python


https://www.udemy.com/certificate/UC-e29abbcb-0225-49c7-932e-a72c0d5ba1e6/
