Yumeng Li

New York, USA

Summary

Joint major in Data Science and Computer Science with internship experience at Amazon, ColAI, and Pristine Pharma Tech. Developed skills in data analysis, machine learning, and software engineering. Proficient in developing predictive models, processing large datasets, and improving data visualization in healthcare and AI sectors. Passionate about solving complex problems through innovative and data-driven solutions.

Overview

1 year of professional experience

Work History

Research Assistant

ColAI
05.2023 - Current
  • Description: Collaborating on the development of a search engine
  • Contribute to regular team meetings for project updates; lead the collection of resources, including large models and vector databases; initiate development using LLMs such as Llama and Vicuna.
  • Modularize crawler code, integrate it with vector databases, and optimize embedding processes for seamless integration with Milvus and MongoDB.
  • Update system architecture and UML diagrams, expand dataset repositories, and implement new data sources; clean and restructure the codebase, enhance documentation, and create instruction manuals.
  • Skills: web development, software engineering, programming (Python), database management

Host

Volo Sport
06.2024 - 08.2024
  • Managed staff training, scheduling, and communication; oversaw field conditions to ensure play readiness; directed grassroots marketing initiatives and streamlined league operations.
  • Skills: communication, organization

Data Analyst

Amazon
01.2024 - 06.2024
  • Processed and cleaned a skin cancer dataset using Python Pandas, ensuring data accuracy and reliability.
  • Computed descriptive statistics and applied machine learning models such as Random Forest and SVM for model development and training.
  • Integrated Scikit-learn and TensorFlow to enhance model performance, and contributed to associating metadata with image data for TensorFlow and Keras models.
  • Implemented image classification tasks, refining model selection and tuning.
  • Skills: data preprocessing, data visualization, predictive modeling, machine learning, deep learning, image classification using Python and AI libraries.

Biostatistician Assistant Intern

Pristine Pharma Tech Development Co., Ltd.
06.2023 - 07.2023
  • Analyzed clinical data, using SQL to merge and filter datasets and R for hypothesis testing and regression.
  • Conducted a Phase I Clinical Study on LY01021, preparing detailed SAP reports.
  • Created custom visualizations in Tableau and developed MOCKUP reports with MATLAB and R to enhance data clarity.
  • Skills: SAP, MOCKUP, Tableau, SQL, R, clinical study design, data visualization.

Education

Bachelor of Arts - Data Science and Computer Science Joint Major

New York University
05.2026

Extracurricular Activities

Active Member of "Women in Tech"

  • Participate in coding workshops, career panels, and networking events to connect with industry professionals.

Project

Capstone Project: "Spotify Song Popularity Analysis" - Data Analyst

  • Analyzed a database of 52,000 songs to examine how features influence song popularity on Spotify; performed data cleaning, feature engineering, plotting, linear regression, and hypothesis testing. Developed regression models and applied PCA to improve prediction accuracy, concluding that multi-feature models outperform individual features in predicting popularity and classifying genre.
  • Skills: data cleaning and preprocessing, regression modeling, PCA, statistical analysis, machine learning, data visualization

Publication

  • "Exploration of Tumor Immunotherapy and Cancer Vaccine Development Process" - Team Leader and First Author; published in the proceedings of the 3rd International Conference on Biological Engineering and Medical Science (ICBioMed2023), http://dx.doi.org/10.1117/12.3012990
  • Led research to enhance CAR-T therapy by dissecting its mechanisms within tumor immunotherapy, aiming to address current limitations and propose advancements through literature review and empirical data analysis using Python and R. Developed novel methodologies and tailored treatment strategies, contributing to a published study. Managed a diverse research team, ensuring effective collaboration from project initiation to publication.

Languages

English (native); Chinese (native); Japanese (fluent)
