Hengwei Xing

Tallahassee, Florida

Summary

Experienced with statistical programming for data analysis and model building. Comfortable working with SAS, R, and Python to handle data accurately and efficiently. Passionate about creating clear visualizations and insights that help guide better decisions.

Overview

1 year of professional experience

Work History

Teacher's Assistant

Florida State University
Tallahassee, Florida
08.2023 - Current
  • Collaborating with teachers to prepare materials and set up equipment for 3 lessons.
  • Delivering constructive feedback on assignments for 60 students.
  • Promoting active participation through open-ended questions and group discussions, enhancing student collaboration.

Education

Doctoral Degree - Biostatistics (3.85 GPA to date)

Florida State University
Tallahassee, FL
Expected 12-2027

Master of Science - Computer Science (3.8 GPA)

Florida State University
Tallahassee, FL
05-2023

Bachelor of Science - Computer Science (3.7 GPA)

Florida State University
Tallahassee, FL
05-2021

Skills

  • Programming languages: C, C++, Python, R, SAS
  • Platforms: Android mobile development, Unix
  • Languages: Mandarin (native speaker)

Research Interests

  • Deep Learning
  • Machine Learning
  • Time Series Data Analysis
  • High-dimensional Data Analysis
  • Clinical Trial Design and Survival Data Analysis

Projects/Research Experience

STA5167 Applied Linear Regression II Project: 2024 Spring

  • Developed and implemented a logistic regression model with 16 predictors to accurately predict obesity risk. Sourced relevant and reliable data from the Kaggle website.
  • Identified significant associations between obesity and the 16 predictors in a sample of 20,758 individuals.
  • Assessed multicollinearity using the Variance Inflation Factor (VIF) and identified outliers through Cook's Distance analysis, ensuring the integrity and reliability of the data.
  • Utilized Stepwise Selection Methods and the Bayesian Information Criterion (BIC) to compare and evaluate different models, achieving an accuracy of up to 97 percent.

STA5856 Time Series & Forecast Project: 2024 Spring

  • Utilized time series models to analyze four daily physico-chemical variables recorded from June 1992 to June 1993 at the Cat-Point station in the Apalachicola Bay area and to forecast their future values.
  • Applied the Box-Cox transformation to the river flow data using the logarithmic form (λ = 0) and to the rainfall data using the square root form (λ = 0.5).
  • Fitted an ARIMA model to the salinity data alone and applied both a multiple regression model and a regression-time series model to the four variables using the first 375 observations.
  • Generated 20 forecasts for salinity and compared the three sets of forecasts to identify the most effective time series model for the data.

COP5570 Parallel and Distributed Calculator Project: 2023 Spring

  • Designed a multithreaded calculator with separate backend and frontend components in JavaScript.
  • Created a user-friendly, intuitive UI in JavaScript and implemented multiple threads so the calculator can perform computations concurrently while distributing the workload among the threads.
  • Included a memory or history feature to recall previous calculations, allowing the user to review and modify previous inputs.

CIS5930 Data Science for Smart Cities Project: 2022 Fall

  • Used ConvGRU models to predict COVID-19 cases and compared them with other models, such as LSTM.
  • Processed raw data from different counties by date and encoded the data from three counties using an encoder.
  • Fed the encoded data into the ConvGRU model for robust prediction, directing the output into a decoder to effectively present the results.
  • Compared our model's results with those of LSTM, which is commonly used by researchers to predict COVID-19 cases, and found that our ConvGRU model offers significant advantages: it uses fewer training parameters, requires less memory, and trains and executes faster than LSTM.

CAP5540 Bioinformatics Sequence Analysis: 2022 Fall

  • Developed a solid understanding of computational algorithms and machine learning tools used in genomic sequencing data analysis. This includes dynamic programming, Hidden Markov Models, maximum likelihood estimation, and Bayesian inference.
  • Utilized samtools to inspect the DNA sequences in the BAM file and to organize and index the file.
  • Employed Bowtie, a read mapping tool, to align the reads to the reference and generated a SAM file to display the mutated DNA.

Core Courses

  • Time Series and Forecasting Methods
  • Concurrent, Parallel and Distributed Programming
  • Computer Security
  • Bioinformatics: Sequence Analysis
  • Complexity and Analysis of Data Structures and Algorithms
  • Compiler Construction
  • Software Reverse Engineering and Malware Analysis
  • Internet Application Programming with Java
