STA5167 Applied Linear Regression II Project: 2024 Spring
- Developed a logistic regression model with 16 predictors to predict obesity risk, using a dataset sourced from Kaggle.
- Identified statistically significant associations between obesity and the 16 predictors in a sample of 20,758 individuals.
- Assessed multicollinearity using the Variance Inflation Factor (VIF) and identified influential outliers using Cook's distance to check the reliability of the fitted model.
- Used stepwise selection and the Bayesian Information Criterion (BIC) to compare candidate models, achieving an accuracy of up to 97%.
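
The BIC comparison mentioned above can be sketched as follows. This is a minimal illustration, not the project's code: the log-likelihood values and the reduced model's parameter count are hypothetical, and the full model is assumed to have 17 parameters (16 predictors plus an intercept, one coefficient each). Only the sample size of 20,758 comes from the project description.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

n_obs = 20758  # sample size stated in the project description

# Hypothetical log-likelihoods and parameter counts, for illustration only.
full = bic(log_likelihood=-5200.0, n_params=17, n_obs=n_obs)
reduced = bic(log_likelihood=-5210.0, n_params=12, n_obs=n_obs)

# The model with the lower BIC is preferred.
preferred = "reduced" if reduced < full else "full"
print(preferred)
```

Because the penalty term grows with ln(n), a large sample such as this one tends to favor the smaller model unless the extra predictors improve the likelihood substantially.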
STA5856 Time Series & Forecast Project: 2024 Spring
- Used time series models to analyze and forecast four daily physico-chemical variables recorded from June 1992 to June 1993 at the Cat-Point station in the Apalachicola Bay area.
- Applied the Box-Cox transformation to the river flow data in its logarithmic form (λ = 0) and to the rainfall data in its square-root form (λ = 0.5).
- Fitted an ARIMA model to the salinity data alone and applied both a multiple regression model and a regression-time series model to the four variables using the first 375 observations.
- Generated 20 forecasts for salinity and compared the three sets of forecasts to identify the most effective time series model for the data.
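
The two Box-Cox forms named above can be sketched as follows. This is a minimal illustration under stated assumptions: the sample values are hypothetical, not the Cat-Point measurements, and the function assumes strictly positive inputs, so days with zero rainfall would need separate handling (for example, an offset).

```python
import math

def box_cox(x, lam):
    """Box-Cox transform: ln(x) when lam == 0, else (x**lam - 1) / lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# Hypothetical sample values, for illustration only.
river_flow = [120.0, 450.0, 980.0]
rainfall = [0.25, 1.0, 4.0]

log_flow = [box_cox(v, 0.0) for v in river_flow]   # logarithmic form
sqrt_rain = [box_cox(v, 0.5) for v in rainfall]    # square-root form
print(log_flow, sqrt_rain)
```

With λ = 0.5 the transform equals 2(√x − 1), a shifted and scaled square root, which is why it is referred to as the square-root form.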
COP5570 Parallel and Distributed Calculator Project: 2023 Spring
- Designed a multithreaded calculator with separate backend and frontend components in JavaScript.
- Built an intuitive UI in JavaScript and implemented multiple threads so the calculator can run computations concurrently, distributing the workload across threads.
- Added a history feature that recalls previous calculations so the user can review and modify earlier inputs.
CIS5930 Data Science for Smart Cities Project: 2022 Fall
- Used a ConvGRU model to predict COVID-19 cases and compared it with other models, such as LSTM.
- Processed raw county-level data by date and passed the data from three counties through an encoder.
- Fed the encoded data into the ConvGRU model and passed its output through a decoder to present the results.
- Compared our model's results with those of LSTM, which researchers commonly use to predict COVID-19 cases, and found that our ConvGRU model uses fewer trainable parameters, requires less memory, and trains and runs faster.
CAP5540 Bioinformatics Sequence Analysis: 2022 Fall
- Developed a solid understanding of the computational algorithms and machine learning tools used in genomic sequencing data analysis, including dynamic programming, Hidden Markov Models, maximum likelihood estimation, and Bayesian inference.
- Used samtools to inspect the DNA sequences in a BAM file and to organize and index the file.
- Employed Bowtie, a read-mapping tool, to align the reads to the reference and generated a SAM file showing the mutated DNA.