Bioinformatics graduate student with 6 years of experience across biomedical fields, from cancer-related multi-omics data to human genome research. Gained medical and data analysis experience during undergraduate study, work experience through internships, and exposure to the latest genome research during graduate study.
Performed pairwise alignment, SNP analysis, and structural variation analysis between HG38 and CHM13, given the large differences between the two references in the 1q21.1 region.
Developed a pipeline for highly contiguous de novo dual assembly of the repetitive 1q21.1 region using ultra-long Nanopore sequencing data; benchmarked against HPRC (Human Pangenome Reference Consortium) data.
Will assess potential focal changes affecting NOTCH2NL copy number in large collections of autism families by re-analyzing short-read whole-genome sequencing (WGS) data from the Simons Simplex Collection and the Autism Genetic Resource Exchange (AGRE)
Evaluated 10 popular algorithms for fusion detection by their sensitivity, false discovery rate, computing time, and memory usage
Batch-processed patient RNA-seq data; combined the top-performing methods (SOAPfuse, FusionCatcher, and JAFFA) to identify high-confidence candidate fusion transcripts
Applied several machine learning models to predict treatment efficacy from patient features such as symptoms, age, and gender
Used the Classifier Chains algorithm to predict optimal treatment methods for GAD7 patients, reaching 81% accuracy on the test set
GAPLINC was identified as a negative regulator of inflammation in mice [1]; investigated whether it plays a similar role in other species (RNA-seq data for various species provided by [2])
Used a Nextflow pipeline to preprocess the data (QC and trimming)
Located GAPLINC using its syntenic relationship with Tgif1 and Dlgap1
Intro Comp Genomics, Adv Comp Genomics, Evol Genomics
Comp Sys and C Prog, Applied Bayesian Machine Learning