Web Scraping: Scraped English Premier League (EPL) data from industry websites such as Sofascore with BeautifulSoup, selecting elements by CSS class, and stored it in CSV format.
Feature Engineering: Processed and engineered predictive features such as form, xG, home/away performance, and goal distributions to enhance model accuracy.
Machine Learning Approach: Applied tree-based methods (Random Forest) to classify match results and estimate outcome probabilities from the engineered features.
Entropy & Information Gain: Tuned decision trees with entropy-based splitting criteria (information gain) to improve predictive power and limit overfitting.
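The entropy-based splitting criterion mentioned above can be illustrated with a minimal sketch. This is not the project's actual code: the function names, the outcome labels (H, D, A) and the example split are hypothetical, chosen only to show how information gain scores a candidate split.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, left, right):
    """Entropy reduction from splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


# Hypothetical match outcomes: H = home win, D = draw, A = away win.
parent = ["H", "H", "H", "A", "A", "D"]
# Hypothetical split on some feature threshold (for example, recent form).
left = ["H", "H", "H"]
right = ["A", "A", "D"]

gain = information_gain(parent, left, right)
print(f"Information gain of the split: {gain:.3f} bits")
```

A decision tree using this criterion would evaluate every candidate split this way and keep the one with the highest gain.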
Overview
4 years of professional experience
Work History
REU Research Participant
Georgia State University
05.2024 - 07.2024
Collaborated with a professor to solve an expert-finding problem on online platforms
Built a foundational set of user-profile objects from unorganized XML data and implemented a user scoring system to construct a sheaf graph structure
Used NetworkX to compute connected components and compared the performance of several algorithms (PageRank, SIR)
Showed that our graph-based approach produced tag-specific expert results, a novel contribution to the industry, and reduced data requirements owing to the generalizable nature of our mathematical model
Co-authored a paper on the research and submitted it to the SIGMOD Conference
Data Journalist
Junior Economics Club
10.2020 - 05.2021
Published several articles on Medium for the NYC branch of the Junior Economics Club
Wrote articles on the intersection of big data and various industries
Analyzed the emerging trend of using data science in sports
Education
Bachelor of Science (B.S.) - Applied Mathematics, Computer Science (Double Major)