Li Wang

Idaho Falls, ID

Summary

M.S. in Statistics with 3+ years of experience in relational databases (SQL, ETL, data warehousing), programming (Python, R, SAS), and big data processing (Spark, Hadoop). Skilled in data analysis, data management, data visualization, reporting, and modeling.

Overview

11
years of professional experience

Work History

Data Analyst Intern

University Of Idaho
01.2023 - 08.2023
  • Cleaned and prepared data using Python, SQL, and Excel to help researchers analyze data and build models predicting weight change during the reaction of silicon carbide with oxygen at different temperatures and pressures.
  • Built data visualizations in Power BI and prepared weekly and monthly reports for researchers.

Business Data Analyst

The Administration For Industry And Commerce
08.2012 - 08.2015
  • Engineered 15+ usable features (including product category, amount, and records) from 282K sales records in the database.
  • Monitored major KPIs and created daily summary statistics reports of sales data using complex SQL queries.
  • Explored and analyzed business data with Tableau visualizations and generated reports for other departments.
  • Conducted business analyses of market trends and operational performance, providing valuable insights for strategic decision-making.

Education

Master of Science - Statistics

North Carolina State University
Raleigh, NC
05.2023

Bachelor of Arts - Chinese Language And Literature

China West Normal University
Nanchong, Sichuan, China
06.2012

Skills

  • Programming Languages: Python (NumPy, pandas, Matplotlib, scikit-learn), R (ggplot2, dplyr, tidyr, caret), SAS (PROC SQL, macros, DATA steps for data manipulation), SQL
  • Software & Developer Tools: Tableau, Power BI, GitHub, Spark, MySQL, Microsoft Word, Excel, PowerPoint

Projects

COVID-19 Data Visualization in Power BI

  • Developed complex and efficient SQL scripts to query COVID-19 data from www.ourworldindata.org.
  • Created an interactive dashboard presenting active, deceased, and recovered COVID-19 cases, along with mortality and recovery rates, by nation and by day, month, and year.

Breast Cancer Classification and Prediction in Python

  • Used the Python pandas library for data cleaning and manipulation, and reduced the number of variables using principal component analysis (PCA).
  • Fitted Logistic Regression, Decision Tree, and Random Forest classification models in Python to predict whether the cancer is benign or malignant.
  • Compared classification and prediction results across the models; the Logistic Regression model achieved the highest average accuracy (96.6%).

Interactive Dashboards for Bike Rental Business in R

  • Developed interactive dashboards in R to visualize data and present analysis results.
  • Fitted Random Forest, Linear, and Decision Tree regression models to predict the number of daily bike rentals; the Random Forest model performed best, with 92.3% accuracy.
