Shumin (Krystal) Liu

Rockville Centre, NY

Summary

Data engineer with 3+ years of experience building scalable pipelines and analytics platforms. Passionate about solving business problems through data and automation. Seeking opportunities to lead technical strategy in data-driven teams.

Overview

5 years of professional experience

Work History

Data Engineer

McKinsey & Company
02.2025 - Current
  • Extracted and processed data from diverse systems into scalable Big Data platforms tailored to client needs.
  • Collaborated with data scientists to align data fields with hypotheses and developed modular data pipelines.
  • Designed data models and implemented automated validations for rigorous data cleaning and transformation.
  • Managed cloud-based infrastructure using AWS and Databricks to support comprehensive data workflows.

Data Science Intern

Biz2Credit
01.2021 - 08.2021
  • Built NLP models in Python to automatically classify applicants’ cash flows into categories from bank statement transaction descriptions, achieving 75% dollar-amount accuracy
  • Implemented regression models on bank statement and tax statement data to adjust annual revenue estimates for small businesses in support of underwriting
  • Identified 3 independent predictors of reduced credit risk tied to the revenues of small Healthcare businesses using AI-based methodologies; enabled automated identification and visualization of these predictive factors
  • Developed a multi-page web app with Python and Streamlit to demonstrate the company’s bank statement analyzer product

ML Researcher: Social Unrest Anticipation

University of Nebraska-Lincoln
01.2020 - 05.2020
  • Improved an agent-based simulation framework with other researchers to anticipate unrest events, using diverse datasets to generate an unrest susceptibility heat map; presented weekly to 10+ faculty members
  • Computed the localized spatial distribution of unrest events using spatial statistics tools in R; reduced over-counting of unrest events at a given location caused by geocoding issues by 70% on a test dataset
  • Performed density-based and distance-based analysis of point patterns generated by 3 different methods for distributing events within polygons

Education

Master of Science - Business Analytics

Columbia University
New York, NY
12-2021

Bachelor of Science - Actuarial Science and Statistics

University of Illinois, Urbana-Champaign
Champaign, IL
12-2019

Skills

  • Technical skills: SQL, Python, Java, Tableau, Snowflake, Spark, Dagster, Airflow, AWS, Databricks
  • Data pipeline design, ETL development, API development
