Bailey Kuehl

Seattle, WA

Summary

Applied Data Scientist familiar with gathering, cleaning, analyzing, and communicating data to technical and non-technical stakeholders. Advanced understanding of statistical modeling, machine learning, and other analytical techniques. Highly organized, motivated, and diligent, with a significant background in artificial intelligence and NLP.

Overview

4 years of professional experience

Work History

SENIOR DATA SCIENCE ANALYST

Allen Institute For AI
07.2023 - Current
  • Built and fine-tuned a BERT transformer language model on a token classification task to predict sentences in academic papers, resulting in a 12% increase in F1 compared to baseline
  • Lead 3 junior analysts, including distributing tasks, running weekly stand-ups, and providing technical mentorship
  • Proposed and implemented a framework for standardizing the evaluation of large language models, including dimensions (model capabilities, ethical / safety concerns, etc.) and relevant datasets

DATA SCIENCE ANALYST II

Allen Institute For AI
06.2021 - 07.2023
  • Built and deployed an end-to-end machine learning pipeline for a paper reference prediction model
  • Created data dashboards using Redshift SQL to track annual goals, monitor patterns, and identify potential data issues
  • Designed and conducted a hypothesis (A/B) test to evaluate the Paper QA feature on Semantic Scholar, resulting in a 10% relative increase in library saves per user
  • Implemented a linear regression model to identify factors contributing to long PDF load times, resulting in a 15% increase in user satisfaction

DATA SCIENCE ANALYST I

Allen Institute For AI
08.2020 - 06.2021
  • Generated expert training data for TL;DR summaries and presented insights and errors to the research team
  • Proposed and implemented a project intake, tracking, and distribution process for the data science team
  • Evaluated a scientific fact-checking machine learning model on COVID-19 claims

Education

MASTER OF INFORMATION AND DATA SCIENCE

UNIVERSITY OF CALIFORNIA, BERKELEY
08.2024

B.S. MATERIALS SCIENCE ENGINEERING

UNIVERSITY OF WISCONSIN-MADISON
05.2019

Skills

  • Python, SQL, R
  • PyTorch, TensorFlow, Spark
  • Machine Learning, NLP, Deep Learning
  • A/B testing, Statistical Analysis, Predictive Modeling
  • Docker, AWS, Redshift
  • Databricks (Azure)
  • HuggingFace

Data Science Publications

PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents, EMNLP 2023. Kyle Lo, Zejiang Shen, Benjamin Newman, Joseph Chee Chang, Russell Authur, Erin Bransom, Stefan Candra, Yoganand Chandrasekhar, Regan Huff, Bailey Kuehl, Amanpreet Singh, Chris Wilhelm, Angele Zamarron, Marti A. Hearst, Daniel S. Weld, Doug Downey, Luca Soldaini.


LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization, EACL 2023. Kalpesh Krishna, Erin Bransom, Bailey Kuehl, Mohit Iyyer, Pradeep Dasigi, Arman Cohan, Kyle Lo.


Generating Scientific Claims for Automated Scientific Fact Checking, ACL 2022. Dustin Wright, David Wadden, Kyle Lo, Bailey Kuehl, Isabelle Augenstein, Lucy Lu Wang.


SciA11y: Converting scientific papers to accessible HTML, ASSETS 2021. Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Evie Cheng, Chelsea Haupt, Matt Latzke, Bailey Kuehl, Madeleine van Zuylen, Linda Wagner, Dan S. Weld.

Honors And Awards

  • EACL 2023 Outstanding Paper Award
  • EMNLP 2023 Best Demo Award
  • ASSETS 2021 Best Artifact Award
