
Leqi Zou

Mountain View, CA

Overview

8 years of professional experience

Work History

Research Engineer

Character AI
Menlo Park, CA
11.2023 - Current

Post Training

I work on preference alignment on both the algorithm side and the infrastructure side.

On the algorithm side, I improved the existing RL recipe by making the process more iterative. The new recipe works across several different base models and increases user time spent by xx%.

On the infra side, I mainly worked on RL-related infrastructure. Specifically, I built the first offline labeling system, improved the efficiency of the offline pipeline, implemented xx-parallelism to improve memory efficiency, and improved the transfers between different RL components.

Machine Learning Engineer

ByteDance
San Jose, CA
05.2020 - 11.2023

Monolith: Designed and implemented the core system of Monolith, ByteDance's next-generation recommendation system framework. Led a team of approximately 10 people on this project, overseeing the development of key components such as the embedding table, sharding, and work communications. Monolith is widely used by Volcano Engine in ByteDance's public cloud, serving various external businesses in China, including medium-sized tech companies and traditional enterprises. Internally, it is the go-to solution for all cloud services that require recommendation techniques. Monolith has been open-sourced and is available at https://github.com/bytedance/monolith. Some technical highlights of Monolith:

  • Designed and implemented CPU/GPU hash table solutions.
  • Designed and implemented a multi-host serving framework supporting CPU, GPU, and mixed deployments.

-

Large Language Model Training:

  • Designed a framework focused on reducing loading/saving time and on automatic failure recovery across a few thousand GPUs.
  • Optimized existing open-source pre-training frameworks (mainly PyTorch FSDP) and serving frameworks (vLLM).

Software Engineer

Google
Mountain View, CA
08.2016 - 05.2020

AutoML (Google Brain): Used a genetic algorithm to select hyperparameters and conducted in-depth research on feature engineering techniques for forecasting tasks.

-

ACLed Search (Google Cloud): Spearheaded the redesign of the offline process, resulting in a 50x performance improvement. Reduced the job from 2000 containers running for 48 hours to 400 machines completing the task in 4 hours.

Education

Master of Science - Computer Science

University of California - Los Angeles
Los Angeles, CA
06.2016

Bachelor of Science - Computer Science

Peking University
Beijing, China
06.2015

Affiliations

  • ACM/ICPC Hangzhou Region, gold award
  • National Olympiad in Informatics of China, 2010, Silver Medal (Rank #43)

Timeline

Research Engineer

Character AI
11.2023 - Current

Machine Learning Engineer

ByteDance
05.2020 - 11.2023

Software Engineer

Google
08.2016 - 05.2020

Master of Science - Computer Science

University of California - Los Angeles

Bachelor of Science - Computer Science

Peking University