Post Training
I am working on Preference Alignment, on both the algorithm side and the infrastructure side.
On the algorithm side, I improved the existing RL recipe by making the process easier to iterate on. The new recipe works across several different base models and effectively increases user time spent by xx%.
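Below is a minimal, illustrative sketch of what such an iterative preference-alignment loop can look like: generate responses with the current policy, label preference pairs offline, update the policy, and repeat. Every function body is a hypothetical stand-in, not the actual production recipe.

```python
# Minimal sketch of an iterative preference-alignment recipe.
# All function bodies are hypothetical stand-ins for illustration only.
import random

def generate_responses(policy, prompts, n_per_prompt=2):
    # Stand-in for sampling candidate responses from the current policy.
    return {p: [f"{p}::sample_{i}::v{policy['version']}" for i in range(n_per_prompt)]
            for p in prompts}

def label_preferences(responses):
    # Stand-in for the offline labeling step (reward model or human raters).
    pairs = []
    for prompt, candidates in responses.items():
        chosen, rejected = random.sample(candidates, 2)
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs

def update_policy(policy, preference_pairs):
    # Stand-in for the preference-optimization / RL update on the labeled pairs.
    return {"version": policy["version"] + 1}

def run_recipe(base_policy, prompts, num_rounds=3):
    # Each round uses the latest policy to generate fresh data, so the recipe
    # can be repeated round after round and restarted from a new base model.
    policy = base_policy
    for round_idx in range(num_rounds):
        responses = generate_responses(policy, prompts)
        pairs = label_preferences(responses)
        policy = update_policy(policy, pairs)
        print(f"round {round_idx}: trained policy v{policy['version']} on {len(pairs)} pairs")
    return policy

if __name__ == "__main__":
    run_recipe({"version": 0}, ["prompt_a", "prompt_b"])
```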
On the infra side, I mainly worked on RL-related infrastructure. Specifically, I built the first offline labeling system, improved the efficiency of the offline pipeline, implemented xx-parallelism to improve memory efficiency, and improved the transfers between the different RL components.
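Since the exact parallelism strategy is left unspecified above (xx-parallelism), the following is only a generic illustration of why sharding state across workers improves memory efficiency: each worker holds a 1/N slice, and the full tensor is materialized only when gathered.

```python
# Generic illustration of parameter sharding for memory efficiency
# (not the specific xx-parallelism scheme referenced above).
import numpy as np

def shard(param, num_workers):
    # Split a flat parameter array into roughly equal contiguous shards.
    return np.array_split(param, num_workers)

def all_gather(shards):
    # Stand-in for a collective all-gather that reassembles the full parameter.
    return np.concatenate(shards)

if __name__ == "__main__":
    full = np.arange(10_000, dtype=np.float32)   # pretend this is one large weight tensor
    shards = shard(full, num_workers=4)
    print(f"full tensor: {full.nbytes} bytes, per-worker shard: {shards[0].nbytes} bytes")
    assert np.array_equal(all_gather(shards), full)
```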
Monolith: Designed and implemented the core system of Monolith, ByteDance's next-generation recommendation system framework. Led a team of approximately 10 engineers on this project, overseeing the development of key components such as the embedding table, sharding, and worker communication. Monolith is widely used by Volcano Engine, ByteDance's public cloud, serving various external businesses in China, from mid-sized tech companies to traditional enterprises. Internally, it is the go-to solution for all cloud services that require recommendation techniques. Notably, Monolith has been open-sourced and is available at https://github.com/bytedance/monolith. Some technical highlights of Monolith:
-
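To illustrate the embedding-table sharding mentioned above, here is a toy sharded embedding lookup: sparse feature IDs are hashed to a shard, and each shard owns its own ID-to-vector table, so the full embedding table never has to fit on one machine. This is an illustration only, not Monolith's actual implementation.

```python
# Toy sharded embedding lookup for a recommendation system
# (illustrative only; not Monolith's implementation).
import numpy as np

class ShardedEmbeddingTable:
    def __init__(self, num_shards, dim):
        self.num_shards = num_shards
        self.dim = dim
        # One dict per shard; in a real system each shard lives on its own worker.
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_of(self, feature_id):
        # Route each sparse feature ID to a shard by hashing.
        return hash(feature_id) % self.num_shards

    def lookup(self, feature_id):
        # Lazily create an embedding for unseen IDs (dynamic, keyed by raw ID).
        table = self.shards[self._shard_of(feature_id)]
        if feature_id not in table:
            rng = np.random.default_rng(abs(hash(feature_id)))
            table[feature_id] = rng.normal(size=self.dim)
        return table[feature_id]

if __name__ == "__main__":
    emb = ShardedEmbeddingTable(num_shards=4, dim=8)
    vec = emb.lookup("user_12345")
    print(vec.shape, "stored on shard", emb._shard_of("user_12345"))
```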
Large Language Model Training related work.
Auto ML (Google Brain): Leveraged a genetic algorithm to intelligently select hyperparameters and conducted in-depth research on feature engineering techniques for forecasting tasks.
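For illustration, a minimal genetic-algorithm hyperparameter search looks like the following: evaluate a population of settings, keep the fittest, and produce children by crossover and mutation. The search space and fitness function here are toy placeholders, not the Google Brain setup.

```python
# Toy genetic-algorithm hyperparameter search (search space and fitness are placeholders).
import random

SPACE = {"learning_rate": (1e-4, 1e-1), "num_layers": (1, 6)}

def random_candidate():
    return {"learning_rate": random.uniform(*SPACE["learning_rate"]),
            "num_layers": random.randint(*SPACE["num_layers"])}

def fitness(cand):
    # Toy objective: pretend the best setting is lr=0.01 with 3 layers.
    return -abs(cand["learning_rate"] - 0.01) - 0.05 * abs(cand["num_layers"] - 3)

def crossover(a, b):
    # Child takes each hyperparameter from one of the two parents at random.
    return {k: random.choice([a[k], b[k]]) for k in SPACE}

def mutate(cand, rate=0.2):
    # With some probability, resample the whole candidate (simplest mutation).
    return random_candidate() if random.random() < rate else dict(cand)

def evolve(pop_size=20, generations=10, elite=5):
    pop = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:elite]
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - elite)]
        pop = parents + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    print(evolve())
```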
-
ACLed Search (Google Cloud): Spearheaded the redesign of the offline process, resulting in a 50x performance improvement: the system went from 2000 containers running for 48 hours to 400 machines completing the same work in just 4 hours.