Research: VLMs on multimodal football dataset
Professor Chenyan Xiong
02.2025 - Current
- Built a continued-pretraining + SFT framework to adapt Qwen2.5-VL to football video–text analysis.
- Improved task performance across play recognition, directional perception, categorical data analysis, and natural-language description.
- Optimized training with DeepSpeed, FlashAttention, and LoRA (low-rank adapters) to reduce memory use and cost and to speed up iteration.
- Presented to NFL officials for dataset licensing; translated technical methods and results for non-technical stakeholders.
