Zheyi Xue

New York City

Summary

Experienced in designing, reproducing, and evaluating graph-based retrieval and reasoning pipelines across multiple QA and enterprise-style knowledge datasets. Familiar with knowledge graph construction, multi-hop reasoning, and evaluation-driven model iteration, with hands-on experience benchmarking multiple Graph RAG methods under real-world constraints.

Overview

1 year of professional experience

Work History

Agentic Graph-based RAG Benchmark

Undergraduate Researcher, DIR-Lab
New York City
09.2025 - Current
  • Designed and implemented a systematic benchmark comparing pure graph retrieval, agentic dense retrieval, and agentic graph retrieval paradigms for GraphRAG systems
  • Reproduced 5 GraphRAG methods end-to-end (RAPTOR, GraphRAG, HypergraphRAG, LinearRAG and HippoRAG2) covering: 1) Corpus selection and preprocessing; 2) Graph schema design and construction; 3) Retrieval strategies and LLM inference pipelines
  • Evaluated methods on both single-hop and multi-hop QA benchmarks (Natural Questions and HotpotQA) across multiple LLM backbones (Llama, Qwen, Mistral, Gemma), enabling controlled cross-model analysis
  • Extended experiments to additional datasets (PopQA, TriviaQA and MuSiQue) to study generalization and retrieval robustness
  • Conducted fine-grained error analysis to understand performance differences attributable to graph structure and retrieval mechanisms

Keywords: Graph RAG, Knowledge Graph, RAG, Multi-hop Reasoning, Benchmark, Graph Construction

Graph Structure Analysis in GraphRAG

Undergraduate Researcher, DIR-Lab
Shanghai
06.2025 - 09.2025

Conducted an in-depth study of how graph structure contributes to GraphRAG systems, inspired by GraphICL (GraphICL: Unlocking Graph Learning Potential in LLMs through Structured Prompt Design).

  • Reproduced LLaGA (LLaGA: Large Language and Graph Assistant) by training from scratch on 4 datasets: Arxiv, Amazon-Products, PubMed and Cora
  • Designed controlled ablation experiments by replacing k-hop neighbor subgraphs and Laplacian-based structural signals with semantic similarity-based node selection
  • Retrained models under identical settings to isolate the impact of explicit graph structure vs semantic-only context
  • Demonstrated that graph structural information remains critical for LLaGA performance across datasets
  • Authored internal technical reports analyzing workflows, assumptions, and empirical results of 10+ GraphRAG-related papers

Keywords: Knowledge Graph, Graph Language Models, Ablation Study, Graph RAG

Graph-based RAG for Long-Context Videos

Undergraduate Researcher, DIR-Lab
Shanghai
01.2025 - 05.2025
  • Reproduced VideoRAG (VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos) for extremely long video (1 h+) comprehension, focusing on graph-enhanced retrieval over multimodal content
  • Conducted quantitative comparisons against baselines including NotebookLM and VideoAgent under offline HPC constraints
  • Adapted the VideoRAG pipeline to use Gemma-2-9B instead of GPT-4o-mini due to restricted online API access, enabling fully local inference
  • Implemented lightweight architectural modifications to improve retrieval and reasoning performance on selected evaluation subsets
  • Analyzed bias and variance introduced by limited data regimes, documenting limitations and trade-offs in experimental design

Keywords: Multimodal RAG, Video Understanding, Graph RAG, Long-Context LLMs

Education

Bachelor of Science - Data Science

New York University
New York, NY
05.2027

Skills

  • Programming: Python (familiar); C (basic)
  • Graph RAGs: RAPTOR; GraphRAG; HippoRAG2; HypergraphRAG; LinearRAG; LightRAG; LLaGA
  • Research: paper reproduction; benchmarking; ablation studies; error analysis
  • Frameworks & Relevant Libraries: pytorch; numpy; pandas; torch_geometric; networkx; igraph

Languages

English
Professional
