RAVEN: real-time autoregressive video extrapolation
RAVEN is a training-time test framework that repacks self rollouts into interleaved clean …
MVP Engine is a lightweight multimodal training framework that keeps orchestration small, moves …
LLaVA-OneVision-2 extends fully open multimodal training toward long-video understanding, using …
Work on MLLM training, model architecture, and open-source training framework optimization for frontier multimodal systems.
Build the systems and frameworks that make MLLM training, serving, and model engineering easy to use and efficient at scale.