arxiv:2605.12500
Zhongang Cai
AI & ML interests
Multimodal, Video Reasoning, Spatial Intelligence, Virtual Humans.
Recent Activity
upvoted a paper about 8 hours ago
Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion
liked a model 7 days ago
sensenova/SenseNova-U1-8B-MoT-Infographic
upvoted a paper 20 days ago
From Pixels to Words -- Towards Native One-Vision Models at Scale