U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences Paper • 2512.02982 • Published Dec 2, 2025 • 2
EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing Paper • 2512.11715 • Published Dec 12, 2025
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World Paper • 2512.10958 • Published Dec 11, 2025
Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future Paper • 2512.16760 • Published Dec 18, 2025 • 15
Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems Paper • 2512.24385 • Published Dec 30, 2025 • 8
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Paper • 2604.18486 • Published 4 days ago • 79
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2604.12374 • Published 10 days ago • 36
Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism Paper • 2604.09544 • Published 14 days ago • 6
MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines Paper • 2603.06679 • Published 25 days ago • 6
Colon-Bench: An Agentic Workflow for Scalable Dense Lesion Annotation in Full-Procedure Colonoscopy Videos Paper • 2603.25645 • Published 28 days ago • 4
AVO: Agentic Variation Operators for Autonomous Evolutionary Search Paper • 2603.24517 • Published 29 days ago • 10
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models Paper • 2311.16933 • Published Nov 28, 2023 • 1
Automated Conversion of Music Videos into Lyric Videos Paper • 2308.14922 • Published Aug 28, 2023
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion Paper • 2502.08590 • Published Feb 12, 2025 • 42
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Paper • 2506.18882 • Published Jun 23, 2025 • 89
Mindalogue: LLM-Powered Nonlinear Interaction for Effective Learning and Task Exploration Paper • 2410.10570 • Published Oct 14, 2024
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers Paper • 2305.17455 • Published May 27, 2023