Paper page - Matrix-3D: Omnidirectional Explorable 3D World Generation

Grant Singleton (grantsing)

arXiv explained breakdown of this paper 👉 https://arxivexplained.com/papers/matrix-3d-omnidirectional-explorable-3d-world-generation
Papers
arxiv:2508.08086

Matrix-3D: Omnidirectional Explorable 3D World Generation

Published on Aug 11, 2025
· Submitted by
wenhang ge
on Aug 13, 2025
Authors:
Zhongqi Yang, Wenhang Ge, Yuqi Li, Jiaqi Chen, Haoyuan Li, Mengyin An, Fei Kang, Hua Xue, Baixin Xu, Yuyang Yin, Eric Li, Yang Liu, Yikai Wang, Hao-Xiang Guo, Yahui Zhou
Abstract

Matrix-3D generates wide-coverage 3D worlds from single images or text using panoramic video diffusion and reconstruction models.

AI-generated summary

Explorable 3D world generation from a single image or text prompt forms a cornerstone of spatial intelligence. Recent works utilize video models to achieve wide-scope and generalizable 3D world generation; however, existing approaches often suffer from a limited scope in the generated scenes. In this work, we propose Matrix-3D, a framework that utilizes a panoramic representation for wide-coverage, omnidirectional, explorable 3D world generation, combining conditional video generation and panoramic 3D reconstruction. We first train a trajectory-guided panoramic video diffusion model that employs scene mesh renders as conditions to enable high-quality and geometrically consistent scene video generation. To lift the panoramic scene video to a 3D world, we propose two separate methods: (1) a feed-forward large panorama reconstruction model for rapid 3D scene reconstruction, and (2) an optimization-based pipeline for accurate and detailed 3D scene reconstruction. To facilitate effective training, we also introduce the Matrix-Pano dataset, the first large-scale synthetic collection comprising 116K high-quality static panoramic video sequences with depth and trajectory annotations. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art performance in both panoramic video generation and 3D world generation. See more at https://matrix-3d.github.io.
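The key property of the panoramic representation the abstract relies on is that a single equirectangular frame covers every viewing direction, so a panoramic depth map can be lifted directly to a 3D point cloud. The sketch below illustrates that standard geometry only; it is not the paper's reconstruction code, and the function names are illustrative:

```python
import numpy as np

def equirect_rays(height, width):
    """Per-pixel unit ray directions for an equirectangular panorama.

    Longitude (yaw) spans the full width and latitude (pitch) the full
    height, so the rays cover the whole sphere -- the "omnidirectional"
    coverage a panoramic representation provides.
    """
    v, u = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    lon = (u + 0.5) / width * 2.0 * np.pi - np.pi    # yaw in [-pi, pi)
    lat = np.pi / 2.0 - (v + 0.5) / height * np.pi   # pitch in (pi/2, -pi/2)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)              # shape (H, W, 3)

def panorama_to_points(depth):
    """Lift a panoramic depth map (H, W) to a 3D point cloud (H*W, 3)."""
    rays = equirect_rays(*depth.shape)
    return (rays * depth[..., None]).reshape(-1, 3)
```

With per-frame depth like Matrix-Pano provides, each panoramic frame unprojects to points this way; the paper's feed-forward and optimization-based pipelines then go further and produce a consistent scene rather than a raw per-frame point cloud.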

Community


This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Sign up or log in to comment

Models citing this paper 1

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2508.08086 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2508.08086 in a Space README.md to link it from this page.

Collections including this paper 7