Paper page - SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* [V-CAGE: Context-Aware Generation and Verification for Scalable Long-Horizon Embodied Tasks](https://huggingface.co/papers/2601.15164) (2026)
* [SceneSmith: Agentic Generation of Simulation-Ready Indoor Scenes](https://huggingface.co/papers/2602.09153) (2026)
* [AgenticLab: A Real-World Robot Agent Platform that Can See, Think, and Act](https://huggingface.co/papers/2602.01662) (2026)
* [Genie Sim 3.0: A High-Fidelity Comprehensive Simulation Platform for Humanoid Robot](https://huggingface.co/papers/2601.02078) (2026)
* [AnyTask: an Automated Task and Data Generation Framework for Advancing Sim-to-Real Policy Learning](https://huggingface.co/papers/2512.17853) (2025)
* [VirtualEnv: A Platform for Embodied AI Research](https://huggingface.co/papers/2601.07553) (2026)
* [SceneReVis: A Self-Reflective Vision-Grounded Framework for 3D Indoor Scene Synthesis via Multi-turn RL](https://huggingface.co/papers/2602.09432) (2026)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out [this](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers) Space.

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`
SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
Authors: Hongchi Xia, Xuan Li, Zhaoshuo Li, Qianli Ma, Jiashu Xu, Ming-Yu Liu, Yin Cui, Tsung-Yi Lin, Wei-Chiu Ma, Shenlong Wang, Shuran Song, Fangyin Wei
Organization: NVIDIA
Published: 2026-02-10
Project page: https://nvlabs.github.io/sage
AI-generated summary
SAGE is an agentic framework that automatically generates simulation-ready 3D environments for embodied AI by combining layout and object composition generators with evaluative critics for semantic plausibility and physical stability.
Real-world data collection for embodied agents remains costly and unsafe, calling for scalable, realistic, and simulator-ready 3D environments. However, existing scene-generation systems often rely on rule-based or task-specific pipelines, yielding artifacts and physically invalid scenes. We present SAGE, an agentic framework that, given a user-specified embodied task (e.g., "pick up a bowl and place it on the table"), understands the intent and automatically generates simulation-ready environments at scale. The agent couples multiple generators for layout and object composition with critics that evaluate semantic plausibility, visual realism, and physical stability. Through iterative reasoning and adaptive tool selection, it self-refines the scenes until they meet the user's intent and are physically valid. The resulting environments are realistic, diverse, and directly deployable in modern simulators for policy training. Policies trained purely on this data exhibit clear scaling trends and generalize to unseen objects and layouts, demonstrating the promise of simulation-driven scaling for embodied AI. Code, demos, and the SAGE-10k dataset can be found on the project page: https://nvlabs.github.io/sage.
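The generator-critic refinement loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustrative toy, not the actual SAGE implementation: every name here (`propose_scene`, `semantic_critic`, `stability_critic`, `refine`) is a hypothetical stand-in, and the real system uses LLM-driven generators and physics-based critics rather than string checks.

```python
# Toy sketch of a generate -> critique -> refine loop, in the spirit of the
# abstract's description. All names and checks are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class Scene:
    objects: list = field(default_factory=list)


def semantic_critic(scene: Scene) -> bool:
    # Passes once the task-relevant objects are present in the scene.
    return {"bowl", "table"} <= set(scene.objects)


def stability_critic(scene: Scene) -> bool:
    # Stand-in for a physics check (e.g., nothing left floating after settling).
    return "floating_object" not in scene.objects


def propose_scene(task: str, feedback: list) -> Scene:
    # Stand-in for the layout/object-composition generators: each round
    # incorporates whatever the critics previously flagged as missing.
    scene = Scene(objects=["table"])
    if "need bowl" in feedback:
        scene.objects.append("bowl")
    return scene


def refine(task: str, critics: list, max_rounds: int = 5) -> Scene:
    feedback: list = []
    for _ in range(max_rounds):
        scene = propose_scene(task, feedback)
        failures = [name for name, check in critics if not check(scene)]
        if not failures:
            return scene  # all critics satisfied -> simulation-ready
        if "semantic" in failures:
            feedback.append("need bowl")  # critic feedback guides the next proposal
    raise RuntimeError("could not satisfy critics within the round budget")


critics = [("semantic", semantic_critic), ("stability", stability_critic)]
scene = refine("pick up a bowl and place it on the table", critics)
print(scene.objects)  # ['table', 'bowl']
```

The key design point the paper's abstract emphasizes is the closed loop: critics do not merely filter bad scenes, their feedback is fed back into the generators so the next proposal is targeted at the specific failure, which is what lets the system converge on scenes that are both semantically on-task and physically valid.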