Paper page - BitDance: Scaling Autoregressive Generative Models with Binary Tokens
Code: https://github.com/shallowdream204/BitDance
Project Page: https://bitdance.csuhan.com/
Demo: https://huggingface.co/spaces/shallowdream204/BitDance-14B-64x
Model Weights: https://huggingface.co/collections/shallowdream204/bitdance

Papers
arxiv:2602.14041

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Published on Feb 15 · Submitted by taesiri on Feb 17 · #3 Paper of the day
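Authors: Yuang Ai, Jiaming Han, Shaobin Zhuang, Weijia Mao, Xuefeng Hu, Ziyan Yang, Zhenheng Yang, Huaibo Huang, Xiangyu Yue, Hao Chen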

Abstract

AI-generated summary: BitDance is a scalable autoregressive image generator that uses binary visual tokens and diffusion-based methods to achieve efficient high-resolution image generation with improved speed and performance.

We present BitDance, a scalable autoregressive (AR) image generator that predicts binary visual tokens instead of codebook indices. With high-entropy binary latents, BitDance lets each token represent up to 2^{256} states, yielding a compact yet highly expressive discrete representation. Sampling from such a huge token space is difficult with standard classification. To resolve this, BitDance uses a binary diffusion head: instead of predicting an index with softmax, it employs continuous-space diffusion to generate the binary tokens. Furthermore, we propose next-patch diffusion, a new decoding method that predicts multiple tokens in parallel with high accuracy, greatly speeding up inference. On ImageNet 256x256, BitDance achieves an FID of 1.24, the best among AR models. With next-patch diffusion, BitDance beats state-of-the-art parallel AR models that use 1.4B parameters, while using 5.4x fewer parameters (260M) and achieving 8.7x speedup. For text-to-image generation, BitDance trains on large-scale multimodal tokens and generates high-resolution, photorealistic images efficiently, showing strong performance and favorable scaling. When generating 1024x1024 images, BitDance achieves a speedup of over 30x compared to prior AR models. We release code and models to facilitate further research on AR foundation models. Code and models are available at: https://github.com/shallowdream204/BitDance.
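The scale argument in the abstract can be checked with a little arithmetic. The sketch below is not the authors' code. It assumes, following the 2^{256} figure, that each token carries 256 bits, and it compares the number of outputs a softmax over token indices would need with the number a per-bit head would need. The helper names (`bits_to_index`, `index_to_bits`, `binarize`) are hypothetical, and thresholding by sign is only an illustrative stand-in for turning a continuous-space sample into a binary token; the paper's actual procedure may differ.

```python
# Illustrative arithmetic only; names and the sign-threshold rule are assumptions.

BITS_PER_TOKEN = 256  # from the abstract: up to 2^256 states per token

# Number of distinct states a single binary token can take.
num_states = 2 ** BITS_PER_TOKEN

# A softmax classifier over token indices needs one logit per state,
# while a head that emits one value per bit needs only BITS_PER_TOKEN outputs.
softmax_logits = num_states
per_bit_outputs = BITS_PER_TOKEN


def bits_to_index(bits):
    """Pack a sequence of 0/1 bits (most significant first) into an integer."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


def index_to_bits(index, width):
    """Unpack an integer into a list of `width` bits (most significant first)."""
    return [(index >> (width - 1 - i)) & 1 for i in range(width)]


def binarize(values):
    """Map continuous values to bits by sign (hypothetical stand-in)."""
    return [1 if v > 0 else 0 for v in values]


# Parameter ratio quoted in the abstract: 1.4B versus 260M parameters.
param_ratio = 1.4e9 / 260e6

if __name__ == "__main__":
    print(f"states per token: about 10^{len(str(num_states)) - 1}")
    print(f"softmax logits needed: {softmax_logits}")
    print(f"per-bit outputs needed: {per_bit_outputs}")
    print(f"parameter ratio: {param_ratio:.1f}x")
    print(binarize([0.3, -1.2, 2.0]), bits_to_index([1, 0, 1]))
```

The point of the comparison is that an index-based classifier grows exponentially with the bit width, whereas a head that produces the bits directly grows linearly, which is consistent with the abstract's motivation for replacing softmax with a diffusion head.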

Community

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

arXivLens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/bitdance-scaling-autoregressive-generative-models-with-binary-tokens-8420-95439757

  • Executive Summary
  • Detailed Breakdown
  • Practical Applications


Models citing this paper 8


Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2602.14041 in a dataset README.md to link it from this page.

Spaces citing this paper 2

Collections including this paper 4