One-step Latent-free Image Generation with Pixel Mean Flows
Librarian Bot (Bot) commented:

This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* [SoFlow: Solution Flow Models for One-Step Generative Modeling](https://huggingface.co/papers/2512.15657) (2025)
* [One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation](https://huggingface.co/papers/2512.07829) (2025)
* [RecTok: Reconstruction Distillation along Rectified Flow](https://huggingface.co/papers/2512.13421) (2025)
* [Fast, faithful and photorealistic diffusion-based image super-resolution with enhanced Flow Map models](https://huggingface.co/papers/2601.16660) (2026)
* [REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion](https://huggingface.co/papers/2512.16636) (2025)
* [Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing](https://huggingface.co/papers/2512.17909) (2025)
* [Few-Step Distillation for Text-to-Image Generation: A Practical Guide](https://huggingface.co/papers/2512.13006) (2025)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out [this Space](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers).

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`
Paper: https://huggingface.co/papers/2601.22158 (published 2026-01-29, 17 upvotes)
Authors: Yiyang Lu, Susie Lu, Qiao Sun, Hanhong Zhao, Zhicheng Jiang, Xianbang Wang, Tianhong Li, Zhengyang Geng, Kaiming He
AI-generated summary

Pixel MeanFlow (pMF) introduces a one-step, latent-free image generation method that separates the network output space from the loss space, achieving strong performance on ImageNet at multiple resolutions.
Abstract

Modern diffusion/flow-based models for image generation typically exhibit two core characteristics: (i) multi-step sampling and (ii) operating in a latent space. Recent advances have made encouraging progress on each aspect individually, paving the way toward one-step diffusion/flow without latents. In this work, we take a further step toward this goal and propose "pixel MeanFlow" (pMF). Our core guideline is to formulate the network output space and the loss space separately. The network target is designed to lie on a presumed low-dimensional image manifold (i.e., x-prediction), while the loss is defined via MeanFlow in the velocity space. We introduce a simple transformation between the image manifold and the average velocity field. In experiments, pMF achieves strong results for one-step latent-free generation on ImageNet at 256×256 resolution (2.22 FID) and 512×512 resolution (2.48 FID), filling a key missing piece in this regime. We hope that our study will further advance the boundaries of diffusion/flow-based generative models.
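The abstract's key idea, predicting the clean image x while supervising in velocity space, hinges on a transformation between x-prediction and the average velocity field. Below is a minimal sketch of one such transformation, assuming the standard linear flow-matching path z_t = (1 − t)·x + t·ε and an average velocity taken over the interval [0, t]; the function names and the exact form here are illustrative assumptions for intuition, not pMF's actual formulation, which may differ in its interval endpoints and parameterization.

```python
def x_to_avg_velocity(x_pred, z_t, t):
    """Convert an x-prediction into an implied average velocity over [0, t].

    Under the linear path z_t = (1 - t) * x + t * eps, walking back along the
    average velocity gives z_0 = z_t - t * u, so if the network predicts the
    clean image x_pred, the implied average velocity is:
    """
    return (z_t - x_pred) / t


def one_step_sample(z_1, u):
    """One-step generation: integrate the average velocity from t=1 to t=0."""
    return z_1 - u


# Toy scalar check (an image would be an array of such values).
x, eps, t = 0.3, -1.2, 0.7          # clean "image", Gaussian noise, timestep
z_t = (1 - t) * x + t * eps          # noised sample on the linear path
u = x_to_avg_velocity(x, z_t, t)     # implied average velocity
x_rec = z_t - t * u                  # walking back along u recovers x (up to rounding)
```

At t = 1 the noised sample is pure noise (z_1 = ε), so a perfect x-prediction plugged through this transformation yields exactly the one-step sampler z_1 − u = x, which is the appeal of defining the network target on the image manifold while keeping the loss in velocity space.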