T2I Models - a Stalin16 Collection
T2I Models updated 29 days ago
yandex/stable-diffusion-3.5-medium-alchemist Text-to-Image
• Updated May 16, 2025 • 17
• 7
Paper
• 2506.23044
• Published Jun 29, 2025 • 61
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model Paper
• 2507.01953
• Published Jul 2, 2025 • 18
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory Paper
• 2507.01945
• Published Jul 2, 2025 • 76
4KAgent: Agentic Any Image to 4K Super-Resolution Paper
• 2507.07105
• Published Jul 9, 2025 • 107
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper
• 2507.05964
• Published Jul 8, 2025 • 121
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS Paper
• 2507.07136
• Published Jul 9, 2025 • 40
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining Paper
• 2507.14119
• Published Jul 18, 2025 • 60
DesignLab: Designing Slides Through Iterative Detection and Correction Paper
• 2507.17202
• Published Jul 23, 2025 • 51
PUSA V1.0: Surpassing Wan-I2V with $500 Training Cost by Vectorized Timestep Adaptation Paper
• 2507.16116
• Published Jul 22, 2025 • 13
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts Paper
• 2507.20939
• Published Jul 28, 2025 • 57
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again Paper
• 2507.22058
• Published Jul 29, 2025 • 40
Qwen-Image Technical Report Paper
• 2508.02324
• Published Aug 4, 2025 • 274
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation Paper
• 2508.07981
• Published Aug 11, 2025 • 63
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper
• 2508.10711
• Published Aug 14, 2025 • 146
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning Paper
• 2508.20751
• Published Aug 28, 2025 • 90
Emu3.5: Native Multimodal Models are World Learners Paper
• 2510.26583
• Published Oct 30, 2025 • 114
OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation Paper
• 2510.26213
• Published Oct 30, 2025 • 10
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks Paper
• 2510.25760
• Published Oct 29, 2025 • 17
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper
• 2511.10629
• Published Nov 13, 2025 • 129
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper
• 2511.14993
• Published Nov 19, 2025 • 233
Back to Basics: Let Denoising Generative Models Denoise Paper
• 2511.13720
• Published Nov 17, 2025 • 70
Light-X: Generative 4D Video Rendering with Camera and Illumination Control Paper
• 2512.05115
• Published Dec 4, 2025 • 11
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper
• 2512.08765
• Published Dec 9, 2025 • 134
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality Paper
• 2512.07951
• Published Dec 8, 2025 • 51
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing Paper
• 2512.06065
• Published Dec 5, 2025 • 29
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper
• 2512.13687
• Published Dec 15, 2025 • 106
Few-Step Distillation for Text-to-Image Generation: A Practical Guide Paper
• 2512.13006
• Published Dec 15, 2025 • 10
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper
• 2512.08269
• Published Dec 9, 2025 • 123
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer Paper
• 2601.14250
• Published Jan 20 • 48
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss Paper
• 2602.02493
• Published Feb 2 • 46
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper
• 2602.02437
• Published Feb 2 • 80
Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation Paper
• 2602.01756
• Published Feb 2 • 23
From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models Paper
• 2602.22859
• Published Feb 26 • 151
The Trinity of Consistency as a Defining Principle for General World Models Paper
• 2602.23152
• Published Feb 26 • 201
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens Paper
• 2603.02138
• Published Mar 2 • 151
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing Paper
• 2603.00141
• Published Feb 24 • 138
Helios: Real Real-Time Long Video Generation Model Paper
• 2603.04379
• Published Mar 4 • 186
Phi-4-reasoning-vision-15B Technical Report Paper
• 2603.03975
• Published Mar 4 • 20
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper
• 2603.21986
• Published 30 days ago • 123