Huggingface Projects

company

https://huggingface.co/

huggingface

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

sergiopaniego

updated a dataset about 7 hours ago

huggingface-projects/Deep-RL-Course-Certification

Viewer • Updated about 3 hours ago • 1.7k • 199 • 18

pcuenq

updated a dataset 1 day ago

huggingface-projects/drlc-leaderboard-data

Viewer • Updated 1 day ago • 49.7k • 8.17k • 2

akhaliq

submitted a paper to Daily Papers 1 day ago

Image Generators are Generalist Vision Learners

Paper • 2604.20329 • Published 3 days ago • 8

hysts

updated 11 Spaces 1 day ago

Llama 3.2 3B Instruct

😻

127

Chatbot

Llama 2 13b Chat

🦙

490

Chat with the Llama‑2 13B language model

Llama 2 7B Chat

🏆

482

Chat with Llama‑2 7B AI model

Gemma 3n E4B It

⚡

142

Chat with a helper that understands text, images, audio, and video

Gemma 3 12b It

🔥

163

Chat with a multimodal AI that understands text, images, and videos

Gemma 2 9B IT

😻

101

Chatbot

Gemma 2 2B JPN IT

😻

Chatbot

Gemma 2 2B IT

😻

Chatbot

Gemma 4 26B-A4B It

🚀

Chat with AI using text, images, or video

Gemma 4 31B It

🚀

Chat with AI using text, images, or video

Gemma 4 E4B It

🚀

Chat with AI using text, images, audio, or video

AdinaY

submitted a paper to Daily Papers 8 days ago

TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification

Paper • 2604.14531 • Published 9 days ago • 7

sergiopaniego

posted an update 9 days ago

Post

1142

Earlier this month, Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy.

And… it's already supported in TRL, built by Kashif Rasul. You can really feel the pace of development in the team 🐎

Paper by Ruixiang ZHANG, He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang at Apple 🍎

How it works: the model generates completions at a training-time temperature (T_train) with top_k/top_p truncation, then is fine-tuned on them with plain cross-entropy. No labels or verifier needed.
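
To make the two steps concrete, here is a minimal, standard-library-only sketch of the idea on a toy four-token vocabulary. It is not the TRL trainer or the paper's code: the function names, the toy logits, and the order in which top_k and top_p are applied are assumptions chosen for illustration.

```python
import math
import random


def truncated_sampling_distribution(logits, temperature, top_k=None, top_p=None):
    """Temperature-scale the logits, then apply top-k and top-p truncation.

    Returns renormalized probabilities; tokens cut by truncation get 0.0.
    (Hypothetical helper for illustration, not a TRL function.)
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept

    mass = sum(probs[i] for i in order)
    truncated = [0.0] * len(probs)
    for i in order:
        truncated[i] = probs[i] / mass
    return truncated


def sample_token(probs, rng):
    """Draw one token id, considering only tokens that survived truncation."""
    support = [i for i, p in enumerate(probs) if p > 0.0]
    weights = [probs[i] for i in support]
    return rng.choices(support, weights=weights, k=1)[0]


def cross_entropy(target_index, model_probs):
    """Plain cross-entropy of a single sampled token under the model."""
    return -math.log(model_probs[target_index])


# Toy "model": fixed next-token logits (assumed values).
toy_logits = [2.0, 1.0, 0.0, -1.0]

# Step 1: sample the model's own outputs at T_train with top_k/top_p truncation.
sampling_probs = truncated_sampling_distribution(
    toy_logits, temperature=1.5, top_k=3, top_p=0.9
)
rng = random.Random(0)
self_generated = [sample_token(sampling_probs, rng) for _ in range(8)]

# Step 2: the fine-tuning loss is plain cross-entropy on those samples,
# computed under the untruncated model at temperature 1.
model_probs = truncated_sampling_distribution(toy_logits, temperature=1.0)
loss = sum(cross_entropy(t, model_probs) for t in self_generated) / len(self_generated)
```

In a real run, the loss would be backpropagated through the language model over whole sampled completions; the sketch only shows where the sampling settings and the cross-entropy objective enter.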

You can try it right away with this ready-to-run example (Qwen3-4B on rStar-Coder):
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd.py
or benchmark a checkpoint with the eval script:
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd_eval.py

One neat insight from the paper: T_train and T_eval compose into an effective T_eff = T_train × T_eval, so a broad band of configs works well. Even very noisy samples still help.
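
One way to see why the temperatures would multiply is the idealized case below. It assumes the fine-tuned model reproduces the T_train sampling distribution exactly and ignores top_k/top_p truncation, so it is a simplified reading of the statement above rather than the paper's derivation. The logits and temperature values are made up.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


base_logits = [2.0, 0.5, -1.0, 0.0]  # assumed toy values
t_train, t_eval = 1.5, 0.6           # assumed toy values

# Idealized student: its distribution equals the base model sampled at T_train,
# so its logits are log softmax(z / T_train), up to an additive constant.
student_logits = [math.log(p) for p in softmax(base_logits, t_train)]

# Decoding the student at T_eval ...
student_at_eval = softmax(student_logits, t_eval)

# ... matches decoding the base model at T_eff = T_train * T_eval.
base_at_eff = softmax(base_logits, t_train * t_eval)

max_gap = max(abs(a - b) for a, b in zip(student_at_eval, base_at_eff))
```

Here `max_gap` is zero up to floating-point error, because dividing the student's logits by T_eval divides the original logits by T_train × T_eval and softmax ignores the leftover constant.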

Want to dig deeper?

Paper: Embarrassingly Simple Self-Distillation Improves Code Generation (2604.01193)
Trainer docs: https://huggingface.co/docs/trl/main/en/ssd_trainer

victor

posted an update 11 days ago

Post

4979

Want to share my enthusiasm for zai-org/GLM-5.1 here too 🔥

I think we have it: our open-source Claude Code = GLM-5.1 + Pi (https://pi.dev/). I built a Three.js racing game to evaluate it, and it's extremely impressive. Thoughts:

- One-shot car physics with real drift mechanics (this is hard)

- My fav part: it's awesome at self-iterating (with no vision!). It created 20+ Bun.WebView debugging tools to drive the car programmatically and read game state, and proved a winding bug with vector math without ever seeing the screen

- 531-line racing AI in a single write: 4 personalities, curvature map, racing lines, tactical drifting. Built telemetry tools to compare player vs AI speed curves and data-tuned parameters

- All assets from scratch: 3D models, procedural textures, sky shader, engine sounds, spatial AI audio!

- Can do hard math: proved road normals pointed DOWN via vector cross products, computed track curvature normalized by arc length to tune AI cornering speed

You are going to hear about this model a lot in the next months - open source let's go - and thanks z-ai🚀🚀

sergiopaniego

posted an update 15 days ago

Post

377

Great experience yesterday at PyTorch Conf Europe in Paris 🇫🇷

We (with @kashif) talked about training LLMs through interaction, using trajectories across games, browsers, or simulators.

Room was packed, a clear sign of interest in where RL post-training is heading.

Sharing the slides! 🤓
https://drive.google.com/file/d/16k7YRnf5EJEo0XjXGlRJ_hVeLoFWKyNP/view?usp=sharing

akhaliq

submitted a paper to Daily Papers 21 days ago

MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines

Paper • 2603.06679 • Published 26 days ago • 6

sergiopaniego

posted an update 22 days ago

Post

2797

Gemma 4 💎 is here and it’s strong!

to celebrate, we’re rolling out in TRL:

> support for multimodal tool responses for environments (OpenEnv)
> an example to train it in CARLA for autonomous driving with image-based tool calls

go check it out 🏎️🏎️

blog: https://huggingface.co/blog/gemma4
script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/carla_vlm_gemma.py

Team members 20