- TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar
  Paper • 2510.14972 • Published • 35
- LightMem: Lightweight and Efficient Memory-Augmented Generation
  Paper • 2510.18866 • Published • 114
- Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
  Paper • 2510.19338 • Published • 115
- The Smol Training Playbook
  📚 3k • The secrets to building world-class LLMs
Jonatan Borkowski PRO (j14i)
AI & ML interests: None yet

Recent Activity
Reacted to qgallouedec's post with 🔥 about 15 hours ago:
@CohereLabs just released 🌿 Tiny Aya: a fully open-source 3B parameter model that speaks 70+ languages 🌍! But there’s a catch:
Tiny Aya is just a language model. It doesn’t support tool calling, the key capability that turns frontier models into powerful *agents*.
So the real question is:
How hard is it to turn Tiny Aya into an agent?
Turns out… it’s simple, thanks to Hugging Face TRL.
We’re sharing a hands-on example showing how to fine-tune Tiny Aya into a tool-calling agent using TRL, unlocking what could become the first *massively multilingual open agent*.
Small model. Global reach. Agent capabilities.
👉 https://github.com/huggingface/trl/blob/main/examples/notebooks/sft_tool_calling.ipynb
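The linked notebook covers supervised fine-tuning for tool calling with TRL. As a rough sketch of what one training example for that setup can look like (the helper function, tool name, and field values below are illustrative, not from the notebook), conversational data in the OpenAI-style messages-plus-tools format that transformers chat templates and TRL's `SFTTrainer` understand might be assembled like this:

```python
# Hedged sketch: build one tool-calling SFT example in the OpenAI-style
# messages format. All names and values here are hypothetical placeholders.

def make_tool_calling_example(user_query, tool_name, tool_args,
                              tool_result, final_answer):
    """Return one conversational example with an assistant tool call."""
    return {
        "messages": [
            {"role": "user", "content": user_query},
            # Assistant turn that emits a tool call instead of text.
            {"role": "assistant",
             "tool_calls": [
                 {"type": "function",
                  "function": {"name": tool_name, "arguments": tool_args}}]},
            # Tool response fed back into the conversation.
            {"role": "tool", "name": tool_name, "content": tool_result},
            # Final assistant answer grounded in the tool result.
            {"role": "assistant", "content": final_answer},
        ],
        # Tool schema the model is allowed to call (JSON-Schema parameters).
        "tools": [
            {"type": "function",
             "function": {"name": tool_name,
                          "description": "hypothetical example tool",
                          "parameters": {"type": "object", "properties": {}}}}],
    }

example = make_tool_calling_example(
    "What's the weather in Paris?",
    "get_weather",
    {"city": "Paris"},
    "18°C, partly cloudy",
    "It's 18°C and partly cloudy in Paris.",
)
```

A dataset of such dicts would then be passed to `SFTTrainer` as the training split; the trainer's chat template renders the tool definitions and tool-call turns into the model's expected token format.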
Liked a model 10 days ago: unsloth/GLM-5