Article Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs • 11 days ago • 28
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models Paper • 2603.26164 • Published 24 days ago • 354
Article Welcome Gemma 4: Frontier multimodal intelligence on device • 18 days ago • 866
MolmoWeb-Data Collection This is the collection of all datasets in MolmoWebMix. • 6 items • Updated 27 days ago • 26
MolmoWeb Collection This is the collection of MolmoWeb artifacts, including model checkpoints and data. • 8 items • Updated 7 days ago • 24
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation Paper • 2103.06874 • Published Mar 11, 2021 • 3
The MultiBERTs: BERT Reproductions for Robustness Analysis Paper • 2106.16163 • Published Jun 30, 2021 • 1
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models Paper • 1908.08962 • Published Aug 23, 2019 • 1