SFT/Alignment - Phase 007-06-MLP8: ethicalabs/Kurtis-EON1-SFT Mix (1 epoch)

by mrs83 - opened Mar 16

Discussion

mrs83

ethicalabs.ai org Mar 16

mrs83

ethicalabs.ai org Mar 16

Correction: that's v0.7.6, not v0.7.5. v0.7.5 was still on Finetome.

mrs83 changed discussion title from SFT/Alignment - Phase 007-05-MLP8: ethicalabs/Kurtis-EON1-SFT Mix (1 epoch) to SFT/Alignment - Phase 007-06-MLP8: ethicalabs/Kurtis-EON1-SFT Mix (1 epoch) Mar 16

mrs83

ethicalabs.ai org Mar 16

Look, linear GPU memory! 🚀 I'm just running lm_eval with batch_size=16 on the new architecture.

Notice the orange line on the left. That is a perfectly flat, constant allocation of ~15GB out of 96GB, while compute (blue line) is pinned at 100%.
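For anyone who wants to check a claim like this from logged numbers rather than by eye, here is a minimal sketch. It assumes you already have a list of memory readings in GB (for example, collected by polling `torch.cuda.memory_allocated()` during the eval). The helper name `summarize_memory`, the tolerance, and the sample values are all hypothetical and are not measurements from this run.

```python
from statistics import mean


def summarize_memory(samples_gb, total_gb, tolerance_gb=0.5):
    """Return (mean usage in GB, fraction of device memory, is_flat).

    is_flat is True when the spread between the largest and smallest
    sample stays within tolerance_gb.
    """
    if not samples_gb:
        raise ValueError("need at least one sample")
    avg = mean(samples_gb)
    spread = max(samples_gb) - min(samples_gb)
    return avg, avg / total_gb, spread <= tolerance_gb


# Illustrative, made-up readings in GB; NOT data from the run above.
samples = [15.0, 15.1, 15.0, 14.9, 15.0]
avg, fraction, flat = summarize_memory(samples, total_gb=96.0)
print(f"mean={avg:.1f} GB, {fraction:.1%} of device, flat={flat}")
```

With readings around 15 GB on a 96 GB device, the fraction comes out to roughly 16% of available memory.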

By shifting to a custom 3-pass Triton kernel, the memory footprint now scales linearly, allowing us to hold massive batch sizes and long contexts entirely in GPU memory without triggering garbage collection in PyTorch's CUDA caching allocator or OOMs.
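One rough way to test a linear-scaling claim is to record peak memory at several sizes and fit a straight line. The sketch below is only that: the post does not say which variable the footprint is linear in, so batch size is assumed here purely for illustration, and the data points are synthetic placeholders. In practice the peaks could come from `torch.cuda.max_memory_allocated()` after calling `torch.cuda.reset_peak_memory_stats()` before each run. The function name `linear_fit` is hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared). An r_squared close to 1.0
    means the points lie close to a straight line.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Synthetic placeholder data (batch size, peak GB); NOT measured values.
batch_sizes = [1, 2, 4, 8, 16]
peak_gb = [2.0 + 0.5 * b for b in batch_sizes]

slope, intercept, r_squared = linear_fit(batch_sizes, peak_gb)
print(f"slope={slope:.2f} GB per sample, intercept={intercept:.2f} GB, R^2={r_squared:.4f}")
```

A quadratic memory pattern would show up as a noticeably lower R² and residuals that grow with size, so the same fit doubles as a quick regression check after kernel changes.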
