
MegaMath-Llama-3.2-3B

arXiv | Datasets

A proof-of-concept model trained on the MegaMath dataset, capable of both Chain-of-Thought (CoT) and Program-Aided Language (PAL) problem solving.
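As a minimal usage sketch, the model can be loaded through the standard Hugging Face `transformers` API; the model id comes from this card, while the prompt template and generation settings below are illustrative assumptions, not part of the release.

```python
MODEL_ID = "LLM360/MegaMath-Llama-3.2-3B"

def build_cot_prompt(question: str) -> str:
    """Wrap a math question in a simple Chain-of-Thought style prompt.
    (The exact prompt format is a hypothetical example, not from the card.)"""
    return f"Question: {question}\nLet's think step by step.\n"

if __name__ == "__main__":
    # Model loading and generation via the standard transformers API.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")

    inputs = tokenizer(build_cot_prompt("What is 12 * 7?"), return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

For PAL-style use, the same loading code applies; only the prompt would instead ask the model to emit an executable program whose output is the answer.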


Performance

(Performance comparison figure from the model card.)

Citation

If you find our work useful, please cite:

@article{zhou2025megamath,
  title     = {MegaMath: Pushing the Limits of Open Math Corpora},
  author    = {Zhou, Fan and Wang, Zengzhi and Ranjan, Nikhil and Cheng, Zhoujun and Tang, Liping and He, Guowei and Liu, Zhengzhong and Xing, Eric P.},
  journal   = {arXiv preprint arXiv:2504.02807},
  year      = {2025},
  note      = {Preprint}
}
Model size: 3B parameters
Tensor type: BF16
