
arxiv:2406.02856

Xmodel-LM Technical Report

Published on Jun 5, 2024

Submitted by AK on Jun 6, 2024

Authors: ..., ..., Xucheng Huang, ...

Abstract

Xmodel-LM, a compact language model trained on a 2-trillion-token dataset, outperforms open-source models of similar size.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

We introduce Xmodel-LM, a compact and efficient 1.1B language model pre-trained on over 2 trillion tokens. Trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization, Xmodel-LM exhibits remarkable performance despite its smaller size. It notably surpasses existing open-source language models of similar scale. Our model checkpoints and code are publicly accessible on GitHub at https://github.com/XiaoduoAILab/XmodelLM.


