Science Team releasing large-scale pre-training datasets to accelerate open LLM development.
- FineWeb: a 15T-token English dataset for LLM pre-training. See the blog post and paper.
- FineWeb-Edu: a filtered subset of the most educational content from FineWeb.
- FineWeb2: an extension of FineWeb to over 1,000 languages. See the paper.
- FinePDFs: 3T tokens of text extracted from PDFs sourced from the web. See the blog post.
- FineWiki: an updated, better-extracted version of Wikipedia in 300+ languages.
- FinePDFs-Edu: 350B+ highly educational tokens filtered from FinePDFs.
- FineTranslations: 1T+ tokens of parallel text translated from 500+ FineWeb2 languages.
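All of these datasets are hosted on the Hugging Face Hub, so they can be streamed with the `datasets` library without downloading terabytes up front. Below is a minimal sketch that streams a few documents from FineWeb; the `sample-10BT` config name is taken from the dataset card and is an assumption here, as sample configs may change.

```python
# Minimal sketch: stream a FineWeb sample with the `datasets` library.
# The "sample-10BT" config name is an assumption based on the dataset card.
from datasets import load_dataset

fw = load_dataset(
    "HuggingFaceFW/fineweb",  # full dataset is ~15T tokens; use a sample config
    name="sample-10BT",       # assumed config for the 10B-token sample
    split="train",
    streaming=True,           # iterate lazily instead of downloading everything
)

# Peek at the first few documents; each record carries the extracted web text.
for doc in fw.take(3):
    print(doc["text"][:200])
```

The same pattern applies to the other datasets in the list by swapping in their repo IDs (e.g. `HuggingFaceFW/fineweb-edu` or `HuggingFaceFW/fineweb-2`) and the config names listed on their respective dataset cards.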
AI & ML interests
We release large pre-training datasets to accelerate open LLM development. Part of the Hugging Face Science team (hf.co/science).
Papers
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale