LorMolf/SPSD-RL · Hugging Face

SPSD-RL

Qwen3-4B-Base full-parameter checkpoint trained with prompt/completion supervision on the LorMolf/SPSD-RL conversation dataset.
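As an illustration of what prompt/completion supervision commonly means, the sketch below builds training labels in which prompt tokens are masked out so that only completion tokens contribute to the loss. This is an assumption about the training setup, not something the card states. The helper name `build_labels` and the token ids are hypothetical; `-100` is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(prompt_ids, completion_ids):
    """Concatenate prompt and completion ids and mask the prompt in the labels.

    Hypothetical helper: assumes the loss is computed on completion tokens only.
    """
    input_ids = list(prompt_ids) + list(completion_ids)
    # Prompt positions get IGNORE_INDEX so they are skipped by the loss;
    # completion positions keep their token ids as targets.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels


# Made-up token ids, for illustration only.
input_ids, labels = build_labels([101, 102, 103], [201, 202])
```

With labels built this way, the model still sees the prompt as context but is only penalized for errors on the completion.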

Source artifact: outputs/qwen3_4b_base_spsd_rl_sft_prompt_completion_4gpu_20260603/final_bs20_accum4_ddp7200_wandb_localcache.

Downloads last month: 8
Format: Safetensors
Model size: 4B params
Tensor type: F32
