RL4Reasoning (RL4Reasoning)
AI & ML interests None defined yet.
Team members 3
models 39
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-8192-rtl-cliphigh-hf-1.5B-2_deepscaler_-390 Updated Apr 22, 2025
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-l4096-cliphigh-hf-1.5B-4_deepscaler_-220 2B • Updated Apr 22, 2025 • 2
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-cliphigh-hf-1.5B-4_deepscaler_-390 2B • Updated Apr 22, 2025 • 2
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-2048-rtl-cliphigh-hf-1.5B-4_deepscaler_-340 2B • Updated Apr 22, 2025 • 7
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-4096-rtl-cliphigh-hf-1.5B-4_deepscaler_-140 2B • Updated Apr 22, 2025 • 3
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-8192-rtl-cliphigh-hf-1.5B-4_deepscaler_-390 2B • Updated Apr 22, 2025 • 4
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-l4096-cliphigh-hf-1.5B-4_deepscaler_-320 2B • Updated Apr 22, 2025 • 2
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-cliphigh-hf-1.5B-4_deepscaler_-460 2B • Updated Apr 22, 2025 • 2
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-l1024-cliphigh-hf-1.5B-4_deepscaler_-430 2B • Updated Apr 22, 2025 • 2
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-l4096-cliphigh-hf-1.5B-4_deepscaler_-220 2B • Updated Apr 22, 2025 • 3