AIPlans (AI Plans)
Reinforcement Learning • 0.6B • Updated Dec 22, 2025 • 8 • 2
AIPlans/Qwen3-0.6B-GRPO-RM_NVIDIA • Text Generation • 0.6B • Updated Dec 20, 2025 • 2
AIPlans/Qwen3-0.6B-GRPO_Epoch2 • Text Generation • 0.6B • Updated Dec 18, 2025 • 2
AIPlans/Qwen3-0.6B-GRPO_Epoch1 • Text Generation • 0.6B • Updated Dec 18, 2025 • 3
Reinforcement Learning • 0.6B • Updated Dec 12, 2025 • 10
AIPlans/qwen3-0.6b-base-PPO-hs2 • Updated Dec 11, 2025
AIPlans/Qwen3-0.6B-DPO_Epoch_1 • Text Generation • 0.6B • Updated Dec 8, 2025 • 3
AIPlans/Qwen3-0.6B-SFT-hs2 • Text Generation • 0.6B • Updated Dec 4, 2025 • 6
AIPlans/Qwen3-0.6B-RM-hs2 • Text Classification • 0.6B • Updated Dec 1, 2025 • 1 • 1
Text Generation • Updated Nov 28, 2025 • 4
AIPlans/Qwen3-0.6B-DPO_NOTLORA • Text Generation • 0.6B • Updated Nov 25, 2025 • 2
Text Generation • Updated Nov 22, 2025 • 4 • 1
Text Generation • Updated Nov 22, 2025 • 1
AIPlans/qwen3-0.6b-hh-rlhf-sft • 0.6B • Updated Nov 17, 2025
AIPlans/Qwen3-0.6B-KTO_trial • Text Generation • 0.6B • Updated Nov 10, 2025 • 2 • 1
AIPlans/qwen3-0.6b-sft-hh-rlhf-lora • Updated Oct 24, 2025
AIPlans/qwen3-0.6b-base-PPO-PM
AIPlans/qwen3-0.6b-base-hl-RM • Text Classification • 0.6B • Updated Sep 27, 2025
0.6B • Updated Sep 24, 2025
AIPlans/qwen3-0.6b-dpo-lora • Text Generation • 0.6B • Updated Sep 18, 2025 • 2 • 1
AIPlans/qwen3-0.6B-reward-hh-rlhf • Text Generation • 0.6B • Updated Sep 13, 2025 • 1
AIPlans/qwen3-8b-ipo-hh-rlhf • Text Generation • Updated Jul 17, 2025
AIPlans/qwen3-8b-dpo-hh-rlhf • Updated Jul 4, 2025
AIPlans/Qwen3-HHH-Cipher-Eng • Text Generation • 0.6B • Updated Jun 15, 2025 • 13
AIPlans/Qwen-HHH-Cipher-Eng • Text Generation • 0.5B • Updated Jun 14, 2025 • 2
AIPlans/Qwen-HHH-Sans-Eng • Text Generation • 0.5B • Updated Jun 11, 2025 • 1