Dahoas (Alex Havrilla)
78 followers · 4 following
AI & ML interests
NLP, RL
Organizations
Illustrating Reinforcement Learning from Human Feedback (RLHF)
Viewer • Updated Jan 29, 2025 • 12.5k • 20
Viewer • Updated Dec 23, 2024 • 361k • 4
Dahoas/aimo-validation-aime
Viewer • Updated Dec 11, 2024 • 90 • 2
Dahoas/qwen-1.5-4B-default-positives-epoch-1-100
Viewer • Updated Dec 6, 2024 • 290k • 7
Dahoas/qwen-1.5-4B-tree-positives-epoch-2-100
Viewer • Updated Dec 6, 2024 • 491k • 4
Dahoas/qwen-1.5-4B-tree-positives-epoch-1-100
Viewer • Updated Dec 5, 2024 • 477k • 7
Dahoas/qwen-1.5-4B-epoch-1-test-100
Viewer • Updated Nov 28, 2024 • 498k • 7
Dahoas/qwen-1.5-4B-K-100-test
Viewer • Updated Nov 5, 2024 • 500k • 25
Dahoas/MATH_train_K_100_qwen_1.5_4B_outputs
Viewer • Updated Oct 22, 2024 • 750k • 16
Viewer • Updated Sep 12, 2024 • 750k • 7 • 2