arxiv:2606.07379
Takashi Ishida
tksii
AI & ML interests
None yet
Recent Activity
upvoted a paper about 12 hours ago
CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies
upvoted a paper 14 days ago
Mitigating Reward Hacking in RLHF via Advantage Sign Robustness
authored a paper 15 days ago
Mitigating Reward Hacking in RLHF via Advantage Sign Robustness