arxiv:2507.23751

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks

Published on Jul 31, 2025
Authors:
Ping Yu, Jack Lanchantin, Tianlu Wang, Weizhe Yuan, Olga Golovneva, Ilia Kulikov, Sainbayar Sukhbaatar, Jason Weston, Jing Xu

Abstract

We propose CoT-Self-Instruct, a synthetic data generation method that instructs LLMs to first reason and plan via Chain-of-Thought (CoT) based on the given seed tasks, and then to generate a new synthetic prompt of similar quality and complexity for use in LLM training, followed by filtering for high-quality data with automatic metrics. In verifiable reasoning, our synthetic data significantly outperforms existing training datasets, such as s1k and OpenMathReasoning, across MATH500, AMC23, AIME24 and GPQA-Diamond. For non-verifiable instruction-following tasks, our method surpasses the performance of human or standard self-instruct prompts on both AlpacaEval 2.0 and Arena-Hard.

AI-generated summary

CoT-Self-Instruct generates high-quality synthetic data for LLM training by leveraging Chain-of-Thought reasoning and automatic filtering, outperforming existing datasets in both verifiable reasoning and instruction-following tasks.
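
The generate-then-filter pipeline described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `complete` function stands in for whatever LLM backend is used, `COT_TEMPLATE` is a hypothetical prompt, and the majority-vote consistency check is only one plausible instance of the paper's "automatic metrics" for verifiable tasks.

```python
# Minimal sketch of a CoT-Self-Instruct-style pipeline (illustrative only).
# Assumptions: complete() wraps some LLM completion API; the CoT template,
# sample counts, and the consistency filter are placeholders, not the paper's code.
import random
from collections import Counter

COT_TEMPLATE = (
    "Here are example tasks:\n{seeds}\n\n"
    "First, reason step by step about what makes these tasks high quality "
    "(topic, difficulty, required reasoning). Then write ONE new task of "
    "similar quality and complexity.\nNew task:"
)

def complete(prompt: str) -> str:
    """Placeholder for an LLM call; plug in your own backend here."""
    raise NotImplementedError

def generate_prompts(seed_pool: list[str], n: int, k_seeds: int = 2) -> list[str]:
    """Sample seed tasks, have the LLM reason via CoT, and emit a new synthetic prompt."""
    new_prompts = []
    for _ in range(n):
        seeds = "\n".join(random.sample(seed_pool, k_seeds))
        reply = complete(COT_TEMPLATE.format(seeds=seeds))
        new_prompts.append(reply.split("New task:")[-1].strip())
    return new_prompts

def keep_if_consistent(prompt: str, n_samples: int = 8, threshold: float = 0.5) -> bool:
    """Answer-consistency-style filter for verifiable tasks: sample several solutions
    and keep the prompt only if a majority agree on the same final answer."""
    answers = [complete(f"{prompt}\nAnswer:").strip() for _ in range(n_samples)]
    _answer, count = Counter(answers).most_common(1)[0]
    return count / n_samples >= threshold
```

For non-verifiable instruction-following prompts, the same skeleton applies, but the filtering step would score candidates with a reward or judge model instead of checking answer agreement.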

Community

For more information, check out the arXiv page for this paper: arXiv:2507.23751.

An explainer of this paper is also available at https://arxivexplained.com/papers/cot-self-instruct-building-high-quality-synthetic-prompts-for-reasoning-and-non-reasoning-tasks.

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2507.23751 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 1

Collections including this paper 1