arxiv:2407.04078

DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning

Published on Jul 4, 2024 · Submitted by KABI on Jul 8, 2024

Abstract

AI-generated summary

DotaMath, a series of LLMs using decomposition with code assistance and self-correction, achieves high performance on complex mathematical tasks through fine-tuning on a large dataset of interactive tool use.

Large language models (LLMs) have made impressive progress in handling simple math problems, yet they still struggle with more challenging and complex mathematical tasks. In this paper, we introduce a series of LLMs that employ Decomposition of Thought with code assistance and self-correction for mathematical reasoning, dubbed DotaMath. DotaMath models tackle complex mathematical tasks by decomposing them into simpler logical subtasks, leveraging code to solve these subtasks, obtaining fine-grained feedback from the code interpreter, and engaging in self-reflection and correction. By annotating diverse interactive tool-use trajectories and employing query evolution on the GSM8K and MATH datasets, we generate an instruction fine-tuning dataset called DotaMathQA with 574K query-response pairs. We train a series of base LLMs on DotaMathQA using imitation learning, resulting in DotaMath models that achieve remarkable performance compared to open-source LLMs across various in-domain and out-of-domain benchmarks. Notably, DotaMath-deepseek-7B achieves an outstanding 64.8% on the competitive MATH dataset and 86.7% on GSM8K. Moreover, DotaMath-deepseek-7B maintains strong competitiveness across a series of in-domain and out-of-domain benchmarks (avg. 80.1%). Looking forward, we anticipate that the DotaMath paradigm will open new pathways for addressing intricate mathematical problems. Our code is publicly available at https://github.com/ChengpengLi1003/DotaMath.
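
The abstract describes an iterative control flow: decompose the problem into simpler subtasks, generate code for each subtask, read the interpreter's feedback, and self-correct when execution fails. The Python sketch below illustrates that loop; `query_llm` and the prompt strings are hypothetical placeholders for illustration, not the paper's implementation, since DotaMath models are trained to produce such trajectories natively.

```python
# Minimal sketch of a DotaMath-style loop, assuming a generic LLM call.
# query_llm and all prompts are hypothetical stand-ins, not the paper's code.
import contextlib
import io

def query_llm(prompt: str) -> str:
    """Placeholder for any LLM completion call (e.g., a fine-tuned model)."""
    raise NotImplementedError

def run_code(code: str) -> str:
    """Execute generated Python and return stdout or the error message,
    i.e., the fine-grained feedback from the code interpreter."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
        return buf.getvalue()
    except Exception as exc:
        return f"Error: {exc!r}"

def solve(problem: str, max_corrections: int = 3) -> str:
    # Decomposition of thought: split the task into simpler logical subtasks.
    subtasks = query_llm(f"List the subtasks needed to solve:\n{problem}").splitlines()
    context = problem
    for subtask in subtasks:
        code = query_llm(f"Write Python that solves: {subtask}\nGiven:\n{context}")
        feedback = run_code(code)
        # Self-reflection and correction: revise while the interpreter reports errors.
        for _ in range(max_corrections):
            if not feedback.startswith("Error"):
                break
            code = query_llm(f"Fix this code:\n{code}\nInterpreter output:\n{feedback}")
            feedback = run_code(code)
        context += f"\n{subtask} -> {feedback.strip()}"
    # Aggregate the solved subtasks into a final answer.
    return query_llm(f"Using these intermediate results, state the final answer:\n{context}")
```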

Community

Paper author Paper submitter • edited Jul 8, 2024

In this paper, we introduce a series of LLMs that employ Decomposition of Thought with code assistance and self-correction for mathematical reasoning, dubbed DotaMath. DotaMath models tackle complex mathematical tasks by decomposing them into simpler logical subtasks, leveraging code to solve these subtasks, obtaining fine-grained feedback from the code interpreter, and engaging in self-reflection and correction. By annotating diverse interactive tool-use trajectories and employing query evolution on the GSM8K and MATH datasets, we generate an instruction fine-tuning dataset called DotaMathQA with 574K query-response pairs. We train a series of base LLMs on DotaMathQA using imitation learning, resulting in DotaMath models that achieve remarkable performance compared to open-source LLMs across various in-domain and out-of-domain benchmarks.

Notably, DotaMath-deepseek-7B achieves an outstanding 64.8% on the competitive MATH dataset and 86.7% on GSM8K. Moreover, DotaMath-deepseek-7B maintains strong competitiveness across a series of in-domain and out-of-domain benchmarks (avg. 80.1%). Looking forward, we anticipate that the DotaMath paradigm will open new pathways for addressing intricate mathematical problems. Our code is publicly available at https://github.com/ChengpengLi1003/DotaMath.
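
As a rough companion to the loop sketched under the abstract, the snippet below illustrates how a DotaMathQA-style query-response pair could be assembled via query evolution. The prompt and helper names (`query_llm`, `annotate_trajectory`) are assumptions made for illustration, not the authors' released pipeline.

```python
# Illustrative sketch of query evolution for DotaMathQA-style pairs.
# query_llm and the prompt text are assumptions, not the released pipeline.
def query_llm(prompt: str) -> str:
    """Placeholder for an LLM completion call."""
    raise NotImplementedError

def evolve_query(seed_question: str) -> str:
    """Rewrite a GSM8K/MATH seed question into a harder but solvable variant."""
    return query_llm(
        "Rewrite this math problem to be more complex while staying solvable:\n"
        + seed_question
    )

def build_qa_pair(seed_question: str, annotate_trajectory) -> dict:
    """Pair an evolved query with an annotated interactive tool-use trajectory,
    yielding one instruction fine-tuning example."""
    evolved = evolve_query(seed_question)
    return {"query": evolved, "response": annotate_trajectory(evolved)}
```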

Paper author Paper submitter

Code will be released at https://github.com/ChengpengLi1003/DotaMath

Niels Rogge (HF staff)

Hi @dongguanting thanks for submitting the paper! Feel free to claim the paper (by clicking on your author name at the top of this page), so that it appears on your HF profile.

Also, are you planning on releasing the model and dataset on Hugging Face?


Models citing this paper 0


Datasets citing this paper 1

Spaces citing this paper 0


Collections including this paper 5