arxiv:2601.08225

User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale

Published on Jan 13 · Submitted by Minbyul Jeong on Jan 14
Authors: Jungho Cho, Minbyul Jeong, Sungrae Park

Abstract

AI-generated summary

Large reasoning models enable scalable multi-turn dialogue generation through automated task-oriented simulation and user-oriented behavioral modeling for enhanced human-agent interaction datasets.

The recent paradigm shift toward large reasoning models (LRMs) as autonomous agents has intensified the demand for sophisticated, multi-turn tool-use capabilities. Yet, existing datasets and data-generation approaches are limited by static, predefined toolsets that cannot scale to the complexity of open-ended human-agent collaboration. To address this, we initially developed a framework for automated task-oriented multi-turn dialogue generation at scale, utilizing an LRM-based simulator to dynamically generate high-value, domain-specific tools to solve specified tasks. However, we observe that a purely task-oriented design often results in "solely task-solving" trajectories, where the agent completes the objective with minimal interaction, failing to generate the high turn-count conversations seen in realistic scenarios. To bridge this gap, we shift toward a user-oriented simulation paradigm. By decoupling task generation from a dedicated user simulator that mimics human behavioral rules, such as incremental request-making and turn-by-turn feedback, we facilitate more authentic, extended multi-turn dialogues that reflect the iterative nature of real-world problem solving. Our generation pipeline operates as a versatile, plug-and-play module capable of initiating generation from any state, ensuring high scalability in producing extended tool-use data. Furthermore, by facilitating multiple task completions within a single trajectory, it yields a high-density dataset that reflects the multifaceted demands of real-world human-agent interaction.
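The decoupled loop the abstract describes (a user simulator that reveals requests incrementally and gives turn-by-turn feedback, driving an agent toward extended trajectories) can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: every class and function name below (`UserSimulator`, `agent_reply`, `generate_trajectory`) is invented for this sketch, and the agent is a stub where the paper would use an LRM with dynamically generated tools.

```python
# Hypothetical sketch of a user-oriented simulation loop. All names are
# illustrative assumptions; the paper does not publish this code.
from dataclasses import dataclass


@dataclass
class UserSimulator:
    """Mimics human behavioral rules: incremental request-making
    and turn-by-turn feedback on the agent's previous reply."""
    requests: list  # sub-requests derived from one generated task
    cursor: int = 0

    def next_message(self, last_agent_reply):
        # End the dialogue once every incremental request has been made.
        if self.cursor >= len(self.requests):
            return None
        # Turn-by-turn feedback precedes each new request (stubbed here).
        feedback = "" if last_agent_reply is None else "Thanks. "
        msg = feedback + self.requests[self.cursor]
        self.cursor += 1
        return msg


def agent_reply(user_msg):
    # Stand-in for the LRM agent, which would call generated tools.
    return f"[tool call + answer for: {user_msg}]"


def generate_trajectory(requests):
    """Roll out one extended multi-turn dialogue: one task
    completion per incremental user request."""
    sim = UserSimulator(requests)
    trajectory, reply = [], None
    while (msg := sim.next_message(reply)) is not None:
        reply = agent_reply(msg)
        trajectory.append({"user": msg, "agent": reply})
    return trajectory


traj = generate_trajectory([
    "Book a flight to Berlin.",
    "Actually, make it a window seat.",
    "Also reserve a hotel near the venue.",
])
print(len(traj))  # → 3, one turn per incremental request
```

Because the simulator withholds later requests until earlier ones are answered, the trajectory length grows with the number of sub-requests rather than collapsing into a single "solely task-solving" turn, which is the behavioral contrast the abstract draws.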

Community

Paper submitter

While large language models have shown remarkable progress in tool use, maintaining high-quality, user-centric multi-turn conversations at scale remains a significant challenge.

Our work focuses on:
(1) Generating high-fidelity multi-turn dialogue datasets designed for practical tool-use scenarios.
(2) Enhancing model performance in complex, user-oriented interactions.
(3) Providing insights into scaling dialogue generation without compromising user experience.

Check out the full paper here: https://arxiv.org/abs/2601.08225

This sounds silly.

If you fine-tune the model for "high turn-count conversations", it will (literally) learn to converse about things it could have just answered instead.

LLMs don't have any thought process; they don't know what they know before predicting the next token.

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* Jenius Agent: Towards Experience-Driven Accuracy Optimization in Real-World Scenarios (2026): https://huggingface.co/papers/2601.01857
* RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction (2026): https://huggingface.co/papers/2601.06966
* SpeakRL: Synergizing Reasoning, Speaking, and Acting in Language Models with Reinforcement Learning (2025): https://huggingface.co/papers/2512.13159
* ToolGym: an Open-world Tool-using Environment for Scalable Agent Testing and Data Curation (2026): https://huggingface.co/papers/2601.06328
* TravelBench: A Broader Real-World Benchmark for Multi-Turn and Tool-Using Travel Planning (2025): https://huggingface.co/papers/2512.22673
* SimRPD: Optimizing Recruitment Proactive Dialogue Agents through Simulator-Based Data Evaluation and Selection (2026): https://huggingface.co/papers/2601.02871
* Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction (2025): https://huggingface.co/papers/2512.04987

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model linking this paper


Datasets citing this paper 0

No dataset linking this paper


Spaces citing this paper 0

No Space linking this paper


Collections including this paper 2