Paper page - Typhoon-S: Minimal Open Post-Training for Sovereign Large Language Models

Code: https://github.com/scb-10x/typhoon-s
Artifact: https://huggingface.co/collections/typhoon-ai/typhoon-s
Project page: https://opentyphoon.ai/model/typhoon-s

\n","updatedAt":"2026-01-30T03:03:47.583Z","author":{"_id":"62d192c2d50433c35eb1b48e","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d192c2d50433c35eb1b48e/VjmDu8GOIuLuQNBQdQLLS.png","fullname":"Kunat Pipatanakul","name":"kunato","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":15,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.7344342470169067},"editors":["kunato"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/62d192c2d50433c35eb1b48e/VjmDu8GOIuLuQNBQdQLLS.png"],"reactions":[],"isReport":false}},{"id":"697c1fd65b5b9ce106e1d4b2","author":{"_id":"62d192c2d50433c35eb1b48e","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d192c2d50433c35eb1b48e/VjmDu8GOIuLuQNBQdQLLS.png","fullname":"Kunat Pipatanakul","name":"kunato","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":15,"isUserFollowing":false},"createdAt":"2026-01-30T03:04:54.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Large language models (LLMs) have progressed rapidly; however, most state-of-the-art models are\r\ntrained and evaluated primarily in high-resource languages such as English and Chinese. In addition, they are often developed by a small number of organizations with access to large-scale compute\r\nand data. This gatekeeping creates a practical barrier for sovereign settings in which a regional- or\r\nnational-scale institution or domain owner must retain control and understanding of model weights,\r\ntraining data, and deployment while operating under limited resources and strict transparency constraints. To this end, we identify two core requirements: (1) adoptability, the ability to transform\r\na base model into a general-purpose assistant, and (2) sovereign capability, the ability to perform\r\nhigh-stakes, region-specific tasks (e.g., legal reasoning in local languages and cultural knowledge).\r\nWe investigate whether these requirements can be achieved without scaling massive general-purpose\r\ninstruction corpora or relying on complex preference tuning pipelines and large-scale reinforcement\r\nfine-tuning (RFT). We present Typhoon S, a minimal and open post-training recipe that combines supervised fine-tuning, on-policy distillation, and small-scale RFT stages. Using Thai as a\r\nrepresentative case study, we demonstrate that our approach successfully addresses adoptability by\r\ntransforming both sovereign-adapted and general-purpose base models into instruction-tuned models\r\nwith strong general performance. We further show that small-scale RFT with InK-GRPO–an extension of GRPO that augments the GRPO loss with a next-word prediction loss–enables sovereign\r\ncapability by improving Thai legal reasoning and Thai-specific knowledge while preserving general\r\ncapabilities. Our results suggest that a carefully designed post-training strategy can reduce the\r\nrequired scale of instruction data and computation, providing a practical path toward high-quality\r\nsovereign LLMs under academic-scale resources (approximately two days of 8-GPU training for an\r\n8B model for adoptability, and one day of 4-GPU training for sovereign capability).","html":"

Large language models (LLMs) have progressed rapidly; however, most state-of-the-art models are
trained and evaluated primarily in high-resource languages such as English and Chinese. In addition, they are often developed by a small number of organizations with access to large-scale compute
and data. This gatekeeping creates a practical barrier for sovereign settings in which a regional- or
national-scale institution or domain owner must retain control and understanding of model weights,
training data, and deployment while operating under limited resources and strict transparency constraints. To this end, we identify two core requirements: (1) adoptability, the ability to transform
a base model into a general-purpose assistant, and (2) sovereign capability, the ability to perform
high-stakes, region-specific tasks (e.g., legal reasoning in local languages and cultural knowledge).
We investigate whether these requirements can be achieved without scaling massive general-purpose
instruction corpora or relying on complex preference tuning pipelines and large-scale reinforcement
fine-tuning (RFT). We present Typhoon S, a minimal and open post-training recipe that combines supervised fine-tuning, on-policy distillation, and small-scale RFT stages. Using Thai as a
representative case study, we demonstrate that our approach successfully addresses adoptability by
transforming both sovereign-adapted and general-purpose base models into instruction-tuned models
with strong general performance. We further show that small-scale RFT with InK-GRPO–an extension of GRPO that augments the GRPO loss with a next-word prediction loss–enables sovereign
capability by improving Thai legal reasoning and Thai-specific knowledge while preserving general
capabilities. Our results suggest that a carefully designed post-training strategy can reduce the
required scale of instruction data and computation, providing a practical path toward high-quality
sovereign LLMs under academic-scale resources (approximately two days of 8-GPU training for an
8B model for adoptability, and one day of 4-GPU training for sovereign capability).

\n","updatedAt":"2026-01-30T03:04:54.072Z","author":{"_id":"62d192c2d50433c35eb1b48e","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d192c2d50433c35eb1b48e/VjmDu8GOIuLuQNBQdQLLS.png","fullname":"Kunat Pipatanakul","name":"kunato","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":15,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9209707379341125},"editors":["kunato"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/62d192c2d50433c35eb1b48e/VjmDu8GOIuLuQNBQdQLLS.png"],"reactions":[],"isReport":false}},{"id":"697d2fc27b4ea8733c8a1332","author":{"_id":"65243980050781c16f234f1f","avatarUrl":"/avatars/743a009681d5d554c27e04300db9f267.svg","fullname":"Avi","name":"avahal","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":3,"isUserFollowing":false},"createdAt":"2026-01-30T22:25:06.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"arXivLens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/typhoon-s-minimal-open-post-training-for-sovereign-large-language-models-304-d267c302\n- Executive Summary\n- Detailed Breakdown\n- Practical Applications","html":"

arXivLens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/typhoon-s-minimal-open-post-training-for-sovereign-large-language-models-304-d267c302

\n
    \n
  • Executive Summary
  • \n
  • Detailed Breakdown
  • \n
  • Practical Applications
  • \n
\n","updatedAt":"2026-01-30T22:25:06.290Z","author":{"_id":"65243980050781c16f234f1f","avatarUrl":"/avatars/743a009681d5d554c27e04300db9f267.svg","fullname":"Avi","name":"avahal","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":3,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.73252934217453},"editors":["avahal"],"editorAvatarUrls":["/avatars/743a009681d5d554c27e04300db9f267.svg"],"reactions":[],"isReport":false}},{"id":"697d5d061876f169967679fa","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false},"createdAt":"2026-01-31T01:38:14.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper. \n\nThe following papers were recommended by the Semantic Scholar API \n\n* [SiamGPT: Quality-First Fine-Tuning for Stable Thai Text Generation](https://huggingface.co/papers/2512.19455) (2025)\n* [Gamayun's Path to Multilingual Mastery: Cost-Efficient Training of a 1.5B-Parameter LLM](https://huggingface.co/papers/2512.21580) (2025)\n* [A.X K1 Technical Report](https://huggingface.co/papers/2601.09200) (2026)\n* [AfriqueLLM: How Data Mixing and Model Architecture Impact Continued Pre-training for African Languages](https://huggingface.co/papers/2601.06395) (2026)\n* [BYOL: Bring Your Own Language Into LLMs](https://huggingface.co/papers/2601.10804) (2026)\n* [Kakugo: Distillation of Low-Resource Languages into Small Language Models](https://huggingface.co/papers/2601.14051) (2026)\n* [MiniLingua: A Small Open-Source LLM for European Languages](https://huggingface.co/papers/2512.13298) (2025)\n\n\n Please give a thumbs up to this comment if you found it helpful!\n\n If you want recommendations for any Paper on Hugging Face checkout [this](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers) Space\n\n You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`","html":"

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

\n

The following papers were recommended by the Semantic Scholar API

\n\n

Please give a thumbs up to this comment if you found it helpful!

\n

If you want recommendations for any Paper on Hugging Face checkout this Space

\n

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: \n\n@librarian-bot\n\t recommend

\n","updatedAt":"2026-01-31T01:38:14.911Z","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.724073052406311},"editors":["librarian-bot"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2601.18129","authors":[{"_id":"6979f992df3e800774f139a0","name":"Kunat Pipatanakul","hidden":false},{"_id":"6979f992df3e800774f139a1","user":{"_id":"615313b0793ef66b3324da1f","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/615313b0793ef66b3324da1f/VyJniD3dxbV5a2CMgVVQ2.jpeg","isPro":false,"fullname":"Pittawat Taveekitworachai","user":"pittawat","type":"user"},"name":"Pittawat Taveekitworachai","status":"claimed_verified","statusLastChangedAt":"2026-01-28T14:41:40.587Z","hidden":false}],"publishedAt":"2026-01-26T04:20:59.000Z","submittedOnDailyAt":"2026-01-30T00:34:54.063Z","title":"Typhoon-S: Minimal Open Post-Training for Sovereign Large Language Models","submittedOnDailyBy":{"_id":"62d192c2d50433c35eb1b48e","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d192c2d50433c35eb1b48e/VjmDu8GOIuLuQNBQdQLLS.png","isPro":false,"fullname":"Kunat Pipatanakul","user":"kunato","type":"user"},"summary":"Large language models (LLMs) have progressed rapidly; however, most state-of-the-art models are trained and evaluated primarily in high-resource languages such as English and Chinese, and are often developed by a small number of organizations with access to large-scale compute and data. This gatekeeping creates a practical barrier for sovereign settings in which a regional- or national-scale institution or domain owner must retain control and understanding of model weights, training data, and deployment while operating under limited resources and strict transparency constraints. To this end, we identify two core requirements: (1) adoptability, the ability to transform a base model into a general-purpose assistant, and (2) sovereign capability, the ability to perform high-stakes, region-specific tasks (e.g., legal reasoning in local languages and cultural knowledge). We investigate whether these requirements can be achieved without scaling massive instruction corpora or relying on complex preference tuning pipelines and large-scale reinforcement fine-tuning (RFT). We present Typhoon S, a minimal and open post-training recipe that combines supervised fine-tuning, on-policy distillation, and small-scale RFT. Using Thai as a representative case study, we demonstrate that our approach transforms both sovereign-adapted and general-purpose base models into instruction-tuned models with strong general performance. We further show that small-scale RFT with InK-GRPO -- an extension of GRPO that augments the GRPO loss with a next-word prediction loss -- improves Thai legal reasoning and Thai-specific knowledge while preserving general capabilities. 
Our results suggest that a carefully designed post-training strategy can reduce the required scale of instruction data and computation, providing a practical path toward high-quality sovereign LLMs under academic-scale resources.","upvotes":11,"discussionId":"6979f993df3e800774f139a2","projectPage":"https://opentyphoon.ai/model/typhoon-s","githubRepo":"https://github.com/scb-10x/typhoon-s","githubRepoAddedBy":"user","ai_summary":"A minimal post-training approach using supervised fine-tuning, on-policy distillation, and small-scale reinforcement fine-tuning enables the development of high-quality sovereign language models with reduced resource requirements.","ai_keywords":["supervised fine-tuning","on-policy distillation","reinforcement fine-tuning","GRPO","InK-GRPO","instruction tuning","sovereign language models","minimal post-training recipe"],"githubStars":8,"organization":{"_id":"63e9cdf9dd2c4effdd6d39c0","name":"typhoon-ai","fullname":"Typhoon","avatar":"https://cdn-uploads.huggingface.co/production/uploads/679c6a57a3d5c3ba94fb1289/13sACxi2PL23wCeKHzwrF.jpeg"}},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"615313b0793ef66b3324da1f","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/615313b0793ef66b3324da1f/VyJniD3dxbV5a2CMgVVQ2.jpeg","isPro":false,"fullname":"Pittawat Taveekitworachai","user":"pittawat","type":"user"},{"_id":"62d192c2d50433c35eb1b48e","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d192c2d50433c35eb1b48e/VjmDu8GOIuLuQNBQdQLLS.png","isPro":false,"fullname":"Kunat Pipatanakul","user":"kunato","type":"user"},{"_id":"6463554dd2044cd1d7c6e0bf","avatarUrl":"/avatars/d7653623117268c545a7063fec69664b.svg","isPro":false,"fullname":"Bingzheng Wei","user":"Bingzheng","type":"user"},{"_id":"64030afa56038547951c6114","avatarUrl":"/avatars/79bde57576e75815e1ba383c3bd2eea9.svg","isPro":false,"fullname":"Surapon Nonesung","user":"Suraponn","type":"user"},{"_id":"63f6a050b4c9a104f4b95755","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63f6a050b4c9a104f4b95755/eJQyJkenSz536j-EGcpkH.jpeg","isPro":true,"fullname":"Potsawee Manakul","user":"potsawee","type":"user"},{"_id":"679c6a57a3d5c3ba94fb1289","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/679c6a57a3d5c3ba94fb1289/JplvdbgQoL_FC5Xm3hZar.jpeg","isPro":true,"fullname":"OpenTyphoon","user":"opentyphoon","type":"user"},{"_id":"64ab758aeb47b35522d23eb6","avatarUrl":"/avatars/09e1071629aa923f0baa9d55ce13364b.svg","isPro":false,"fullname":"Sittipong","user":"pongtsu","type":"user"},{"_id":"63d371bcd0b503b7f239ef9d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63d371bcd0b503b7f239ef9d/RCHFx6Y6fnh0nZYyF52sS.jpeg","isPro":true,"fullname":"Sirichotedumrong","user":"Warit","type":"user"},{"_id":"63082bb7bc0a2a5ee2253523","avatarUrl":"/avatars/6cf8d12d16d15db1070fbea89b5b3967.svg","isPro":false,"fullname":"Kuo-Hsin Tu","user":"dapumptu","type":"user"},{"_id":"6669d8938b6feadc10eb0472","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/mqXvBryBWC2Yj50lYPlKJ.png","isPro":false,"fullname":"Oravee Smithiphol","user":"ornsmith","type":"user"},{"_id":"64705d3890482b0e0f6591ed","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64705d3890482b0e0f6591ed/PFJs66YhXDogcreVHH1OL.png","isPro":true,"fullname":"Natapong Nitarach 
(Schwyter)","user":"natnitaract","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":0,"organization":{"_id":"63e9cdf9dd2c4effdd6d39c0","name":"typhoon-ai","fullname":"Typhoon","avatar":"https://cdn-uploads.huggingface.co/production/uploads/679c6a57a3d5c3ba94fb1289/13sACxi2PL23wCeKHzwrF.jpeg"}}">
arxiv:2601.18129

Typhoon-S: Minimal Open Post-Training for Sovereign Large Language Models

Published on Jan 26 · Submitted by Kunat Pipatanakul on Jan 30
Authors: Kunat Pipatanakul, Pittawat Taveekitworachai

Abstract

AI-generated summary

A minimal post-training approach using supervised fine-tuning, on-policy distillation, and small-scale reinforcement fine-tuning enables the development of high-quality sovereign language models with reduced resource requirements.

Large language models (LLMs) have progressed rapidly; however, most state-of-the-art models are trained and evaluated primarily in high-resource languages such as English and Chinese, and are often developed by a small number of organizations with access to large-scale compute and data. This gatekeeping creates a practical barrier for sovereign settings in which a regional- or national-scale institution or domain owner must retain control and understanding of model weights, training data, and deployment while operating under limited resources and strict transparency constraints. To this end, we identify two core requirements: (1) adoptability, the ability to transform a base model into a general-purpose assistant, and (2) sovereign capability, the ability to perform high-stakes, region-specific tasks (e.g., legal reasoning in local languages and cultural knowledge). We investigate whether these requirements can be achieved without scaling massive instruction corpora or relying on complex preference tuning pipelines and large-scale reinforcement fine-tuning (RFT). We present Typhoon S, a minimal and open post-training recipe that combines supervised fine-tuning, on-policy distillation, and small-scale RFT. Using Thai as a representative case study, we demonstrate that our approach transforms both sovereign-adapted and general-purpose base models into instruction-tuned models with strong general performance. We further show that small-scale RFT with InK-GRPO -- an extension of GRPO that augments the GRPO loss with a next-word prediction loss -- improves Thai legal reasoning and Thai-specific knowledge while preserving general capabilities. Our results suggest that a carefully designed post-training strategy can reduce the required scale of instruction data and computation, providing a practical path toward high-quality sovereign LLMs under academic-scale resources.
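
The abstract describes InK-GRPO only at a high level: the GRPO objective augmented with a next-word prediction loss. As a rough illustration of what that combination could look like, here is a minimal PyTorch sketch; it is not the authors' implementation, and the function names, the simplified policy-gradient term (no importance ratios, clipping, or KL penalty), and the `ntp_weight` coefficient are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    # GRPO computes advantages by normalizing rewards within a group
    # of G rollouts sampled for the same prompt.
    return (rewards - rewards.mean()) / (rewards.std() + 1e-6)

def ink_grpo_loss(
    rollout_logprobs: torch.Tensor,  # (G, T) policy log-probs of sampled rollout tokens
    rewards: torch.Tensor,           # (G,) scalar reward per rollout
    kb_logits: torch.Tensor,         # (B, T, V) policy logits on a batch of in-domain text
    kb_targets: torch.Tensor,        # (B, T) next-token targets for that text
    ntp_weight: float = 0.1,         # hypothetical mixing coefficient
) -> torch.Tensor:
    # Policy-gradient term: advantage-weighted log-likelihood of rollouts.
    # (Real GRPO also uses importance ratios, clipping, and a KL penalty;
    # those are omitted here to keep the sketch readable.)
    adv = grpo_advantages(rewards).unsqueeze(-1)  # (G, 1)
    pg_loss = -(adv * rollout_logprobs).mean()

    # Next-word prediction term: plain cross-entropy on in-domain text,
    # which is what the abstract says InK-GRPO adds to the GRPO loss.
    ntp_loss = F.cross_entropy(
        kb_logits.reshape(-1, kb_logits.size(-1)),
        kb_targets.reshape(-1),
    )
    return pg_loss + ntp_weight * ntp_loss
```

Under this reading, the cross-entropy term keeps injecting in-domain (e.g., Thai legal) text during RFT while the policy-gradient term optimizes task reward, which would be consistent with the paper's claim of improving region-specific knowledge without losing general capability.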

Community

Paper submitter

Large language models (LLMs) have progressed rapidly; however, most state-of-the-art models are trained and evaluated primarily in high-resource languages such as English and Chinese. In addition, they are often developed by a small number of organizations with access to large-scale compute and data. This gatekeeping creates a practical barrier for sovereign settings in which a regional- or national-scale institution or domain owner must retain control and understanding of model weights, training data, and deployment while operating under limited resources and strict transparency constraints. To this end, we identify two core requirements: (1) adoptability, the ability to transform a base model into a general-purpose assistant, and (2) sovereign capability, the ability to perform high-stakes, region-specific tasks (e.g., legal reasoning in local languages and cultural knowledge). We investigate whether these requirements can be achieved without scaling massive general-purpose instruction corpora or relying on complex preference-tuning pipelines and large-scale reinforcement fine-tuning (RFT). We present Typhoon S, a minimal and open post-training recipe that combines supervised fine-tuning, on-policy distillation, and small-scale RFT stages. Using Thai as a representative case study, we demonstrate that our approach successfully addresses adoptability by transforming both sovereign-adapted and general-purpose base models into instruction-tuned models with strong general performance. We further show that small-scale RFT with InK-GRPO, an extension of GRPO that augments the GRPO loss with a next-word prediction loss, enables sovereign capability by improving Thai legal reasoning and Thai-specific knowledge while preserving general capabilities. Our results suggest that a carefully designed post-training strategy can reduce the required scale of instruction data and computation, providing a practical path toward high-quality sovereign LLMs under academic-scale resources (approximately two days of 8-GPU training for an 8B model for adoptability, and one day of 4-GPU training for sovereign capability).
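
The on-policy distillation stage in this recipe is likewise named but not specified here. In general (and this is a generic sketch, not the paper's exact procedure), on-policy distillation samples completions from the student and minimizes a per-token divergence to a teacher's distribution on those student-sampled sequences. A minimal sketch, assuming Hugging Face-style causal LMs and reverse KL as the divergence:

```python
import torch
import torch.nn.functional as F

def on_policy_distill_loss(student, teacher, prompt_ids: torch.Tensor) -> torch.Tensor:
    # 1) Sample from the *student*, so the training distribution matches
    #    what the student actually generates (the "on-policy" part).
    with torch.no_grad():
        seq = student.generate(prompt_ids, max_new_tokens=256, do_sample=True)

    # 2) Score the sampled sequence under both models. Logits at position t
    #    predict token t+1, hence the [:, :-1] shift.
    student_logits = student(seq).logits[:, :-1, :]
    with torch.no_grad():
        teacher_logits = teacher(seq).logits[:, :-1, :]

    s_logp = F.log_softmax(student_logits, dim=-1)
    t_logp = F.log_softmax(teacher_logits, dim=-1)

    # 3) Reverse KL(student || teacher) at every position, averaged over
    #    the sequence. A real implementation would also mask the prompt
    #    positions so that only generated tokens are distilled.
    kl = (s_logp.exp() * (s_logp - t_logp)).sum(dim=-1)  # (B, T-1)
    return kl.mean()
```

Because supervision comes token-by-token from the teacher on the student's own outputs, a stage like this is typically far cheaper than a full RLHF-style pipeline, which fits the academic-scale compute budget quoted above.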

arXivLens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/typhoon-s-minimal-open-post-training-for-sovereign-large-language-models-304-d267c302

  • Executive Summary
  • Detailed Breakdown
  • Practical Applications

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

  • SiamGPT: Quality-First Fine-Tuning for Stable Thai Text Generation (https://huggingface.co/papers/2512.19455) (2025)
  • Gamayun's Path to Multilingual Mastery: Cost-Efficient Training of a 1.5B-Parameter LLM (https://huggingface.co/papers/2512.21580) (2025)
  • A.X K1 Technical Report (https://huggingface.co/papers/2601.09200) (2026)
  • AfriqueLLM: How Data Mixing and Model Architecture Impact Continued Pre-training for African Languages (https://huggingface.co/papers/2601.06395) (2026)
  • BYOL: Bring Your Own Language Into LLMs (https://huggingface.co/papers/2601.10804) (2026)
  • Kakugo: Distillation of Low-Resource Languages into Small Language Models (https://huggingface.co/papers/2601.14051) (2026)
  • MiniLingua: A Small Open-Source LLM for European Languages (https://huggingface.co/papers/2512.13298) (2025)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Models citing this paper 3

Datasets citing this paper 2

Spaces citing this paper 0

Collections including this paper 1