Paper page - Kanana: Compute-efficient Bilingual Language Models
Comment from Minho Ryu (bzantium), Feb 27, 2025:

models: https://huggingface.co/collections/kakaocorp/kanana-nano-21b-67a326cda1c449c8d4172259
github: https://github.com/kakao/kanana
Comment from Librarian Bot, Feb 28, 2025:

This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* [InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning](https://huggingface.co/papers/2502.11573) (2025)
* [UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings](https://huggingface.co/papers/2502.16961) (2025)
* [From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning](https://huggingface.co/papers/2501.11877) (2025)
* [Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging -- An Open Recipe](https://huggingface.co/papers/2502.09056) (2025)
* [Multilingual Language Model Pretraining using Machine-translated Data](https://huggingface.co/papers/2502.13252) (2025)
* [The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities](https://huggingface.co/papers/2501.13921) (2025)
* [Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study](https://huggingface.co/papers/2502.02481) (2025)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out [this Space](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers).

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`
\n","updatedAt":"2025-02-28T01:34:39.378Z","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.7262100577354431},"editors":["librarian-bot"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2502.18934","authors":[{"_id":"67bfe1bf4426925c82fe5953","name":"Kanana LLM Team","hidden":false},{"_id":"67bfe1bf4426925c82fe5954","user":{"_id":"64d08bd75de9e1e911b24226","avatarUrl":"/avatars/e572bb47659393573a0c1fb3d333dd7b.svg","isPro":false,"fullname":"Yunju Bak","user":"yunjubak63","type":"user"},"name":"Yunju Bak","status":"admin_assigned","statusLastChangedAt":"2025-02-27T12:55:35.505Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5955","name":"Hojin Lee","hidden":false},{"_id":"67bfe1bf4426925c82fe5956","user":{"_id":"60436d159e905013ae8715d7","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1623809612769-60436d159e905013ae8715d7.jpeg","isPro":false,"fullname":"Minho Ryu","user":"bzantium","type":"user"},"name":"Minho Ryu","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:17.979Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5957","user":{"_id":"66ebb4fdc5b2c25450fd17de","avatarUrl":"/avatars/e6b40dcbe2eba838ba21be9221758a3c.svg","isPro":false,"fullname":"Jiyeon Ham","user":"jiyeonham","type":"user"},"name":"Jiyeon Ham","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:11.786Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5958","name":"Seungjae Jung","hidden":false},{"_id":"67bfe1bf4426925c82fe5959","user":{"_id":"66c82a50c1b3c03c61aea140","avatarUrl":"/avatars/3c508f96bdca2f2ce9746d3decd4718e.svg","isPro":false,"fullname":"daniel nam","user":"daniel-rl2","type":"user"},"name":"Daniel Wontae Nam","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:09.613Z","hidden":false},{"_id":"67bfe1bf4426925c82fe595a","name":"Taegyeong Eo","hidden":false},{"_id":"67bfe1bf4426925c82fe595b","name":"Donghun Lee","hidden":false},{"_id":"67bfe1bf4426925c82fe595c","user":{"_id":"6142e17fe9e656d4459121e4","avatarUrl":"/avatars/6baebd4598a845ec7fdb735eb0d53139.svg","isPro":false,"fullname":"Doohae Jung","user":"Doohae","type":"user"},"name":"Doohae Jung","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:06.858Z","hidden":false},{"_id":"67bfe1bf4426925c82fe595d","user":{"_id":"60f559be68ee3ef098e407cf","avatarUrl":"/avatars/e1f00ff1c1c9fa7f591535d39c7d5e44.svg","isPro":false,"fullname":"Boseop Kim","user":"seopbo","type":"user"},"name":"Boseop Kim","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:01.989Z","hidden":false},{"_id":"67bfe1bf4426925c82fe595e","user":{"_id":"6605028007a154c768e1c4c7","avatarUrl":"/avatars/88678edb83fdb466067e38acd22d07de.svg","isPro":false,"fullname":"Nayeon Kim","user":"lana-ny","type":"user"},"name":"Nayeon Kim","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:13.867Z","hidden":false},{"_id":"67bfe1bf4426925c82fe595f","user":{"_id":"6136f65440e43b8f748a0833","avatarUrl":"/avatars/f72a5ae3d3e94485de8aed8df94abdad.svg","isPro":false,"fullname":"Jaesun 
Park","user":"jaesun","type":"user"},"name":"Jaesun Park","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:15.898Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5960","name":"Hyunho Kim","hidden":false},{"_id":"67bfe1bf4426925c82fe5961","user":{"_id":"5fd888cf61e46993190ce543","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1634604273263-5fd888cf61e46993190ce543.jpeg","isPro":false,"fullname":"Hyunwoong Ko","user":"hyunwoongko","type":"user"},"name":"Hyunwoong Ko","status":"admin_assigned","statusLastChangedAt":"2025-02-27T12:58:05.546Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5962","user":{"_id":"63d268bb57ab367124ea7b75","avatarUrl":"/avatars/11312cde1e9f077aa9e5103b48be5de6.svg","isPro":false,"fullname":"Changmin Lee","user":"changminlee","type":"user"},"name":"Changmin Lee","status":"claimed_verified","statusLastChangedAt":"2025-02-27T09:14:04.506Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5963","user":{"_id":"62bd31e1d2c8a6542f53fcba","avatarUrl":"/avatars/4ac18a7bcaf9dd3885b0478dea90818f.svg","isPro":false,"fullname":"Kyoung-Woon On","user":"kloud","type":"user"},"name":"Kyoung-Woon On","status":"admin_assigned","statusLastChangedAt":"2025-02-27T12:58:11.269Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5964","name":"Seulye Baeg","hidden":false},{"_id":"67bfe1bf4426925c82fe5965","name":"Junrae Cho","hidden":false},{"_id":"67bfe1bf4426925c82fe5966","user":{"_id":"65e30342e8b017ee1384824c","avatarUrl":"/avatars/e5d07b037f611ccfaf719959d971d102.svg","isPro":false,"fullname":"Sunghee Jung","user":"hash2430","type":"user"},"name":"Sunghee Jung","status":"claimed_verified","statusLastChangedAt":"2025-04-03T08:29:34.214Z","hidden":false},{"_id":"67bfe1bf4426925c82fe5967","name":"Jieun Kang","hidden":false},{"_id":"67bfe1bf4426925c82fe5968","name":"EungGyun Kim","hidden":false},{"_id":"67bfe1bf4426925c82fe5969","name":"Eunhwa Kim","hidden":false},{"_id":"67bfe1bf4426925c82fe596a","name":"Byeongil Ko","hidden":false},{"_id":"67bfe1bf4426925c82fe596b","name":"Daniel Lee","hidden":false},{"_id":"67bfe1bf4426925c82fe596c","name":"Minchul Lee","hidden":false},{"_id":"67bfe1bf4426925c82fe596d","name":"Miok Lee","hidden":false},{"_id":"67bfe1bf4426925c82fe596e","name":"Shinbok Lee","hidden":false},{"_id":"67bfe1bf4426925c82fe596f","user":{"_id":"63148a8f5f47a18962765802","avatarUrl":"/avatars/bc58a863727794006dddf758efa09411.svg","isPro":false,"fullname":"gaeunseo","user":"gaeunseo","type":"user"},"name":"Gaeun Seo","status":"admin_assigned","statusLastChangedAt":"2025-02-27T12:59:39.670Z","hidden":false}],"publishedAt":"2025-02-26T08:36:20.000Z","submittedOnDailyAt":"2025-02-27T01:35:13.440Z","title":"Kanana: Compute-efficient Bilingual Language Models","submittedOnDailyBy":{"_id":"60436d159e905013ae8715d7","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1623809612769-60436d159e905013ae8715d7.jpeg","isPro":false,"fullname":"Minho Ryu","user":"bzantium","type":"user"},"summary":"We introduce Kanana, a series of bilingual language models that demonstrate\nexceeding performance in Korean and competitive performance in English. The\ncomputational cost of Kanana is significantly lower than that of\nstate-of-the-art models of similar size. The report details the techniques\nemployed during pre-training to achieve compute-efficient yet competitive\nmodels, including high quality data filtering, staged pre-training, depth\nup-scaling, and pruning and distillation. 
Furthermore, the report outlines the\nmethodologies utilized during the post-training of the Kanana models,\nencompassing supervised fine-tuning and preference optimization, aimed at\nenhancing their capability for seamless interaction with users. Lastly, the\nreport elaborates on plausible approaches used for language model adaptation to\nspecific scenarios, such as embedding, retrieval augmented generation, and\nfunction calling. The Kanana model series spans from 2.1B to 32.5B parameters\nwith 2.1B models (base, instruct, embedding) publicly released to promote\nresearch on Korean language models.","upvotes":65,"discussionId":"67bfe1c04426925c82fe59a1","projectPage":"https://huggingface.co/kakaocorp","githubRepo":"https://github.com/kakao/kanana","githubRepoAddedBy":"auto","ai_summary":"Kanana, a series of bilingual language models, achieves superior performance in Korean and competitive performance in English with lower computational costs through efficient pre-training and post-training techniques.","ai_keywords":["high quality data filtering","staged pre-training","depth up-scaling","pruning","distillation","supervised fine-tuning","preference optimization","embedding","retrieval augmented generation","function calling"],"githubStars":279},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"6142e17fe9e656d4459121e4","avatarUrl":"/avatars/6baebd4598a845ec7fdb735eb0d53139.svg","isPro":false,"fullname":"Doohae Jung","user":"Doohae","type":"user"},{"_id":"66c82a50c1b3c03c61aea140","avatarUrl":"/avatars/3c508f96bdca2f2ce9746d3decd4718e.svg","isPro":false,"fullname":"daniel nam","user":"daniel-rl2","type":"user"},{"_id":"670dfa5f7625d0f5c6bbf61e","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/f5oVw_777mhNJ-dNE3gTC.png","isPro":false,"fullname":"chloe-py","user":"chloe-py","type":"user"},{"_id":"60436d159e905013ae8715d7","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1623809612769-60436d159e905013ae8715d7.jpeg","isPro":false,"fullname":"Minho Ryu","user":"bzantium","type":"user"},{"_id":"66c2ea881ea0a61c6cc0142e","avatarUrl":"/avatars/0f7f0bf1217be5be50198c33b3729db7.svg","isPro":false,"fullname":"wavy-jung","user":"wavy-jung","type":"user"},{"_id":"60f559be68ee3ef098e407cf","avatarUrl":"/avatars/e1f00ff1c1c9fa7f591535d39c7d5e44.svg","isPro":false,"fullname":"Boseop Kim","user":"seopbo","type":"user"},{"_id":"63be91f74a2beec6555f167f","avatarUrl":"/avatars/2b0f02acfa976e72b2d2166c96a49e3b.svg","isPro":false,"fullname":"Hojin Lee","user":"hjlee1371","type":"user"},{"_id":"66e0ef48e11ef4dcc10a3fbf","avatarUrl":"/avatars/4b2ee28e2e2d922cabcaaed6ccaea9f2.svg","isPro":false,"fullname":"juyoung","user":"michael-jy","type":"user"},{"_id":"66e0f1f39ba10fc995e9a8d4","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/66e0f1f39ba10fc995e9a8d4/U9uotrg3RSKvxLL1jTZtk.jpeg","isPro":false,"fullname":"jason.gk","user":"jason-gk","type":"user"},{"_id":"66e0f557dd56688c29635e3d","avatarUrl":"/avatars/b621ab445d8c014a934b6d7dff2f88e2.svg","isPro":false,"fullname":"peter.roh","user":"peterroh","type":"user"},{"_id":"66825fefe642442e0dd16ea1","avatarUrl":"/avatars/9835de139672e595fdf6e267774fdfe1.svg","isPro":false,"fullname":"lee 
jeehye","user":"jessie-e","type":"user"},{"_id":"6729dfd6286bcc483b618eed","avatarUrl":"/avatars/1d41f6490f6f45ddcf00086d2d5e9847.svg","isPro":false,"fullname":"KIMHOON","user":"hoonkim73","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":2}">
AI-generated summary: Kanana, a series of bilingual language models, achieves superior performance in Korean and competitive performance in English with lower computational costs through efficient pre-training and post-training techniques.

Abstract:
We introduce Kanana, a series of bilingual language models that demonstrate exceptional performance in Korean and competitive performance in English. The computational cost of Kanana is significantly lower than that of state-of-the-art models of similar size. The report details the techniques employed during pre-training to achieve compute-efficient yet competitive models, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning and distillation. Furthermore, the report outlines the methodologies used during post-training of the Kanana models, encompassing supervised fine-tuning and preference optimization, aimed at enhancing their capability for seamless interaction with users. Lastly, the report elaborates on the approaches used to adapt the models to specific scenarios, such as embedding, retrieval-augmented generation, and function calling. The Kanana model series spans from 2.1B to 32.5B parameters, with the 2.1B models (base, instruct, embedding) publicly released to promote research on Korean language models.
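Of the pre-training techniques listed, depth up-scaling is the most structural: a trained model is made deeper by duplicating a contiguous span of its transformer layers and then continuing pre-training on the larger stack. The sketch below illustrates the general idea in PyTorch; the layer span and duplication scheme are illustrative assumptions, not the exact recipe used for Kanana.

```python
# Generic sketch of depth up-scaling: grow a trained decoder by duplicating a
# contiguous span of its transformer layers, then continue pre-training.
# Illustrates the general technique only, not Kanana's specific configuration.
import copy

import torch.nn as nn


def depth_upscale(layers: nn.ModuleList, start: int, end: int) -> nn.ModuleList:
    """Duplicate layers[start:end]; the result is deeper by (end - start) layers."""
    duplicated = [copy.deepcopy(layer) for layer in layers[start:end]]
    return nn.ModuleList(list(layers[:end]) + duplicated + list(layers[end:]))


# Toy example: an 8-layer stack grows to 12 layers by repeating layers 2..5.
toy = nn.ModuleList([nn.Linear(16, 16) for _ in range(8)])
upscaled = depth_upscale(toy, start=2, end=6)
print(len(toy), "->", len(upscaled))  # prints: 8 -> 12
```

The appeal of this approach is that the up-scaled model starts from already-trained weights rather than random initialization, so far less additional compute is needed to reach competitive quality at the larger size.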