Authors: Nada Saadi, Tathagata Raha, Clément Christophe, Marco AF Pimentel, Ronnie Rajan, Praveen K Kanithi
Papers
arxiv:2501.09825

Bridging Language Barriers in Healthcare: A Study on Arabic LLMs

Published on Jan 16, 2025
· Submitted by
Clément Christophe
on Jan 20, 2025

Abstract

This paper investigates the challenges of developing large language models (LLMs) proficient in both multilingual understanding and medical knowledge. We demonstrate that simply translating medical data does not guarantee strong performance on clinical tasks in the target language. Our experiments reveal that the optimal language mix in training data varies significantly across different medical tasks. We find that larger models with carefully calibrated language ratios achieve superior performance on native-language clinical tasks. Furthermore, our results suggest that relying solely on fine-tuning may not be the most effective approach for incorporating new language knowledge into LLMs. Instead, data and computationally intensive pretraining methods may still be necessary to achieve optimal performance in multilingual medical settings. These findings provide valuable guidance for building effective and inclusive medical AI systems for diverse linguistic communities.

AI-generated summary

Training large language models with carefully calibrated language ratios in multilingual medical data improves performance on clinical tasks more effectively than fine-tuning alone.
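To make the idea of "carefully calibrated language ratios" concrete, here is a minimal sketch of how a training-data sampler could draw examples according to a fixed Arabic/English mix. This is purely illustrative and is not the paper's actual pipeline; the corpus contents, the 70/30 ratio, and the `sample_batch` helper are all hypothetical.

```python
import random

def sample_batch(corpora, ratios, batch_size, seed=0):
    """Draw a batch whose language mix follows `ratios`.

    corpora: dict mapping language code -> list of training examples
    ratios:  dict mapping language code -> sampling probability (sums to 1)
    Returns a list of (language, example) pairs.
    """
    rng = random.Random(seed)
    languages = list(ratios)
    weights = [ratios[lang] for lang in languages]
    batch = []
    for _ in range(batch_size):
        # Pick a language according to the target ratio, then an example from it.
        lang = rng.choices(languages, weights=weights, k=1)[0]
        batch.append((lang, rng.choice(corpora[lang])))
    return batch

# Hypothetical toy corpora and a hypothetical 70% Arabic / 30% English mix.
corpora = {
    "ar": ["مثال طبي ١", "مثال طبي ٢"],
    "en": ["clinical note A", "clinical note B"],
}
ratios = {"ar": 0.7, "en": 0.3}

batch = sample_batch(corpora, ratios, batch_size=10)
```

In a real setup the ratio itself would be a tuned hyperparameter, which is the point the abstract makes: the best mix varies by task and model size.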

Community

Paper author and submitter

Bridging Language Barriers in Healthcare: A Study on Arabic LLMs.

Paper accepted at the GenAI4Health Workshop @ AAAI 2025

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model links this paper

Cite arxiv.org/abs/2501.09825 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset links this paper

Cite arxiv.org/abs/2501.09825 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space links this paper

Cite arxiv.org/abs/2501.09825 in a Space README.md to link it from this page.
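As a concrete illustration of the linking instructions above, a README.md that mentions the paper's arXiv URL might look like the following (the model name and surrounding text are hypothetical; only the URL matters for the link):

```markdown
# my-arabic-medical-model

This model relates to the paper
[Bridging Language Barriers in Healthcare: A Study on Arabic LLMs](https://arxiv.org/abs/2501.09825).
```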

Collections including this paper 1