JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation
Shota Onohara, Atsuyuki Miyai, Yuki Imajuku, Kazuki Egashira, Jeonghun Baek, Xiang Yue, Graham Neubig, Kiyoharu Aizawa
Published: 2024-10-22
AI-generated summary
JMMMU, a Japanese benchmark for Large Multimodal Models, evaluates their performance on expert-level tasks and cultural understanding through culture-agnostic and culture-specific subsets.
Accelerating research on Large Multimodal Models (LMMs) in non-English
languages is crucial for enhancing user experiences across broader populations.
In this paper, we introduce JMMMU (Japanese MMMU), the first large-scale
Japanese benchmark designed to evaluate LMMs on expert-level tasks based on the
Japanese cultural context. To facilitate comprehensive culture-aware
evaluation, JMMMU features two complementary subsets: (i) a culture-agnostic (CA)
subset, in which culture-independent subjects (e.g., Math) are selected and
translated into Japanese, enabling one-to-one comparison with the English
counterpart MMMU; and (ii) a culture-specific (CS) subset, comprising newly
crafted subjects that reflect the Japanese cultural context. Using the CA subset,
we observe a performance drop in many LMMs when evaluated in Japanese, which is
purely attributable to language variation. Using the CS subset, we reveal their
inadequate Japanese cultural understanding. Further, by combining both subsets,
we find that some LMMs perform well on the CA subset but not on the CS
subset, exposing a superficial command of the Japanese language without
genuine cultural understanding. We hope this work will not only help advance
LMM performance in Japanese but also serve as a guideline to create
high-standard, culturally diverse benchmarks for multilingual LMM development.
The project page is https://mmmu-japanese-benchmark.github.io/JMMMU/.
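The two-subset comparison described above can be sketched as a simple scoring routine. This is a hypothetical illustration, not the paper's evaluation code: the record fields (`subset`, `prediction`, `answer`) and the sample data are assumptions for demonstration only.

```python
# Hypothetical sketch of the CA-vs-CS comparison JMMMU enables:
# compute per-subset accuracy, then inspect the gap between the
# culture-agnostic (CA) and culture-specific (CS) subsets.
from collections import defaultdict

def subset_accuracies(results):
    """results: iterable of dicts with 'subset' ('CA' or 'CS'),
    'prediction', and 'answer' keys (illustrative schema)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in results:
        total[r["subset"]] += 1
        correct[r["subset"]] += int(r["prediction"] == r["answer"])
    return {s: correct[s] / total[s] for s in total}

# Toy predictions from a hypothetical model.
results = [
    {"subset": "CA", "prediction": "B", "answer": "B"},
    {"subset": "CA", "prediction": "C", "answer": "C"},
    {"subset": "CA", "prediction": "A", "answer": "D"},
    {"subset": "CS", "prediction": "D", "answer": "B"},
    {"subset": "CS", "prediction": "B", "answer": "B"},
]
acc = subset_accuracies(results)
# A large positive CA-CS gap suggests the model handles Japanese text
# but lacks Japanese cultural knowledge -- the failure mode the paper
# identifies by combining both subsets.
gap = acc["CA"] - acc["CS"]
```

In practice the paper evaluates many LMMs this way; a model with strong CA scores but weak CS scores is the "shallow language understanding" case discussed in the abstract.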