Paper page - Kimi-VL Technical Report

Collection: https://huggingface.co/collections/moonshotai/kimi-vl-a3b-67f67b6ac91d3b03d382dd85
Space: https://huggingface.co/spaces/moonshotai/Kimi-VL-A3B-Thinking
GitHub: https://github.com/MoonshotAI/Kimi-VL

arxiv:2504.07491

Kimi-VL Technical Report

Published on Apr 10, 2025 · Submitted by Haoning Wu, Teo on Apr 11, 2025 · #1 Paper of the day
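Authors: Kimi Team, Angang Du, Bohong Yin, Bowei Xing, Bowen Qu, Bowen Wang, Cheng Chen, Chenlin Zhang, Chenzhuang Du, Chu Wei, Congcong Wang, et al.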

Abstract

AI-generated summary: Kimi-VL, an efficient Mixture-of-Experts vision-language model, excels in multimodal reasoning, long-context understanding, and diverse vision-language tasks, achieving competitive performance with reduced computational cost.

We present Kimi-VL, an efficient open-source Mixture-of-Experts (MoE) vision-language model (VLM) that offers advanced multimodal reasoning, long-context understanding, and strong agent capabilities, all while activating only 2.8B parameters in its language decoder (Kimi-VL-A3B). Kimi-VL demonstrates strong performance across challenging domains: as a general-purpose VLM, Kimi-VL excels in multi-turn agent tasks (e.g., OSWorld), matching flagship models. Furthermore, it exhibits remarkable capabilities across diverse challenging vision-language tasks, including college-level image and video comprehension, OCR, mathematical reasoning, and multi-image understanding. In comparative evaluations, it effectively competes with cutting-edge efficient VLMs such as GPT-4o-mini, Qwen2.5-VL-7B, and Gemma-3-12B-IT, while surpassing GPT-4o in several key domains. Kimi-VL also advances in processing long contexts and perceiving clearly. With a 128K extended context window, Kimi-VL can process diverse long inputs, achieving impressive scores of 64.5 on LongVideoBench and 35.1 on MMLongBench-Doc. Its native-resolution vision encoder, MoonViT, further allows it to see and understand ultra-high-resolution visual inputs, achieving 83.2 on InfoVQA and 34.5 on ScreenSpot-Pro, while maintaining lower computational cost for common tasks. Building upon Kimi-VL, we introduce an advanced long-thinking variant: Kimi-VL-Thinking. Developed through long chain-of-thought (CoT) supervised fine-tuning (SFT) and reinforcement learning (RL), this model exhibits strong long-horizon reasoning capabilities. It achieves scores of 61.7 on MMMU, 36.8 on MathVision, and 71.3 on MathVista while maintaining the compact 2.8B activated LLM parameters, setting a new standard for efficient multimodal thinking models. Code and models are publicly accessible at https://github.com/MoonshotAI/Kimi-VL.

Community

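This is an automated message from the Librarian Bot (https://huggingface.co/librarian-bots). I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* Shakti-VLMs: Scalable Vision-Language Models for Enterprise AI (2025), https://huggingface.co/papers/2502.17092
* SmolVLM: Redefining small and efficient multimodal models (2025), https://huggingface.co/papers/2504.05299
* FCoT-VL: Advancing Text-oriented Large Vision-Language Models with Efficient Visual Token Compression (2025), https://huggingface.co/papers/2502.18512
* Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources (2025), https://huggingface.co/papers/2504.00595
* M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance (2025), https://huggingface.co/papers/2502.18778
* JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse (2025), https://huggingface.co/papers/2503.16365
* MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions (2025), https://huggingface.co/papers/2503.09499

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend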

Seshathri: [image: IMG-20250629-WA0000.jpg]

Seshathri: What is this

Haoning Wu, Teo (paper author): Please try out our model's visual question answering ability on https://huggingface.co/spaces/moonshotai/Kimi-VL-A3B-Thinking.

Models citing this paper 13

Datasets citing this paper 0

Spaces citing this paper 14

Collections including this paper 24