Contact drluodian@gmail.com if you want to build with us.
[2025-12] 🤞🤞 See the real world in motion | Presenting OneVision Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence.
💻 GitHub | 🤗 Model | 📖 Paper
[2025-11] 🔭🔭 Introducing LongVT: an end-to-end agentic framework for "Thinking with Long Videos" via native tool calling.
💻 GitHub | 🤗 Model and Dataset | 📖 Paper | 📚 Blog | 💻 Demo
[2025-11] 🔥🔥 Introducing OpenMMReasoner: a fully transparent two-stage recipe for multimodal reasoning spanning supervised fine-tuning (SFT) and reinforcement learning (RL).
💻 GitHub | 🤗 Model and Dataset | 📖 Paper | 📚 Blog
[2025-09] 🔥🔥 Introducing LLaVA-OneVision-1.5: a novel family of fully open-source Large Multimodal Models (LMMs) that achieves state-of-the-art performance at substantially lower cost by training on native-resolution images.
💻 GitHub | 🤗 Model and Dataset | 📖 Paper
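For a quick try, the instruct checkpoints are meant to be used like any chat-style vision-language model. The snippet below is a minimal inference sketch under the assumption that the lmms-lab/LLaVA-OneVision-1.5-8B-Instruct repository works with the standard transformers AutoProcessor / AutoModelForImageTextToText entry points and ships a chat template; consult the model card or GitHub README for the officially supported loading path.

```python
# Minimal inference sketch (assumes standard transformers auto-class support;
# see the model card for the exact, supported API).
import requests
import torch
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "lmms-lab/LLaVA-OneVision-1.5-8B-Instruct"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

# Any RGB image works; this URL is just an example.
url = "https://llava-vl.github.io/static/images/view.jpg"
image = Image.open(requests.get(url, stream=True).raw)

messages = [
    {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "Describe this image."}]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```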
[2025-09] 🔥🔥 Introducing LLaVA-Critic-R1: a family of generative critic VLMs trained with GRPO on pairwise critic data. LLaVA-Critic-R1 not only demonstrates strong critic capability but also achieves SoTA policy performance at the 7B scale.
💻 GitHub | 🤗 Model and Dataset | 📖 Paper
[2025-04] 🔈🔈 Introducing Aero-1-Audio: a compact audio model adept at a wide range of audio tasks, including speech recognition, audio understanding, and audio instruction following.
📚 Blog | 🤗 Model Checkpoints | 📖 Evaluation Results | 📚 Cookbook
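Getting the checkpoint loaded is meant to be a two-line affair; the sketch below covers only processor and model setup, assuming the repository exposes trust_remote_code auto-class entry points. The full audio-chat call that pairs an audio clip with a text instruction is model-specific, so follow the Cookbook linked above.

```python
# Loading sketch only (assumed auto-class entry points; the Cookbook has the full recipe).
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "lmms-lab/Aero-1-Audio"  # compact audio-language checkpoint
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

# From here: build a chat message that pairs an audio clip (e.g. a 16 kHz waveform)
# with a text instruction such as "Transcribe this recording", run it through the
# processor, and call model.generate(); the exact processor arguments are model-specific.
```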
[2025-03] 👓👓 Introducing EgoLife: Towards Egocentric Life Assistant. For one week, six individuals lived together, capturing every moment through AI glasses and creating the EgoLife dataset. Building on this dataset, we develop models and benchmarks to drive the future of AI life assistants capable of recalling past events, tracking habits, and providing personalized, long-context assistance to enhance daily life.
Homepage | GitHub | Blog | Paper | Demo
[2025-01] 🎬🎬 Introducing Video-MMMU: Evaluating Knowledge Acquisition from Professional Videos. Spanning six professional disciplines (Art, Business, Science, Medicine, Humanities, and Engineering) and 30 diverse subjects, Video-MMMU challenges models to learn and apply college-level knowledge from videos.
Homepage | GitHub | Paper
[2024-11] 🔔🔔 We are excited to introduce LMMs-Eval/v0.3.0, focusing on audio understanding. Building upon LMMs-Eval/v0.2.0, we have added audio models and tasks. Now, LMMs-Eval provides a consistent evaluation toolkit across image, video, and audio modalities.
GitHub | Documentation
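A typical run is a single command in which flags select the model backend, the checkpoint, and the task list; image, video, and audio tasks can be mixed in one invocation. The snippet below wraps that CLI from Python as an illustration; the model alias and task names are placeholders, so take the exact values from the documentation linked above.

```python
# Illustrative lmms-eval invocation (flag names follow the documented CLI;
# the model alias and task names below are examples, not a fixed recipe).
import subprocess

subprocess.run(
    [
        "python", "-m", "lmms_eval",
        "--model", "llava_onevision",                                       # example model backend
        "--model_args", "pretrained=lmms-lab/llava-onevision-qwen2-7b-ov",  # example checkpoint
        "--tasks", "mme,videomme",                                          # image + video tasks in one run
        "--batch_size", "1",
        "--output_path", "./logs/",
    ],
    check=True,
)
```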
[2024-11] 🤯🤯 We introduce Multimodal SAE, the first framework designed to interpret learned features in large-scale multimodal models using Sparse Autoencoders. Through our approach, we leverage LLaVA-OneVision-72B to analyze and explain the SAE-derived features of LLaVA-NeXT-LLaMA3-8B. Furthermore, we demonstrate the ability to steer model behavior by clamping specific features to alleviate hallucinations and avoid safety-related issues.
GitHub | Paper
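The steering idea is easiest to see in toy form: encode a hidden state with the SAE, clamp one feature's activation (for example, to zero), and decode back into the residual stream. The numpy sketch below illustrates just that clamping step with random weights; it is not the released Multimodal-SAE code, which hooks a trained SAE into LLaVA-NeXT-LLaMA3-8B during generation.

```python
# Toy illustration of steering by clamping a sparse-autoencoder (SAE) feature.
# Weights are random stand-ins for a trained SAE; the real pipeline applies this
# to hidden states captured from an LMM layer during decoding.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 16, 64                    # hidden size and (overcomplete) SAE width
W_enc = rng.normal(size=(d_model, d_sae))  # encoder weights
W_dec = rng.normal(size=(d_sae, d_model))  # decoder weights
b_enc = np.zeros(d_sae)

def sae_features(h: np.ndarray) -> np.ndarray:
    """Encode a hidden state into sparse (ReLU) feature activations."""
    return np.maximum(h @ W_enc + b_enc, 0.0)

def steer(h: np.ndarray, feature_idx: int, value: float) -> np.ndarray:
    """Reconstruct h with one SAE feature clamped to a fixed value."""
    f = sae_features(h)
    f[feature_idx] = value                 # e.g. clamp a hallucination-related feature to 0
    return f @ W_dec                       # decode back into the residual stream

h = rng.normal(size=d_model)               # a hidden state from the model
h_steered = steer(h, feature_idx=7, value=0.0)
print(np.linalg.norm(h - h_steered))       # the clamp changes the representation
```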
[2024-10] 🔥🔥 We present LLaVA-Critic, the first open-source large multimodal model designed as a generalist evaluator for assessing LMM-generated responses across diverse multimodal tasks and scenarios.
GitHub | Blog
[2024-10] 🎬🎬 Introducing LLaVA-Video, a family of open large multimodal models designed specifically for advanced video understanding. We're open-sourcing LLaVA-Video-178K, a high-quality, synthetic dataset for video instruction tuning.
GitHub | Blog
[2024-08] 🤞🤞 We present LLaVA-OneVision, a family of LMMs developed by consolidating insights into data, models, and visual representations.
GitHub | Blog
[2024-06] 🧑‍🎨🧑‍🎨 We release LLaVA-NeXT-Interleave, an LMM extending capabilities to real-world settings: Multi-image, Multi-frame (videos), Multi-view (3D), and Multi-patch (single-image).
GitHub | Blog
[2024-06] 🚀🚀 We release LongVA, a long-context multimodal model with state-of-the-art video understanding performance.
GitHub | Blog
Older Updates (2024-06 and earlier)
[2024-06] 🎬🎬 The lmms-eval/v0.2 toolkit now supports video evaluations for models like LLaVA-NeXT Video and Gemini 1.5 Pro.
GitHub | Blog
[2024-05] 🚀🚀 We release LLaVA-NeXT Video, a model performing at Google's Gemini level on video understanding tasks.
GitHub | Blog
[2024-05] 🚀🚀 The LLaVA-NeXT model family reaches near GPT-4V performance on multimodal benchmarks, with models up to 110B parameters.
GitHub | Blog
[2024-03] We release lmms-eval, a toolkit for holistic evaluations with 50+ multimodal datasets and 10+ models.
GitHub | Blog
"likes":0,"isLikedByUser":false,"isBenchmark":false},{"author":"lmms-lab","downloads":3,"gated":false,"id":"lmms-lab/PUBMEDQA","lastModified":"2025-09-18T05:42:40.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":500,"libraries":["datasets","pandas","mlcroissant","polars"],"formats":["parquet"],"modalities":["text"]},"private":false,"repoType":"dataset","likes":0,"isLikedByUser":false,"isBenchmark":false},{"author":"lmms-lab","downloads":10,"gated":false,"id":"lmms-lab/MEDMCQA","lastModified":"2025-09-18T05:41:03.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":4183,"libraries":["datasets","pandas","mlcroissant","polars"],"formats":["parquet"],"modalities":["text"]},"private":false,"repoType":"dataset","likes":0,"isLikedByUser":false,"isBenchmark":false}],"models":[{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":3456,"gated":false,"id":"lmms-lab/LLaVA-OneVision-1.5-4B-Instruct","availableInferenceProviders":[],"lastModified":"2026-02-06T15:48:30.000Z","likes":17,"pipeline_tag":"image-text-to-text","private":false,"repoType":"model","isLikedByUser":false,"widgetOutputUrls":[],"numParameters":4741610528},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":3880,"gated":false,"id":"lmms-lab/BAGEL-7B-MoT-ver.LE","availableInferenceProviders":[],"lastModified":"2025-12-08T08:43:40.000Z","likes":1,"pipeline_tag":"text-generation","private":false,"repoType":"model","isLikedByUser":false,"widgetOutputUrls":[],"numParameters":14691079811},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":37817,"gated":false,"id":"lmms-lab/LLaVA-OneVision-1.5-8B-Instruct","availableInferenceProviders":[],"lastModified":"2025-10-21T05:53:35.000Z","likes":62,"pipeline_tag":"image-text-to-text","private":false,"repoType":"model","isLikedByUser":false,"widgetOutputUrls":[],"numParameters":8527214624},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":2059,"gated":false,"id":"lmms-lab/LLaVA-OneVision-1.5-4B-Base","availableInferenceProviders":[],"lastModified":"2025-10-05T07:09:03.000Z","likes":1,"pipeline_tag":"image-text-to-text","private":false,"repoType":"model","isLikedByUser":false,"widgetOutputUrls":[],"numParameters":4741610528},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"l
mms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":47,"gated":false,"id":"lmms-lab/LLaVA-OneVision-1.5-8B-Base","availableInferenceProviders":[],"lastModified":"2025-09-30T03:21:07.000Z","likes":1,"pipeline_tag":"image-text-to-text","private":false,"repoType":"model","isLikedByUser":false,"widgetOutputUrls":[],"numParameters":8527214624},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":13,"gated":false,"id":"lmms-lab/LLaVA-OneVision-1.5-4B-stage0","availableInferenceProviders":[],"lastModified":"2025-09-30T02:10:29.000Z","likes":1,"private":false,"repoType":"model","isLikedByUser":false,"numParameters":4352654368},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":5,"gated":false,"id":"lmms-lab/LLaVA-OneVision-1.5-8B-stage0","availableInferenceProviders":[],"lastModified":"2025-09-30T02:09:55.000Z","likes":2,"private":false,"repoType":"model","isLikedByUser":false,"numParameters":8527214624},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":0,"gated":false,"id":"lmms-lab/LLaVA-Critic-R1-7B-LLaMA32v","availableInferenceProviders":[],"lastModified":"2025-08-28T18:48:25.000Z","likes":0,"private":false,"repoType":"model","isLikedByUser":false,"numParameters":10670220835},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":2,"gated":false,"id":"lmms-lab/LLaVA-Critic-R1-7B-Plus-Mimo","availableInferenceProviders":[],"lastModified":"2025-08-28T18:44:16.000Z","likes":1,"private":false,"repoType":"model","isLikedByUser":false,"numParameters":8306217216},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"downloads":1,"gated":false,"id":"lmms-lab/MMSearch-R1-7B-0807","availableInferenceProviders":[],"lastModified":"2025-08-07T10:10:32.000Z","likes":0,"private":false,"repoType":"model","isLikedByUser":false,"numParameters":8292166656}],"paperPreviews":[{"_id":"2602.08683","title":"OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal 
Intelligence","id":"2602.08683","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2602.08683.png"},{"_id":"2511.20785","title":"LongVT: Incentivizing \"Thinking with Long Videos\" via Native Tool Calling","id":"2511.20785","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2511.20785.png"}],"spaces":[{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"colorFrom":"green","colorTo":"indigo","createdAt":"2024-07-09T22:20:54.000Z","emoji":"🥇","id":"lmms-lab/LiveBench","lastModified":"2024-10-16T09:14:08.000Z","likes":21,"pinned":true,"private":false,"sdk":"gradio","repoType":"space","runtime":{"stage":"RUNTIME_ERROR","hardware":{"current":null,"requested":"cpu-basic"},"storage":null,"gcTimeout":172800,"errorMessage":"Exit code: 1. Reason: e \"/usr/local/lib/python3.10/site-packages/datasets/load.py\", line 1063, in get_module\n data_files = DataFilesDict.from_patterns(\n File \"/usr/local/lib/python3.10/site-packages/datasets/data_files.py\", line 721, in from_patterns\n else DataFilesList.from_patterns(\n File \"/usr/local/lib/python3.10/site-packages/datasets/data_files.py\", line 624, in from_patterns\n resolve_pattern(\n File \"/usr/local/lib/python3.10/site-packages/datasets/data_files.py\", line 388, in resolve_pattern\n for filepath, info in fs.glob(pattern, detail=True, **glob_kwargs).items()\n File \"/usr/local/lib/python3.10/site-packages/huggingface_hub/hf_file_system.py\", line 409, in glob\n return super().glob(path, **kwargs)\n File \"/usr/local/lib/python3.10/site-packages/fsspec/spec.py\", line 611, in glob\n allpaths = self.find(root, maxdepth=depth, withdirs=True, detail=True, **kwargs)\n File \"/usr/local/lib/python3.10/site-packages/huggingface_hub/hf_file_system.py\", line 422, in find\n return super().find(\n File \"/usr/local/lib/python3.10/site-packages/fsspec/spec.py\", line 502, in find\n out[path] = self.info(path)\n File \"/usr/local/lib/python3.10/site-packages/huggingface_hub/hf_file_system.py\", line 532, in info\n paths_info = self._api.get_paths_info(\n File \"/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py\", line 114, in _inner_fn\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.10/site-packages/huggingface_hub/hf_api.py\", line 3237, in get_paths_info\n hf_raise_for_status(response)\n File \"/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py\", line 477, in hf_raise_for_status\n raise _format(HfHubHTTPError, str(e), response) from e\nhuggingface_hub.errors.HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/datasets/lmms-lab/LiveBenchResults/paths-info/ec0ecf261e39ee25d7b2734680a0c63b7d295e76 (Request ID: Root=1-67c72d8f-3e38f933782cd5454f28b206;ad1d61a2-27ce-4457-a931-b836146f8c44)\n\nprocess spawn timed out after 10s\n","replicas":{"requested":1},"devMode":false,"domains":[{"domain":"lmms-lab-livebench.hf.space","stage":"READY"}]},"title":"LiveBench","isLikedByUser":false,"originRepo":{"name":"demo-leaderboard-backend/leaderboard","author":{"_id":"655dbd8360009b03e4451217","avatarUrl":"https://www.gravatar.com/avatar/48236a8e5b71950f0708b3f2e3e7925f?d=retro&size=100","fullname":"Demo leaderboard with an integrated 
backend","name":"demo-leaderboard-backend","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":22,"isUserFollowing":false}},"trendingScore":0,"tags":["gradio","region:us"],"featured":false},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"colorFrom":"indigo","colorTo":"red","createdAt":"2025-09-24T05:57:21.000Z","emoji":"📉","id":"lmms-lab/LLaVA-OneVision-1.5","lastModified":"2025-09-30T02:12:55.000Z","likes":12,"pinned":false,"private":false,"sdk":"gradio","repoType":"space","runtime":{"stage":"RUNNING","hardware":{"current":"cpu-basic","requested":"cpu-basic"},"storage":null,"gcTimeout":172800,"replicas":{"current":1,"requested":1},"devMode":false,"domains":[{"domain":"lmms-lab-llava-onevision-1-5.hf.space","stage":"READY"}],"sha":"5168dc4afd0e760e1a11b3930cc759bd9e4caa21"},"title":"LLaVA OneVision 1.5","isLikedByUser":false,"ai_short_description":"Interact with a multimodal chatbot using text and images","ai_category":"Chatbots","trendingScore":0,"tags":["gradio","region:us"],"featured":false},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"colorFrom":"yellow","colorTo":"purple","createdAt":"2025-04-29T13:53:02.000Z","emoji":"💬","id":"lmms-lab/Aero-1-Audio-Demo","lastModified":"2025-05-03T17:43:01.000Z","likes":43,"pinned":false,"private":false,"sdk":"gradio","repoType":"space","runtime":{"stage":"RUNTIME_ERROR","hardware":{"current":null,"requested":"zero-a10g"},"storage":null,"gcTimeout":172800,"errorMessage":"Exit code: 1. Reason: s]\u001b[A\rprocessor_config.json: 100%|██████████| 145/145 [00:00<00:00, 976kB/s]\n\n\rchat_template.json: 0%| | 0.00/1.17k [00:00, ?B/s]\u001b[A\rchat_template.json: 100%|██████████| 1.17k/1.17k [00:00<00:00, 8.65MB/s]\n\n\rchat_template.jinja: 0%| | 0.00/658 [00:00, ?B/s]\u001b[A\rchat_template.jinja: 100%|██████████| 658/658 [00:00<00:00, 5.55MB/s]\n\n\rprocessing_aero.py: 0%| | 0.00/10.4k [00:00, ?B/s]\u001b[A\rprocessing_aero.py: 100%|██████████| 10.4k/10.4k [00:00<00:00, 44.1MB/s]\nA new version of the following files was downloaded from https://huggingface.co/lmms-lab/Aero-1-Audio-1.5B:\n- processing_aero.py\n. Make sure to double-check they do not contain any added malicious code. 
To avoid downloading new versions of the code file, you can pin a revision.\nTraceback (most recent call last):\n File \"/home/user/app/app.py\", line 137, in \n processor = AutoProcessor.from_pretrained(\"lmms-lab/Aero-1-Audio-1.5B\", trust_remote_code=True)\n File \"/usr/local/lib/python3.10/site-packages/transformers/models/auto/processing_auto.py\", line 339, in from_pretrained\n processor_class = get_class_from_dynamic_module(\n File \"/usr/local/lib/python3.10/site-packages/transformers/dynamic_module_utils.py\", line 570, in get_class_from_dynamic_module\n return get_class_in_module(class_name, final_module, force_reload=force_download)\n File \"/usr/local/lib/python3.10/site-packages/transformers/dynamic_module_utils.py\", line 267, in get_class_in_module\n module_spec.loader.exec_module(module)\n File \"\", line 883, in exec_module\n File \"\", line 241, in _call_with_frames_removed\n File \"/home/user/.cache/huggingface/modules/transformers_modules/lmms-lab/Aero-1-Audio-1.5B/2efba6e9de291fa30e11075849631ae410b8485a/processing_aero.py\", line 26, in \n from transformers.video_utils import VideoInput\nModuleNotFoundError: No module named 'transformers.video_utils'\n","replicas":{"requested":1},"devMode":false,"domains":[{"domain":"lmms-lab-aero-1-audio-demo.hf.space","stage":"READY"}]},"shortDescription":"Demo for Aero-1-Audio","title":"Aero 1 Audio Demo","isLikedByUser":false,"ai_short_description":"Generate text from audio and prompt","ai_category":"Text Generation","trendingScore":0,"tags":["gradio","region:us"],"featured":false},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"colorFrom":"yellow","colorTo":"purple","createdAt":"2025-03-03T02:36:17.000Z","emoji":"💬","id":"lmms-lab/Multimodal-SAE","lastModified":"2025-03-14T11:18:44.000Z","likes":9,"pinned":false,"private":false,"sdk":"gradio","repoType":"space","runtime":{"stage":"SLEEPING","hardware":{"current":null,"requested":"zero-a10g"},"storage":null,"gcTimeout":172800,"replicas":{"requested":1},"devMode":false,"domains":[{"domain":"kcz358-multimodal-sae.hf.space","stage":"READY"},{"domain":"lmms-lab-multimodal-sae.hf.space","stage":"READY"}]},"shortDescription":"Demo for Multimodal-SAE","title":"Multimodal SAE","isLikedByUser":false,"ai_short_description":"Explore and manipulate image features to visualize and influence model outputs","ai_category":"Image Analysis","trendingScore":0,"tags":["gradio","region:us"],"featured":false},{"author":"lmms-lab","authorData":{"_id":"6583eb89bed3689928f5d845","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62d3f7d84b0933c48f3cdd9c/0sliNO9xGhOjVWw20A1Ge.png","fullname":"LMMs-Lab","name":"lmms-lab","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":774,"isUserFollowing":false},"colorFrom":"indigo","colorTo":"red","createdAt":"2024-07-11T02:46:04.000Z","emoji":"😻","id":"lmms-lab/LLaVA-NeXT-Interleave-Demo","lastModified":"2024-07-24T23:54:16.000Z","likes":51,"pinned":false,"private":false,"sdk":"gradio","repoType":"space","runtime":{"stage":"RUNTIME_ERROR","hardware":{"current":null,"requested":"zero-a10g"},"storage":null,"gcTimeout":172800,"errorMessage":"Exit code: 1. 
Reason: SchemaHandler = current_handler,\n File \"/usr/local/lib/python3.10/site-packages/pydantic/json_schema.py\", line 1823, in definitions_schema\n The name of the argument.\n File \"/usr/local/lib/python3.10/site-packages/pydantic/json_schema.py\", line 553, in generate_inner\n def new_handler_func(\n File \"/usr/local/lib/python3.10/site-packages/pydantic/_internal/_schema_generation_shared.py\", line 37, in __call__\n def __call__(self, core_schema: CoreSchemaOrField, /) -> JsonSchemaValue:\n File \"/usr/local/lib/python3.10/site-packages/pydantic/json_schema.py\", line 527, in new_handler_func\n # similar to typing issue in _update_class_schema when we're working with callable js extra\n File \"/usr/local/lib/python3.10/site-packages/pydantic/main.py\", line 669, in __get_pydantic_json_schema__\n obj: The object containing string data to validate.\n File \"/usr/local/lib/python3.10/site-packages/pydantic/_internal/_schema_generation_shared.py\", line 37, in __call__\n def __call__(self, core_schema: CoreSchemaOrField, /) -> JsonSchemaValue:\n File \"/usr/local/lib/python3.10/site-packages/pydantic/json_schema.py\", line 527, in new_handler_func\n # similar to typing issue in _update_class_schema when we're working with callable js extra\n File \"/usr/local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py\", line 223, in modify_model_json_schema\n File \"/usr/local/lib/python3.10/site-packages/pydantic/dataclasses.py\", line 14, in \n from ._internal import _dataclasses as _pydantic_dataclasses\n File \"/usr/local/lib/python3.10/site-packages/pydantic/_internal/_dataclasses.py\", line 30, in \n from ._utils import LazyClassAttribute\nImportError: cannot import name 'LazyClassAttribute' from 'pydantic._internal._utils' (/usr/local/lib/python3.10/site-packages/pydantic/_internal/_utils.py)\n/usr/local/lib/python3.10/site-packages/gradio/analytics.py:106: UserWarning: IMPORTANT: You are using gradio version 4.35.0, however version 4.44.1 is available, please upgrade. 
\n--------\n print(\n","replicas":{"requested":1},"devMode":false,"domains":[{"domain":"lmms-lab-llava-next-interleave-demo.hf.space","stage":"READY"}]},"title":"LLaVA-NeXT-Interleave-Demo","isLikedByUser":false,"trendingScore":0,"tags":["gradio","region:us"],"featured":false}],"buckets":[],"numBuckets":0,"numDatasets":165,"numModels":61,"numSpaces":7,"lastOrgActivities":[{"time":"2026-02-18T13:39:39.057Z","user":"Chunyuan24","userAvatarUrl":"/avatars/430560ec2c2547f819225769ab432f30.svg","type":"paper","paper":{"id":"2602.12279","title":"UniT: Unified Multimodal Chain-of-Thought Test-time Scaling","publishedAt":"2026-02-12T18:59:49.000Z","upvotes":19,"isUpvotedByUser":true}},{"time":"2026-02-17T15:50:44.095Z","user":"xiangan","userAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6478679d7b370854241b2ad8/dBczWYYdfEt9tQcnVGhQk.jpeg","type":"paper","paper":{"id":"2602.08683","title":"OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence","publishedAt":"2026-02-09T14:06:17.000Z","upvotes":45,"isUpvotedByUser":true}},{"time":"2026-02-16T03:40:27.485Z","user":"yiyexy","userAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/655c70d331c4978366d4b2e6/X-KjTNkxtzeYu9ngBOh_C.jpeg","type":"paper-daily","paper":{"id":"2602.08683","title":"OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2602.08683.png","upvotes":45,"publishedAt":"2026-02-09T14:06:17.000Z","isUpvotedByUser":true}}],"acceptLanguages":["*"],"canReadRepos":false,"canReadSpaces":false,"blogPosts":[],"currentRepoPage":0,"filters":{},"paperView":false}">
GitHub | Blog
[2024-06] 🧑‍🎨🧑‍🎨 We release LLaVA-NeXT-Interleave, an LMM extending capabilities to real-world settings: Multi-image, Multi-frame (videos), Multi-view (3D), and Multi-patch (single-image). A schematic of an interleaved multi-image prompt follows the links below.
GitHub | Blog
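For readers new to interleaved prompting, the sketch below shows one way such an input can be laid out: text and image placeholders alternate in a single sequence, and the images are supplied in the same order. The `<image>` placeholder convention follows the LLaVA family of codebases, but the exact conversation template and loading path for the Interleave checkpoints live in the LLaVA-NeXT repository, so treat this as a schematic rather than the official API.

```python
# Schematic of an interleaved multi-image prompt in the LLaVA style.
# Assumptions: "<image>" is the image placeholder token and images are
# consumed in the order their placeholders appear; consult the LLaVA-NeXT
# repo for the exact template used by the Interleave checkpoints.
from dataclasses import dataclass
from typing import List


@dataclass
class InterleavedPrompt:
    text: str               # prompt text containing one "<image>" per image
    image_paths: List[str]  # images, in the order their placeholders appear

    def validate(self) -> None:
        n_placeholders = self.text.count("<image>")
        if n_placeholders != len(self.image_paths):
            raise ValueError(
                f"{n_placeholders} <image> placeholders but "
                f"{len(self.image_paths)} images"
            )


prompt = InterleavedPrompt(
    text=(
        "Here is the first frame: <image>\n"
        "Here is a later frame: <image>\n"
        "Describe what changed between the two frames."
    ),
    image_paths=["frame_000.jpg", "frame_120.jpg"],
)
prompt.validate()  # passes: 2 placeholders, 2 images
```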
[2024-06] 🚀🚀 We release LongVA, a long-context large multimodal model with state-of-the-art video understanding performance.
GitHub | Blog
Older Updates (2024-06 and earlier)
[2024-06] 🎬🎬 The lmms-eval/v0.2 toolkit now supports video evaluations for models like LLaVA-NeXT Video and Gemini 1.5 Pro.
GitHub | Blog
[2024-05] 🚀🚀 We release LLaVA-NeXT Video, a model performing at Google's Gemini level on video understanding tasks.
GitHub | Blog
[2024-05] 🚀🚀 The LLaVA-NeXT model family reaches near GPT-4V performance on multimodal benchmarks, with models up to 110B parameters.
GitHub | Blog
[2024-03] We release lmms-eval, a toolkit for holistic evaluation across 50+ multimodal datasets and 10+ models; a minimal invocation sketch follows the links below.
GitHub | Blog
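As a rough illustration of how an lmms-eval run is launched, the sketch below shells out to the toolkit's command-line entry point. The model adapter name, checkpoint, and task identifier used here (`llava`, `liuhaotian/llava-v1.5-7b`, `mme`) are placeholders modeled on the project's documented usage; check the lmms-eval documentation for the flags and task names supported by the version you have installed.

```python
# Minimal sketch of launching an lmms-eval evaluation from Python by
# invoking the CLI. Flags, model adapter, and task names are assumptions
# based on the project's README-style usage, not a verified reference.
import subprocess

cmd = [
    "python", "-m", "lmms_eval",
    "--model", "llava",                                     # model adapter name (assumed)
    "--model_args", "pretrained=liuhaotian/llava-v1.5-7b",  # checkpoint to evaluate (assumed)
    "--tasks", "mme",                                       # benchmark identifier (assumed)
    "--batch_size", "1",
    "--log_samples",                                        # keep per-sample outputs for inspection
    "--output_path", "./logs/",
]

# Run the evaluation; aggregate results and per-sample logs land under ./logs/.
subprocess.run(cmd, check=True)
```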