Paper page - xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

https://github.com/salesforce/LAVIS/tree/xgen-mm

Papers
arxiv:2408.08872

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Published on Aug 16, 2024 · Submitted by AK on Aug 19, 2024
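Authors:
Le Xue, Manli Shu, Anas Awadalla, Jun Wang, An Yan, Senthil Purushwalkam, Honglu Zhou, Viraj Prabhu, Yutong Dai, Michael S Ryoo, Shrikant Kendre, Jieyu Zhang, Can Qin, Shu Zhang, Chia-Chih Chen, Ning Yu, Juntao Tan, Tulika Manoj Awalgaonkar, Shelby Heinecke, Huan Wang, Yejin Choi, Ludwig Schmidt, Zeyuan Chen, Silvio Savarese, Juan Carlos Niebles, Caiming Xiong, Ran Xu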

Abstract

xGen-MM, an extension of Salesforce's xGen initiative, provides a framework for developing Large Multimodal Models with pre-trained, instruction-tuned, and safety-tuned variants.

AI-generated summary

This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated datasets, a training recipe, model architectures, and a resulting suite of LMMs. xGen-MM, short for xGen-MultiModal, expands the Salesforce xGen initiative on foundation AI models. Our models undergo rigorous evaluation across a range of tasks, including both single and multi-image benchmarks. Our pre-trained base model exhibits strong in-context learning capabilities and the instruction-tuned model demonstrates competitive performance among open-source LMMs with similar model sizes. In addition, we introduce a safety-tuned model with DPO, aiming to mitigate harmful behaviors such as hallucinations and improve safety. We open-source our models, curated large-scale datasets, and our fine-tuning codebase to facilitate further advancements in LMM research. Associated resources will be available on our project page above.

Community

Paper submitter

The link gives a 404, I assume xgen-mm hasn't been merged yet?

Paper author

Hi, we plan to make the links public today. Since yesterday was the weekend, we need infrastructure's access to turn things public on Monday.

Paper author

Hi, our model/dataset links are live now


Hi,
https://huggingface.co/datasets/Salesforce/blip3-ocr-200m
https://huggingface.co/datasets/Salesforce/blip3-grounding-50m

It's still giving me a 404 error.
Can you please let us know? Thanks in advance. :)

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

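The following papers were recommended by the Semantic Scholar API

* LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models (2024): https://huggingface.co/papers/2407.07895
* CROME: Cross-Modal Adapters for Efficient Multimodal LLM (2024): https://huggingface.co/papers/2408.06610
* MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity (2024): https://huggingface.co/papers/2407.15838
* Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs (2024): https://huggingface.co/papers/2406.16860
* mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models (2024): https://huggingface.co/papers/2408.04840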

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


bounding boxe


Models citing this paper 6


Datasets citing this paper 2

Spaces citing this paper 3

Collections including this paper 25