We can create a simple Gradio app here in a Space where a user can pass their Gemini API key and things would work. Nice idea, but alternatively try adding paperbanana in Claude Code/Cursor. MCP server/skills are supported now.
Disclaimer: This is not an official implementation, but I tried to implement it as close as possible. Just couldn't mimic those ~230 examples the research team added. However, with the help of the open-source community, we can add even more examples from diverse backgrounds (e.g., biology, AI research, math papers) and a nice retrieval mechanism to surpass the performance reported in the paper!
More on this here - https://github.com/llmsresearch/paperbanana/wiki
Check on project page a sample output for the quality check. Will add some more examples for quick validation.
@burtenshaw here you go! Created a playground to try it. Just bring your own Gemini API key and test it directly in the space below. Right now it uses the Gemini 2.0 Flash model. I'll add an option to switch models soon, but this is a great place to start experimenting.
Try it here: - https://huggingface.co/spaces/dippatel1994/paperbanana
Abstract
PaperBanana is an agentic framework that automates the creation of publication-ready academic illustrations using advanced vision-language models and image generation techniques.
Despite rapid advances in autonomous AI scientists powered by language models, generating publication-ready illustrations remains a labor-intensive bottleneck in the research workflow. To lift this burden, we introduce PaperBanana, an agentic framework for automated generation of publication-ready academic illustrations. Powered by state-of-the-art VLMs and image generation models, PaperBanana orchestrates specialized agents to retrieve references, plan content and style, render images, and iteratively refine via self-critique. To rigorously evaluate our framework, we introduce PaperBananaBench, comprising 292 test cases for methodology diagrams curated from NeurIPS 2025 publications, covering diverse research domains and illustration styles. Comprehensive experiments demonstrate that PaperBanana consistently outperforms leading baselines in faithfulness, conciseness, readability, and aesthetics. We further show that our method effectively extends to the generation of high-quality statistical plots. Collectively, PaperBanana paves the way for the automated generation of publication-ready illustrations.
Community
PaperBanana automates publication-ready AI research illustrations via an agentic framework using VLMs and image models, orchestrating reference retrieval, planning, rendering, and self-critique with a benchmarking suite.
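For anyone trying to picture the loop described above (retrieve references, plan, render, self-critique), here is a rough sketch of how such an orchestration could be wired together. To be clear, this is not the paper's code: every function name, the word-overlap retriever, and the stubbed "VLM" and "image model" calls are assumptions made up for illustration, and a real system would replace them with actual model calls.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """One candidate illustration plus the critiques raised against it."""
    plan: str
    image: str  # placeholder string standing in for rendered image data
    critiques: list = field(default_factory=list)


def retrieve_references(caption, corpus, k=2):
    """Hypothetical retriever: rank reference captions by word overlap."""
    words = set(caption.lower().split())
    ranked = sorted(
        corpus,
        key=lambda ref: len(words & set(ref.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def plan(caption, references):
    """Stand-in for a VLM call that plans content and style."""
    return f"Diagram for '{caption}' styled after {len(references)} reference(s)"


def render(plan_text, feedback):
    """Stand-in for an image-generation call; returns a text tag, not pixels."""
    return f"<image: {plan_text} | revisions applied: {len(feedback)}>"


def critique(image, round_index, max_issues=2):
    """Stand-in critic: raises one issue per round until max_issues is reached."""
    if round_index < max_issues:
        return f"issue {round_index + 1}: tighten labels"
    return None  # None means the critic is satisfied


def generate_illustration(caption, corpus, max_rounds=3):
    """Retrieve, plan, render, then refine until the critic passes or rounds run out."""
    references = retrieve_references(caption, corpus)
    plan_text = plan(caption, references)
    draft = Draft(plan=plan_text, image=render(plan_text, []))
    for round_index in range(max_rounds):
        issue = critique(draft.image, round_index)
        if issue is None:
            break
        draft.critiques.append(issue)
        draft.image = render(plan_text, draft.critiques)
    return draft


if __name__ == "__main__":
    corpus = [
        "encoder decoder pipeline diagram",
        "bar chart of accuracy",
        "agent pipeline overview",
    ]
    print(generate_illustration("agent pipeline for figures", corpus).image)
```

The cap on refinement rounds is a design choice in this sketch: it keeps the loop from running forever if the critic never signs off.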
This is excellent, I never considered science illustrations as a use-case for image gen models, but it makes total sense and I can see this applying to technical blogging as well.
Interestingly, I had to design a similar pipeline for illustrating games. We're a game studio trying to play "research lab" to push our frontiers, and the need to create structured illustrations at scale, with precision, seems to be a shared objective here.
We're just learning how to write up our results in a more "scientific" way besides "comments.md", and this is a helpful piece of the puzzle.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility (2026)
- APEX: Academic Poster Editing Agentic Expert (2026)
- SciFig: Towards Automating Scientific Figure Generation (2026)
- SlidesGen-Bench: Evaluating Slides Generation via Computational and Quantitative Metrics (2026)
- ProImage-Bench: Rubric-Based Evaluation for Professional Image Generation (2025)
- ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement (2025)
- Unified Thinker: A General Reasoning Modular Core for Image Generation (2026)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space.
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Amazing work!
It would be very cool to get a Space demo of PaperBanana so we can understand and try out how the VLMs and generators are orchestrated.
someone should take inspiration from the twitter account "science diagrams that look like s***posts" and make a meme generator using this.
Released unofficial implementation with MCP server support: https://github.com/llmsresearch/paperbanana
We can use it until we have an official version from the Google research team.
Stunned! This is a huge tool for scientists! I needed this myself, haha. I hate doing academic illustrations manually.
MCP server & skills support is available now. To configure the paperbanana MCP server, run: uvx --from "paperbanana[mcp]" paperbanana-mcp
Or, if using Claude: claude mcp add paperbanana -e GOOGLE_API_KEY=your-key -- uvx --from "paperbanana[mcp]" paperbanana-mcp
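If you would rather script the registration than paste it by hand, a small helper along these lines might work. It only reassembles the Claude command quoted in the comment above; the helper name build_mcp_add_command is made up for this sketch, and reading the key from the GOOGLE_API_KEY environment variable is an assumption about how you store it.

```python
import os
import shlex


def build_mcp_add_command(api_key):
    """Return the `claude mcp add` invocation from the comment as an argv list."""
    return [
        "claude", "mcp", "add", "paperbanana",
        "-e", f"GOOGLE_API_KEY={api_key}",
        "--",  # everything after this is the server launch command
        "uvx", "--from", "paperbanana[mcp]", "paperbanana-mcp",
    ]


if __name__ == "__main__":
    # Falls back to the placeholder from the comment when no key is set.
    key = os.environ.get("GOOGLE_API_KEY", "your-key")
    # shlex.join quotes arguments so the printed line is safe to paste into a shell.
    print(shlex.join(build_mcp_add_command(key)))
```

Keeping the command as a list avoids shell-quoting trouble around the square brackets in paperbanana[mcp].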