https://github.com/dwright37/unstructured-evidence-sunset
This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* [HERA: Improving Long Document Summarization using Large Language Models with Context Packaging and Reordering](https://huggingface.co/papers/2502.00448) (2025)
* [MODS: Moderating a Mixture of Document Speakers to Summarize Debatable Queries in Document Collections](https://huggingface.co/papers/2502.00322) (2025)
* [Unraveling the Capabilities of Language Models in News Summarization](https://huggingface.co/papers/2501.18128) (2025)
* [Context-Aware Hierarchical Merging for Long Document Summarization](https://huggingface.co/papers/2502.00977) (2025)
* [Discourse-Driven Evaluation: Unveiling Factual Inconsistency in Long Document Summarization](https://huggingface.co/papers/2502.06185) (2025)
* [Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles](https://huggingface.co/papers/2501.07718) (2025)
* [Diversity Enhances an LLM's Performance in RAG and Long-context Task](https://huggingface.co/papers/2502.09017) (2025)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out [this Space](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers).

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`
\n","updatedAt":"2025-02-22T01:35:12.014Z","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.7056630849838257},"editors":["librarian-bot"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2502.14409","authors":[{"_id":"67b83a20a9fa331061e84ecd","user":{"_id":"60a643b9213fe60589b8fdf9","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/60a643b9213fe60589b8fdf9/OOXmW3MkSf88r63tAE6-n.jpeg","isPro":false,"fullname":"Dustin Wright","user":"dwright37","type":"user"},"name":"Dustin Wright","status":"claimed_verified","statusLastChangedAt":"2025-02-21T09:58:02.288Z","hidden":false},{"_id":"67b83a20a9fa331061e84ece","user":{"_id":"637e8b1b66ee00bcb2468ed0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1669240174964-637e8b1b66ee00bcb2468ed0.jpeg","isPro":false,"fullname":"Zain","user":"zainmujahid","type":"user"},"name":"Zain Muhammad Mujahid","status":"admin_assigned","statusLastChangedAt":"2025-02-21T15:16:52.600Z","hidden":false},{"_id":"67b83a20a9fa331061e84ecf","name":"Lu Wang","hidden":false},{"_id":"67b83a20a9fa331061e84ed0","user":{"_id":"608918b7df398c3b285ce960","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1621507769190-608918b7df398c3b285ce960.jpeg","isPro":false,"fullname":"Isabelle Augenstein","user":"IAugenstein","type":"user"},"name":"Isabelle Augenstein","status":"admin_assigned","statusLastChangedAt":"2025-02-21T15:17:02.420Z","hidden":false},{"_id":"67b83a20a9fa331061e84ed1","user":{"_id":"63516acdce7cf1fe8a854cdc","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63516acdce7cf1fe8a854cdc/TlOI7iPdG7zJKWyGLoPQN.jpeg","isPro":false,"fullname":"David Jurgens","user":"davidjurgens","type":"user"},"name":"David Jurgens","status":"admin_assigned","statusLastChangedAt":"2025-02-21T15:17:08.686Z","hidden":false}],"publishedAt":"2025-02-20T09:57:42.000Z","submittedOnDailyAt":"2025-02-21T06:03:40.641Z","title":"Unstructured Evidence Attribution for Long Context Query Focused\n Summarization","submittedOnDailyBy":{"_id":"60a643b9213fe60589b8fdf9","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/60a643b9213fe60589b8fdf9/OOXmW3MkSf88r63tAE6-n.jpeg","isPro":false,"fullname":"Dustin Wright","user":"dwright37","type":"user"},"summary":"Large language models (LLMs) are capable of generating coherent summaries\nfrom very long contexts given a user query. Extracting and properly citing\nevidence spans could help improve the transparency and reliability of these\nsummaries. At the same time, LLMs suffer from positional biases in terms of\nwhich information they understand and attend to, which could affect evidence\ncitation. Whereas previous work has focused on evidence citation with\npredefined levels of granularity (e.g. sentence, paragraph, document, etc.), we\npropose the task of long-context query focused summarization with unstructured\nevidence citation. 
We show how existing systems struggle to generate and\nproperly cite unstructured evidence from their context, and that evidence tends\nto be \"lost-in-the-middle\". To help mitigate this, we create the Summaries with\nUnstructured Evidence Text dataset (SUnsET), a synthetic dataset generated\nusing a novel domain-agnostic pipeline which can be used as supervision to\nadapt LLMs to this task. We demonstrate across 5 LLMs of different sizes and 4\ndatasets with varying document types and lengths that LLMs adapted with SUnsET\ndata generate more relevant and factually consistent evidence than their base\nmodels, extract evidence from more diverse locations in their context, and can\ngenerate more relevant and consistent summaries.","upvotes":3,"discussionId":"67b83a21a9fa331061e84f36","githubRepo":"https://github.com/dwright37/unstructured-evidence-sunset","githubRepoAddedBy":"auto","ai_summary":"LMMs adapted with a new synthetic dataset improve evidence citation and summary generation from long contexts by extracting diverse and consistent evidence.","ai_keywords":["large language models","LLMs","query focused summarization","unstructured evidence citation","Summaries with Unstructured Evidence Text dataset","SUnsET","positional biases","evidence citation","document types","lengths","relevance","consistency"],"githubStars":11},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"60a643b9213fe60589b8fdf9","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/60a643b9213fe60589b8fdf9/OOXmW3MkSf88r63tAE6-n.jpeg","isPro":false,"fullname":"Dustin Wright","user":"dwright37","type":"user"},{"_id":"63516acdce7cf1fe8a854cdc","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63516acdce7cf1fe8a854cdc/TlOI7iPdG7zJKWyGLoPQN.jpeg","isPro":false,"fullname":"David Jurgens","user":"davidjurgens","type":"user"},{"_id":"637e8b1b66ee00bcb2468ed0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1669240174964-637e8b1b66ee00bcb2468ed0.jpeg","isPro":false,"fullname":"Zain","user":"zainmujahid","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":0}">
Unstructured Evidence Attribution for Long Context Query Focused Summarization
Dustin Wright, Zain Muhammad Mujahid, Lu Wang, Isabelle Augenstein, David Jurgens
Published on Feb 20, 2025
Abstract
LLMs adapted with a new synthetic dataset improve evidence citation and summary generation from long contexts by extracting diverse and consistent evidence.
Large language models (LLMs) are capable of generating coherent summaries from very long contexts given a user query. Extracting and properly citing evidence spans could help improve the transparency and reliability of these summaries. At the same time, LLMs suffer from positional biases in terms of which information they understand and attend to, which could affect evidence citation. Whereas previous work has focused on evidence citation with predefined levels of granularity (e.g. sentence, paragraph, document, etc.), we propose the task of long-context query focused summarization with unstructured evidence citation. We show how existing systems struggle to generate and properly cite unstructured evidence from their context, and that evidence tends to be "lost-in-the-middle". To help mitigate this, we create the Summaries with Unstructured Evidence Text dataset (SUnsET), a synthetic dataset generated using a novel domain-agnostic pipeline which can be used as supervision to adapt LLMs to this task. We demonstrate across 5 LLMs of different sizes and 4 datasets with varying document types and lengths that LLMs adapted with SUnsET data generate more relevant and factually consistent evidence than their base models, extract evidence from more diverse locations in their context, and can generate more relevant and consistent summaries.
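To make the task format concrete, below is a minimal sketch of query-focused summarization with unstructured evidence citation: prompt a model for a summary plus verbatim evidence spans of arbitrary length, then check each span against the source document. This is an illustration under stated assumptions, not the paper's pipeline; the prompt wording, the `llm_generate` stand-in, and the toy example are all hypothetical, and the SUnsET-based adaptation described in the abstract is not shown.

```python
import json
import re
from typing import Callable


def build_prompt(document: str, query: str) -> str:
    """Ask for a query-focused summary plus evidence spans copied verbatim
    from the document, at no fixed granularity (clause, sentence, or longer).
    The exact wording is an illustrative assumption, not the paper's prompt."""
    return (
        "Read the document below and answer the query with a concise summary.\n"
        "Also return a JSON list of evidence spans copied VERBATIM from the document\n"
        "that support the summary. Spans may be any length.\n\n"
        f"Query: {query}\n\nDocument:\n{document}\n\n"
        'Respond as JSON: {"summary": "...", "evidence": ["...", "..."]}'
    )


def verify_evidence(document: str, evidence: list[str]) -> list[bool]:
    """Check that each cited span occurs verbatim in the source document
    (whitespace-normalized), a simple guard against fabricated citations."""
    norm = lambda s: re.sub(r"\s+", " ", s).strip()
    doc_norm = norm(document)
    return [norm(span) in doc_norm for span in evidence]


def summarize_with_evidence(document: str, query: str,
                            llm_generate: Callable[[str], str]) -> dict:
    """One round of the task. `llm_generate` is a stand-in for whatever
    LLM call you use (hosted API or local model)."""
    raw = llm_generate(build_prompt(document, query))
    out = json.loads(raw)
    out["evidence_verified"] = verify_evidence(document, out.get("evidence", []))
    return out


if __name__ == "__main__":
    # Toy stand-in for an LLM so the sketch runs end to end.
    doc = ("The study tracked reef recovery for ten years. Coral cover rose from 12% "
           "to 31% after fishing restrictions were introduced in 2015.")

    def fake_llm(prompt: str) -> str:
        return json.dumps({
            "summary": "Reef coral cover more than doubled after 2015 fishing restrictions.",
            "evidence": ["Coral cover rose from 12% to 31% after fishing "
                         "restrictions were introduced in 2015."],
        })

    print(summarize_with_evidence(doc, "Did the reef recover?", fake_llm))
```

The sketch only illustrates the input/output format and a verbatim-match check; the paper's contribution is adapting LLMs with SUnsET supervision so that they produce this kind of cited, unstructured evidence more reliably and from more diverse positions in long contexts.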