LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
Authors: Di Wu, Hongwei Wang, Wenhao Yu, Yuwei Zhang, Kai-Wei Chang, Dong Yu

Published: October 14, 2024 · arXiv: 2410.10813 · Code: https://github.com/xiaowu0162/longmemeval
AI-generated summary: LongMemEval assesses long-term memory in chat assistants through five core abilities, identifying gaps and proposing memory design optimizations that enhance recall and question answering.

Abstract
Recent large language model (LLM)-driven chat assistant systems have
integrated memory components to track user-assistant chat histories, enabling
more accurate and personalized responses. However, their long-term memory
capabilities in sustained interactions remain underexplored. This paper
introduces LongMemEval, a comprehensive benchmark designed to evaluate five
core long-term memory abilities of chat assistants: information extraction,
multi-session reasoning, temporal reasoning, knowledge updates, and abstention.
With 500 meticulously curated questions embedded within freely scalable
user-assistant chat histories, LongMemEval presents a significant challenge to
existing long-term memory systems: commercial chat assistants and
long-context LLMs show a 30% accuracy drop when recalling information across
sustained interactions. We then present a unified framework that breaks down
the long-term memory design into four design choices across the indexing,
retrieval, and reading stages. Built upon key experimental insights, we propose
several memory designs including session decomposition for optimizing value
granularity, fact-augmented key expansion for enhancing the index structure,
and time-aware query expansion for refining the search scope. Experimental
results show that these optimizations greatly improve both memory recall and
downstream question answering on LongMemEval. Overall, our study provides
valuable resources and guidance for advancing the long-term memory capabilities
of LLM-based chat assistants, paving the way toward more personalized and
reliable conversational AI.