Paper page - Retrieval-Augmented Decision Transformer: External Memory for In-context RL

Code: https://github.com/ml-jku/RA-DT
Datasets: https://huggingface.co/ml-jku
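The released datasets live under the ml-jku organization on the Hugging Face Hub. Below is a minimal sketch of loading one of them with the `datasets` library; the repository id is a placeholder, not a confirmed dataset name.

```python
# Sketch: load one of the released RA-DT datasets from the Hub.
# The repository id below is a placeholder -- browse
# https://huggingface.co/ml-jku for the actual dataset names.
from datasets import load_dataset

ds = load_dataset("ml-jku/<dataset-name>")  # replace with a real repository id
print(ds)
```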

\n","updatedAt":"2024-10-10T07:13:42.416Z","author":{"_id":"648826f845a9218318e0272c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/648826f845a9218318e0272c/zWrSQblli4PkmiIzWH1RS.jpeg","fullname":"Fabian Paischer","name":"paischer101","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":5,"isUserFollowing":false}},"numEdits":1,"editors":["paischer101"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/648826f845a9218318e0272c/zWrSQblli4PkmiIzWH1RS.jpeg"],"reactions":[],"isReport":false}},{"id":"670880870e79a8b46f7ff5dd","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false},"createdAt":"2024-10-11T01:33:59.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper. \n\nThe following papers were recommended by the Semantic Scholar API \n\n* [ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI](https://huggingface.co/papers/2410.02751) (2024)\n* [Predictive Coding for Decision Transformer](https://huggingface.co/papers/2410.03408) (2024)\n* [In-Context Imitation Learning via Next-Token Prediction](https://huggingface.co/papers/2408.15980) (2024)\n* [Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining](https://huggingface.co/papers/2410.00564) (2024)\n* [RAG-Modulo: Solving Sequential Tasks using Experience, Critics, and Language Models](https://huggingface.co/papers/2409.12294) (2024)\n\n\n Please give a thumbs up to this comment if you found it helpful!\n\n If you want recommendations for any Paper on Hugging Face checkout [this](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers) Space\n\n You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`","html":"

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

\n

The following papers were recommended by the Semantic Scholar API

\n\n

Please give a thumbs up to this comment if you found it helpful!

\n

If you want recommendations for any Paper on Hugging Face checkout this Space

\n

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: \n\n@librarian-bot\n\t recommend

\n","updatedAt":"2024-10-11T01:33:59.158Z","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false}},"numEdits":0,"editors":["librarian-bot"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2410.07071","authors":[{"_id":"67077d8146c9e0a80114a1ac","user":{"_id":"64c3849269b1a6796052eac7","avatarUrl":"/avatars/9f0c832d5b51b659c7bb83074f02a648.svg","isPro":false,"fullname":"Thomas Schmied","user":"thomasschmied","type":"user"},"name":"Thomas Schmied","status":"admin_assigned","statusLastChangedAt":"2024-10-10T09:29:24.323Z","hidden":false},{"_id":"67077d8146c9e0a80114a1ad","user":{"_id":"648826f845a9218318e0272c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/648826f845a9218318e0272c/zWrSQblli4PkmiIzWH1RS.jpeg","isPro":false,"fullname":"Fabian Paischer","user":"paischer101","type":"user"},"name":"Fabian Paischer","status":"admin_assigned","statusLastChangedAt":"2024-10-10T09:29:18.411Z","hidden":false},{"_id":"67077d8146c9e0a80114a1ae","user":{"_id":"63dfcf6742591dda0b951a5b","avatarUrl":"/avatars/3cceda20169516bd52b0d7f9090ab41e.svg","isPro":false,"fullname":"vihang patil","user":"vihangp","type":"user"},"name":"Vihang Patil","status":"admin_assigned","statusLastChangedAt":"2024-10-10T09:29:12.588Z","hidden":false},{"_id":"67077d8146c9e0a80114a1af","name":"Markus Hofmarcher","hidden":false},{"_id":"67077d8146c9e0a80114a1b0","user":{"_id":"64b9310403124195cd9778ec","avatarUrl":"/avatars/57c594d3d0f97d3010b15b6a0806451c.svg","isPro":false,"fullname":"Razvan Pascanu","user":"razp","type":"user"},"name":"Razvan Pascanu","status":"admin_assigned","statusLastChangedAt":"2024-10-10T09:29:01.158Z","hidden":false},{"_id":"67077d8146c9e0a80114a1b1","name":"Sepp Hochreiter","hidden":false}],"publishedAt":"2024-10-09T17:15:30.000Z","submittedOnDailyAt":"2024-10-10T05:39:04.231Z","title":"Retrieval-Augmented Decision Transformer: External Memory for In-context\n RL","submittedOnDailyBy":{"_id":"648826f845a9218318e0272c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/648826f845a9218318e0272c/zWrSQblli4PkmiIzWH1RS.jpeg","isPro":false,"fullname":"Fabian Paischer","user":"paischer101","type":"user"},"summary":"In-context learning (ICL) is the ability of a model to learn a new task by\nobserving a few exemplars in its context. While prevalent in NLP, this\ncapability has recently also been observed in Reinforcement Learning (RL)\nsettings. Prior in-context RL methods, however, require entire episodes in the\nagent's context. Given that complex environments typically lead to long\nepisodes with sparse rewards, these methods are constrained to simple\nenvironments with short episodes. To address these challenges, we introduce\nRetrieval-Augmented Decision Transformer (RA-DT). RA-DT employs an external\nmemory mechanism to store past experiences from which it retrieves only\nsub-trajectories relevant for the current situation. The retrieval component in\nRA-DT does not require training and can be entirely domain-agnostic. 
We\nevaluate the capabilities of RA-DT on grid-world environments, robotics\nsimulations, and procedurally-generated video games. On grid-worlds, RA-DT\noutperforms baselines, while using only a fraction of their context length.\nFurthermore, we illuminate the limitations of current in-context RL methods on\ncomplex environments and discuss future directions. To facilitate future\nresearch, we release datasets for four of the considered environments.","upvotes":7,"discussionId":"67077d8346c9e0a80114a272","githubRepo":"https://github.com/ml-jku/RA-DT","githubRepoAddedBy":"auto","ai_summary":"RA-DT, a retrieval-augmented decision transformer that uses external memory for in-context learning, outperforms baselines in grid-world environments and robotics simulations by leveraging relevant sub-trajectories without full episode training.","ai_keywords":["in-context learning","retrieval-augmented decision transformer","external memory","sub-trajectories","in-context RL","grid-world environments","robotics simulations","procedural-generated video games"],"githubStars":24},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"648826f845a9218318e0272c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/648826f845a9218318e0272c/zWrSQblli4PkmiIzWH1RS.jpeg","isPro":false,"fullname":"Fabian Paischer","user":"paischer101","type":"user"},{"_id":"6041ff7ff84ebe399f1c85ea","avatarUrl":"/avatars/a5e2306f3cd27e0ea1c30eeb81f870fa.svg","isPro":false,"fullname":"Lukas","user":"sirluk","type":"user"},{"_id":"620783f24e28382272337ba4","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/620783f24e28382272337ba4/zkUveQPNiDfYjgGhuFErj.jpeg","isPro":false,"fullname":"GuoLiangTang","user":"Tommy930","type":"user"},{"_id":"64c3849269b1a6796052eac7","avatarUrl":"/avatars/9f0c832d5b51b659c7bb83074f02a648.svg","isPro":false,"fullname":"Thomas Schmied","user":"thomasschmied","type":"user"},{"_id":"6707c949ceaa2578b5645450","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/zPlCru7Y9bQBCTYcAJ6Ah.png","isPro":false,"fullname":"Jeff badman","user":"Lastat22","type":"user"},{"_id":"65decc75beffeb39ba679eba","avatarUrl":"/avatars/735b678bd5863a0c1b1bdd3bbf8858fa.svg","isPro":true,"fullname":"r","user":"oceansweep","type":"user"},{"_id":"63dfcf6742591dda0b951a5b","avatarUrl":"/avatars/3cceda20169516bd52b0d7f9090ab41e.svg","isPro":false,"fullname":"vihang patil","user":"vihangp","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":0}">
arxiv:2410.07071

Retrieval-Augmented Decision Transformer: External Memory for In-context RL

Published on Oct 9, 2024
· Submitted by Fabian Paischer on Oct 10, 2024
Authors: Thomas Schmied, Fabian Paischer, Vihang Patil, Markus Hofmarcher, Razvan Pascanu, Sepp Hochreiter

Abstract

In-context learning (ICL) is the ability of a model to learn a new task by observing a few exemplars in its context. While prevalent in NLP, this capability has recently also been observed in Reinforcement Learning (RL) settings. Prior in-context RL methods, however, require entire episodes in the agent's context. Given that complex environments typically lead to long episodes with sparse rewards, these methods are constrained to simple environments with short episodes. To address these challenges, we introduce Retrieval-Augmented Decision Transformer (RA-DT). RA-DT employs an external memory mechanism to store past experiences from which it retrieves only sub-trajectories relevant for the current situation. The retrieval component in RA-DT does not require training and can be entirely domain-agnostic. We evaluate the capabilities of RA-DT on grid-world environments, robotics simulations, and procedurally-generated video games. On grid-worlds, RA-DT outperforms baselines, while using only a fraction of their context length. Furthermore, we illuminate the limitations of current in-context RL methods on complex environments and discuss future directions. To facilitate future research, we release datasets for four of the considered environments.

AI-generated summary

RA-DT, a retrieval-augmented decision transformer that uses external memory for in-context learning, outperforms baselines in grid-world environments and robotics simulations by leveraging relevant sub-trajectories without full episode training.
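The abstract describes the core mechanism: sub-trajectories from past experience are embedded, stored in an external memory, and the most relevant ones are retrieved and placed in the Decision Transformer's context instead of entire episodes. The sketch below illustrates that idea with a simple nearest-neighbour lookup over stored sub-trajectory embeddings; all names (SubTrajectoryMemory, build_context, the cosine-similarity retrieval) are illustrative assumptions, not the authors' implementation — see the linked repository for the actual code.

```python
# Minimal sketch of retrieval-augmented context construction for a
# Decision Transformer. Names and design choices are illustrative
# assumptions; the official implementation is at
# https://github.com/ml-jku/RA-DT.
import numpy as np


class SubTrajectoryMemory:
    """External memory holding embeddings of stored sub-trajectories."""

    def __init__(self):
        self.keys = []    # embedding vectors of stored sub-trajectories
        self.values = []  # the sub-trajectories themselves

    def add(self, embedding: np.ndarray, subtrajectory):
        # Store a unit-normalized key so retrieval reduces to a dot product.
        self.keys.append(embedding / (np.linalg.norm(embedding) + 1e-8))
        self.values.append(subtrajectory)

    def retrieve(self, query: np.ndarray, top_k: int = 4):
        """Return the top_k stored sub-trajectories most similar to the query."""
        if not self.keys:
            return []
        keys = np.stack(self.keys)
        q = query / (np.linalg.norm(query) + 1e-8)
        scores = keys @ q                        # cosine similarity
        best = np.argsort(scores)[::-1][:top_k]  # indices of the best matches
        return [self.values[i] for i in best]


def build_context(memory: SubTrajectoryMemory, current_subtraj, embed_fn, top_k=4):
    """Prepend retrieved sub-trajectories to the current one.

    The concatenated sequence is what would be fed to the Decision
    Transformer, rather than an entire (possibly very long) episode.
    """
    query = embed_fn(current_subtraj)
    retrieved = memory.retrieve(query, top_k=top_k)
    return retrieved + [current_subtraj]
```

Because the retrieval step is only a similarity search over pre-computed embeddings, it needs no training and can use any domain-agnostic encoder, which is the property the abstract highlights.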

Community

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI (https://huggingface.co/papers/2410.02751) (2024)
* Predictive Coding for Decision Transformer (https://huggingface.co/papers/2410.03408) (2024)
* In-Context Imitation Learning via Next-Token Prediction (https://huggingface.co/papers/2408.15980) (2024)
* Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining (https://huggingface.co/papers/2410.00564) (2024)
* RAG-Modulo: Solving Sequential Tasks using Experience, Critics, and Language Models (https://huggingface.co/papers/2409.12294) (2024)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2410.07071 in a model README.md to link it from this page.

Datasets citing this paper 3

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2410.07071 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.