Paper page - Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Papers
arxiv:2404.05892

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Published on Apr 8, 2024 · Submitted by AK on Apr 10, 2024
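Authors: Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu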

Abstract

AI-generated summary: Eagle and Finch, improvements over RWKV-4, use multi-headed matrix-valued states and dynamic recurrence for enhanced expressivity and efficiency, trained on a multilingual corpus.

We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture. Our architectural design advancements include multi-headed matrix-valued states and a dynamic recurrence mechanism that improve expressivity while maintaining the inference efficiency characteristics of RNNs. We introduce a new multilingual corpus with 1.12 trillion tokens and a fast tokenizer based on greedy matching for enhanced multilinguality. We trained four Eagle models, ranging from 0.46 to 7.5 billion parameters, and two Finch models with 1.6 and 3.1 billion parameters and find that they achieve competitive performance across a wide variety of benchmarks. We release all our models on HuggingFace under the Apache 2.0 license.

Models at: https://huggingface.co/RWKV
Training code at: https://github.com/RWKV/RWKV-LM
Inference code at: https://github.com/RWKV/ChatRWKV
Time-parallel training code at: https://github.com/RWKV/RWKV-infctx-trainer
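The abstract mentions a fast tokenizer based on greedy matching. As a rough illustration of that general idea only, the sketch below scans the input left to right and always takes the longest vocabulary entry that matches at the current position. The function name `greedy_tokenize`, the toy vocabulary `TOY_VOCAB`, the character-level matching, and the single-character fallback are all assumptions made for this example; they are not taken from the paper or from the released RWKV code.

```python
def greedy_tokenize(text, vocab):
    """Split text by repeatedly taking the longest vocabulary match.

    Hypothetical helper for illustration; not the RWKV tokenizer itself.
    """
    max_len = max(len(tok) for tok in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until one matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # Assumed fallback: emit the unmatched character on its own.
            tokens.append(text[i])
            i += 1
    return tokens


# Toy vocabulary, made up for this example.
TOY_VOCAB = {"R", "W", "K", "V", "RW", "RWKV", "-", "5", "6"}

print(greedy_tokenize("RWKV-5", TOY_VOCAB))  # ['RWKV', '-', '5']
```

Because each step commits to the longest match without backtracking, the result depends only on the vocabulary and the scan order. A production tokenizer would more likely use a trie than repeated set lookups, but the matching rule would be the same.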

Community

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 5


Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2404.05892 in a dataset README.md to link it from this page.

Spaces citing this paper 2

Collections including this paper 11