Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Authors: Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu
Published: 2024-04-08 (paper ID 2404.05892)
AI-generated summary: Eagle and Finch, improvements over RWKV-4, use multi-headed matrix-valued states and dynamic recurrence for enhanced expressivity and efficiency, trained on a multilingual corpus.
We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon
the RWKV (RWKV-4) architecture. Our architectural design advancements include
multi-headed matrix-valued states and a dynamic recurrence mechanism that
improve expressivity while maintaining the inference efficiency characteristics
of RNNs. We introduce a new multilingual corpus with 1.12 trillion tokens and a
fast tokenizer based on greedy matching for enhanced multilinguality. We
trained four Eagle models, ranging from 0.46 to 7.5 billion parameters, and two
Finch models with 1.6 and 3.1 billion parameters and find that they achieve
competitive performance across a wide variety of benchmarks. We release all our
models on HuggingFace under the Apache 2.0 license.
Models at: https://huggingface.co/RWKV
Training code at: https://github.com/RWKV/RWKV-LM
Inference code at: https://github.com/RWKV/ChatRWKV
Time-parallel training code at: https://github.com/RWKV/RWKV-infctx-trainer
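
The abstract mentions a fast tokenizer based on greedy matching but gives no further detail on this page. As a rough illustration of the general idea only, the sketch below performs greedy longest-match tokenization over a character trie. The vocabulary, the names `TrieNode`, `build_trie`, and `greedy_tokenize`, and the error handling are all hypothetical and are not taken from the RWKV code; the actual tokenizer may differ (for example, by operating on bytes).

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of a character trie; token_id is set where a vocabulary entry ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.token_id: Optional[int] = None


def build_trie(vocab: Dict[str, int]) -> TrieNode:
    """Insert every vocabulary string into a trie keyed by character."""
    root = TrieNode()
    for token, token_id in vocab.items():
        node = root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.token_id = token_id
    return root


def greedy_tokenize(text: str, root: TrieNode) -> List[int]:
    """At each position, emit the longest vocabulary entry that matches, then advance."""
    ids: List[int] = []
    pos = 0
    while pos < len(text):
        node = root
        best_id: Optional[int] = None
        best_end = pos
        i = pos
        # Walk the trie as far as the text allows, remembering the last full match.
        while i < len(text) and text[i] in node.children:
            node = node.children[text[i]]
            i += 1
            if node.token_id is not None:
                best_id, best_end = node.token_id, i
        if best_id is None:
            # Assumption for this sketch: unmatched input is an error.
            raise ValueError(f"no vocabulary entry matches at position {pos}")
        ids.append(best_id)
        pos = best_end
    return ids


# Hypothetical toy vocabulary, for illustration only.
TOY_VOCAB = {"a": 0, "b": 1, "c": 2, "ab": 3, "abc": 4, "bc": 5, " ": 6}
TOY_TRIE = build_trie(TOY_VOCAB)

print(greedy_tokenize("abcab c", TOY_TRIE))  # [4, 3, 6, 2]
```

Because each step simply takes the longest match and never backtracks, the cost per token is bounded by the length of the longest vocabulary entry, which is one plausible reason a greedy scheme can be fast.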