Paper page - CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark
arxiv:2505.16968

CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

Published on May 22, 2025 · Submitted by Ahmed Heakl on Jun 5, 2025
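Authors: Ahmed Heakl, Sarim Hashmi, Gustavo Bertolo Stahl, Seung Hun Eddie Han, Salman Khan, Abdulrahman Mahmoud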

Abstract

AI-generated summary: CASS is a dataset and model suite for GPU code transpilation at both the source and assembly levels, achieving high accuracy and performance that matches native code.

We introduce CASS, the first large-scale dataset and model suite for cross-architecture GPU code transpilation, targeting both source-level (CUDA ↔ HIP) and assembly-level (Nvidia SASS ↔ AMD RDNA3) translation. The dataset comprises 70k verified code pairs across host and device, addressing a critical gap in low-level GPU code portability. Leveraging this resource, we train the CASS family of domain-specific language models, achieving 95% source translation accuracy and 37.5% assembly translation accuracy, substantially outperforming commercial baselines such as GPT-4o, Claude, and Hipify. Our generated code matches native performance in over 85% of test cases, preserving runtime and memory behavior. To support rigorous evaluation, we introduce CASS-Bench, a curated benchmark spanning 16 GPU domains with ground-truth execution. All data, models, and evaluation tools are released as open source to foster progress in GPU compiler tooling, binary compatibility, and LLM-guided hardware translation. Dataset and benchmark are on HuggingFace (https://huggingface.co/datasets/MBZUAI/cass), with code on GitHub (https://github.com/GustavoStahl/CASS).
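To make "source-level (CUDA ↔ HIP) translation" concrete, here is a minimal sketch of the simplest possible approach: renaming a handful of CUDA runtime identifiers to their HIP counterparts. It is not the CASS models' method (those are trained language models) and it is not the Hipify tool itself; the mapping table, the function name `toy_cuda_to_hip`, and the sample snippet are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny mapping from CUDA runtime names to HIP names.
# Real tools cover far more of the API; this table is for illustration only.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Match whole identifiers only: the lookarounds reject matches that are
# embedded in a longer name such as my_cudaMalloc_wrapper.
_NAMES = sorted(CUDA_TO_HIP, key=len, reverse=True)
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])(?:"
    + "|".join(re.escape(name) for name in _NAMES)
    + r")(?![A-Za-z0-9_])"
)


def toy_cuda_to_hip(source: str) -> str:
    """Rename the known CUDA identifiers in `source` to their HIP equivalents."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)


# Hypothetical host-side snippet; the kernel launch syntax is left untouched
# because HIP also accepts the triple-chevron form.
cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
add<<<blocks, threads>>>(d_x, n);
cudaDeviceSynchronize();
cudaFree(d_x);
"""

hip_snippet = toy_cuda_to_hip(cuda_snippet)
print(hip_snippet)
```

A renaming pass like this only handles API calls that have a one-to-one counterpart. The assembly-level direction (SASS ↔ RDNA3) has no such name-for-name correspondence, which is consistent with the much lower assembly accuracy reported in the abstract.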

Community




Groundbreaking work! I believe CASS will set a new standard for cross-architecture GPU code translation!

Ahmed Heakl (paper author): Thanks @sarahjohnson2


Models citing this paper: 7

Datasets citing this paper: 1

Spaces citing this paper: 0 (cite arxiv.org/abs/2505.16968 in a Space README.md to link it from this page)

Collections including this paper: 3