\n","updatedAt":"2025-03-01T01:35:13.318Z","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.6897639632225037},"editors":["librarian-bot"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2502.16645","authors":[{"_id":"67c12e60d8247a49b805694f","user":{"_id":"6441270ead24e9b2cfbc45e0","avatarUrl":"/avatars/92eab1ae50efaaee070674ae20244fc0.svg","isPro":false,"fullname":"Wang Chenlong","user":"Wildxxxxx75","type":"user"},"name":"Chenlong Wang","status":"admin_assigned","statusLastChangedAt":"2025-02-28T12:29:50.564Z","hidden":false},{"_id":"67c12e60d8247a49b8056950","user":{"_id":"64fb128552e82dd432682b06","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64fb128552e82dd432682b06/GYcOiwa4R3RrgcM2tSuV_.png","isPro":false,"fullname":"Zhaoyang Chu","user":"chuzy","type":"user"},"name":"Zhaoyang Chu","status":"admin_assigned","statusLastChangedAt":"2025-02-28T12:29:56.482Z","hidden":false},{"_id":"67c12e60d8247a49b8056951","user":{"_id":"669096da35cddb688a352ca8","avatarUrl":"/avatars/5dd096cb7360682016d0fca909ab9744.svg","isPro":false,"fullname":"zxiang","user":"zx10086","type":"user"},"name":"Zhengxiang Cheng","status":"claimed_verified","statusLastChangedAt":"2025-02-28T09:28:33.569Z","hidden":false},{"_id":"67c12e60d8247a49b8056952","user":{"_id":"6743e9d4303e7ce5b9d13e9b","avatarUrl":"/avatars/cdaf150380e9c8916547185b968a2670.svg","isPro":false,"fullname":"xy","user":"yxy0807","type":"user"},"name":"Xuyi Yang","status":"claimed_verified","statusLastChangedAt":"2025-02-28T09:28:31.564Z","hidden":false},{"_id":"67c12e60d8247a49b8056953","name":"Kaiyue Qiu","hidden":false},{"_id":"67c12e60d8247a49b8056954","name":"Yao Wan","hidden":false},{"_id":"67c12e60d8247a49b8056955","name":"Zhou Zhao","hidden":false},{"_id":"67c12e60d8247a49b8056956","name":"Xuanhua Shi","hidden":false},{"_id":"67c12e60d8247a49b8056957","user":{"_id":"65e2be1e630e2db23829ee8d","avatarUrl":"/avatars/294f9ba909037f03669dc0bb80cabfe3.svg","isPro":false,"fullname":"Dongping Chen","user":"fjchendp","type":"user"},"name":"Dongping Chen","status":"admin_assigned","statusLastChangedAt":"2025-02-28T12:30:19.705Z","hidden":false}],"publishedAt":"2025-02-23T16:46:18.000Z","submittedOnDailyAt":"2025-02-28T01:34:14.619Z","title":"CODESYNC: Synchronizing Large Language Models with Dynamic Code\n Evolution at Scale","submittedOnDailyBy":{"_id":"643be8879f5d314db2d9ed23","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/643be8879f5d314db2d9ed23/VrW2UtJ7ppOnGIYjTWd7b.png","isPro":false,"fullname":"Chen Dongping","user":"shuaishuaicdp","type":"user"},"summary":"Large Language Models (LLMs) have exhibited exceptional performance in\nsoftware engineering yet face challenges in adapting to continually evolving\ncode knowledge, particularly regarding the frequent updates of third-party\nlibrary APIs. This limitation, stemming from static pre-training datasets,\noften results in non-executable code or implementations with suboptimal safety\nand efficiency. 
To this end, this paper introduces CODESYNC, a data engine for\nidentifying outdated code patterns and collecting real-time code knowledge\nupdates from Python third-party libraries. Building upon CODESYNC, we develop\nCODESYNCBENCH, a comprehensive benchmark for assessing LLMs' ability to stay\nsynchronized with code evolution, which covers real-world updates for 220 APIs\nfrom six Python libraries. Our benchmark offers 3,300 test cases across three\nevaluation tasks and an update-aware instruction tuning dataset consisting of\n2,200 training samples. Extensive experiments on 14 state-of-the-art LLMs\nreveal that they struggle with dynamic code evolution, even with the support of\nadvanced knowledge updating methods (e.g., DPO, ORPO, and SimPO). We believe\nthat our benchmark can offer a strong foundation for the development of more\neffective methods for real-time code knowledge updating in the future. The\nexperimental code and dataset are publicly available at:\nhttps://github.com/Lucky-voyage/Code-Sync.","upvotes":21,"discussionId":"67c12e61d8247a49b805698f","githubRepo":"https://github.com/lucky-voyage/code-sync","githubRepoAddedBy":"auto","ai_summary":"CODESYNCBENCH is introduced to evaluate LLMs' adaptation to evolving third-party library APIs, revealing their challenges with dynamic code changes.","ai_keywords":["LARGE LANGUAGE MODELS (LLMs)","CODESYNC","CODESYNCBENCH","third-party library APIs","real-time code knowledge updates","non-executable code","suboptimal safety and efficiency","DPO","ORPO","SimPO"],"githubStars":25},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"643be8879f5d314db2d9ed23","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/643be8879f5d314db2d9ed23/VrW2UtJ7ppOnGIYjTWd7b.png","isPro":false,"fullname":"Chen Dongping","user":"shuaishuaicdp","type":"user"},{"_id":"64fb128552e82dd432682b06","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64fb128552e82dd432682b06/GYcOiwa4R3RrgcM2tSuV_.png","isPro":false,"fullname":"Zhaoyang Chu","user":"chuzy","type":"user"},{"_id":"669096da35cddb688a352ca8","avatarUrl":"/avatars/5dd096cb7360682016d0fca909ab9744.svg","isPro":false,"fullname":"zxiang","user":"zx10086","type":"user"},{"_id":"6743e9d4303e7ce5b9d13e9b","avatarUrl":"/avatars/cdaf150380e9c8916547185b968a2670.svg","isPro":false,"fullname":"xy","user":"yxy0807","type":"user"},{"_id":"67c13890743428a2595a8b60","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/Ine7JrAr_ytNzHCtPFvFe.png","isPro":false,"fullname":"yiwen yang","user":"yywmia","type":"user"},{"_id":"6270324ebecab9e2dcf245de","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6270324ebecab9e2dcf245de/cMbtWSasyNlYc9hvsEEzt.jpeg","isPro":false,"fullname":"Kye Gomez","user":"kye","type":"user"},{"_id":"67b2b5ed9becd2d04456712a","avatarUrl":"/avatars/2fa16576bad5f26fc37221d8b038fa66.svg","isPro":false,"fullname":"Hu ZhiHan","user":"dyzxHZH","type":"user"},{"_id":"6039478ab3ecf716b1a5fd4d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6039478ab3ecf716b1a5fd4d/_Thy4E7taiSYBLKxEKJbT.jpeg","isPro":true,"fullname":"taesiri","user":"taesiri","type":"user"},{"_id":"6697e7e55ef2828a1ff371c3","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6697e7e55ef2828a1ff371c3/U7-_BtDtSsrf02LIdUTN8.jpeg","isPro":false,"fullname":"Zetong 
Zhou","user":"Frywind","type":"user"},{"_id":"659977d7a7f2d2491750584d","avatarUrl":"/avatars/92cef323e6545b32a7038ae361bd6428.svg","isPro":false,"fullname":"Amarulloh M Khoeri","user":"maarut","type":"user"},{"_id":"648eb1eb59c4e5c87dc116e0","avatarUrl":"/avatars/c636cea39c2c0937f01398c94ead5dad.svg","isPro":false,"fullname":"fdsqefsgergd","user":"T-representer","type":"user"},{"_id":"66ee4ec36babd2a70556b8e4","avatarUrl":"/avatars/d7b4c3ce1367e5b4ff8eab5647abbe0b.svg","isPro":false,"fullname":"YanruWu","user":"YanruWu","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":0}">
arxiv:2502.16645

CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

Published on Feb 23, 2025 · Submitted by Chen Dongping on Feb 28, 2025
Authors: Chenlong Wang, Zhaoyang Chu, Zhengxiang Cheng, Xuyi Yang, Kaiyue Qiu, Yao Wan, Zhou Zhao, Xuanhua Shi, Dongping Chen

Abstract

AI-generated summary: CODESYNCBENCH is introduced to evaluate LLMs' adaptation to evolving third-party library APIs, revealing their challenges with dynamic code changes.

Large Language Models (LLMs) have exhibited exceptional performance in software engineering yet face challenges in adapting to continually evolving code knowledge, particularly regarding the frequent updates of third-party library APIs. This limitation, stemming from static pre-training datasets, often results in non-executable code or implementations with suboptimal safety and efficiency. To this end, this paper introduces CODESYNC, a data engine for identifying outdated code patterns and collecting real-time code knowledge updates from Python third-party libraries. Building upon CODESYNC, we develop CODESYNCBENCH, a comprehensive benchmark for assessing LLMs' ability to stay synchronized with code evolution, which covers real-world updates for 220 APIs from six Python libraries. Our benchmark offers 3,300 test cases across three evaluation tasks and an update-aware instruction tuning dataset consisting of 2,200 training samples. Extensive experiments on 14 state-of-the-art LLMs reveal that they struggle with dynamic code evolution, even with the support of advanced knowledge updating methods (e.g., DPO, ORPO, and SimPO). We believe that our benchmark can offer a strong foundation for the development of more effective methods for real-time code knowledge updating in the future. The experimental code and dataset are publicly available at: https://github.com/Lucky-voyage/Code-Sync.
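As a concrete illustration of the problem the abstract describes, the minimal sketch below contrasts an outdated third-party API call with its current replacement. The pandas example (`DataFrame.append` vs. `pd.concat`) is our own illustration of the kind of update CODESYNC collects; the abstract does not list which 220 APIs the benchmark covers, so this specific API should not be assumed to be among them.

```python
# Illustrative only: the kind of API evolution CODESYNC targets.
# pandas.DataFrame.append was deprecated in pandas 1.4 and removed in 2.0.
import pandas as pd

df = pd.DataFrame({"api": ["old"], "status": ["deprecated"]})
row = pd.DataFrame({"api": ["new"], "status": ["current"]})

# Outdated pattern an LLM trained on a static corpus may still emit
# (raises AttributeError on pandas >= 2.0):
# df = df.append(row, ignore_index=True)

# Updated pattern after the library's API change:
df = pd.concat([df, row], ignore_index=True)
print(df)
```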

Community

Paper submitter

This paper introduces CODESYNC, a data engine for identifying outdated code patterns and collecting real-time API knowledge updates from Python third-party libraries.

The experimental code and dataset are publicly available at: https://github.com/Lucky-voyage/Code-Sync.


Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2502.16645 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2502.16645 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2502.16645 in a Space README.md to link it from this page.

Collections including this paper 4