Eliminating Position Bias of Language Models: A Mechanistic Approach
Abstract
Position bias has proven to be a prevalent issue in modern language models (LMs), where the models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness, and reliability across various applications. Our mechanistic analysis attributes the position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings. Specifically, based on our analysis of retrieval-augmented question answering (QA), we find that causal attention generally causes models to favor distant content, while relative positional encodings like RoPE prefer nearby content. Further, our empirical study on object detection reveals that position bias is also present in vision-language models (VLMs). Based on the above analyses, we propose to ELIMINATE position bias caused by different input segment orders (e.g., options in LM-as-a-judge, retrieved documents in QA) in a TRAINING-FREE ZERO-SHOT manner. Our method changes the causal attention to bidirectional attention between segments and utilizes model attention values to decide the relative order of segments instead of using the order provided in input prompts, thereby enabling Position-INvariant inferencE (PINE) at the segment level. By eliminating position bias, models achieve better performance and reliability in downstream tasks where position bias widely exists, such as LM-as-a-judge and retrieval-augmented QA. Notably, PINE is especially useful when adapting LMs for evaluating reasoning pairs: it consistently provides 8 to 10 percentage point performance gains in most cases, and makes Llama-3-70B-Instruct perform even better than GPT-4-0125-preview on the RewardBench reasoning subset.
Community
Hi readers! In this paper, we propose a training-free approach to eliminate the position bias in LLMs, which is useful for tasks like LM-as-a-judge and RAG-QA. In short, our method converts causal attention to bidirectional attention between segments, and uses attention sorting to re-assign positions.
A promising result is that Llama-3 70B with our method can beat GPT-4 on the RewardBench reasoning subset!
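The two ideas above (bidirectional attention across segments, then re-ordering segments by how strongly the model attends to them) can be sketched roughly as follows. This is our own illustrative sketch in NumPy, not the PINE codebase's API: the function names and the dot-product scoring heuristic are placeholders for the real per-head attention computation inside the model.

```python
import numpy as np

def build_pine_mask(seg_lens):
    """Token-level attention mask: causal WITHIN each segment,
    fully bidirectional ACROSS segments, so no segment is
    privileged by where it appears in the prompt."""
    total = sum(seg_lens)
    starts = np.cumsum([0] + list(seg_lens[:-1]))
    mask = np.zeros((total, total), dtype=bool)
    for i, (si, li) in enumerate(zip(starts, seg_lens)):
        for j, (sj, lj) in enumerate(zip(starts, seg_lens)):
            if i == j:
                # usual causal (lower-triangular) mask inside a segment
                mask[si:si + li, sj:sj + lj] = np.tril(np.ones((li, lj), dtype=bool))
            else:
                # every other segment is fully visible, regardless of order
                mask[si:si + li, sj:sj + lj] = True
    return mask

def attention_sort(query, segment_reprs):
    """Re-assign segment positions by attention strength: ascending
    argsort places the least-attended segment farthest from the query
    and the most-attended one nearest (favorable under RoPE-style
    relative encodings)."""
    scores = np.array([float(query @ r) for r in segment_reprs])
    return np.argsort(scores)
```

In the actual method this masking and re-ordering happens inside the model's attention layers at inference time; the sketch only shows the shape of the computation.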
More details:
Twitter: https://x.com/wzq016/status/1808568703229046792
PDF: https://arxiv.org/pdf/2407.01100
Html: https://arxiv.org/html/2407.01100
Abstract: https://arxiv.org/abs/2407.01100
Github: https://github.com/wzq016/PINE