[Disclaimer: I don't claim to be an expert; I just want to have an insightful discussion with domain experts.]
Formidable work! I learned a lot reading this article, and a question came up as I read.
In the introduction you say: "However, spreadsheets pose unique challenges for LLMs due to their expansive grids that usually exceed the token limitations of popular LLMs, as well as their inherent two-dimensional layouts and structures, which are poorly suited to linear and sequential input."
That sentence sparked an idea: yes, LLMs struggle to comprehend the 2D structure of tabular data, but would it be possible to chunk the data into "sub-arrays", the same way Dosovitskiy et al. did in their ViT paper (arXiv:2010.11929)? I remember that they chunked their input matrices into smaller matrices to reduce the cost of self-attention.
So I was wondering whether this idea can be carried over from matrices as images to matrices as spreadsheets. Is it relevant to adapt this technique to enhance tabular comprehension for LLMs?
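To make the question concrete, here is a minimal sketch of the chunking step only, assuming the sheet is already loaded as a rectangular list of rows. The function name `split_into_patches` and the sample sheet are hypothetical, and this is neither the method from the SpreadsheetLLM paper nor the ViT code; ViT additionally passes each patch through a learned embedding, which is omitted here.

```python
from typing import List, Tuple

Grid = List[List[str]]


def split_into_patches(
    grid: Grid, patch_rows: int, patch_cols: int
) -> List[Tuple[int, int, Grid]]:
    """Split a rectangular grid into non-overlapping patches.

    Returns (top_row, left_col, patch) triples so that each patch keeps
    its position in the sheet. Patches on the bottom or right edge may be
    smaller than the requested size.
    """
    if patch_rows <= 0 or patch_cols <= 0:
        raise ValueError("patch dimensions must be positive")
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    patches = []
    for top in range(0, n_rows, patch_rows):
        for left in range(0, n_cols, patch_cols):
            # Slice the rows first, then the columns inside each row.
            patch = [row[left:left + patch_cols]
                     for row in grid[top:top + patch_rows]]
            patches.append((top, left, patch))
    return patches


# Hypothetical sample sheet, for illustration only.
sheet = [
    ["Region", "Q1", "Q2", "Q3"],
    ["North", "10", "12", "9"],
    ["South", "7", "8", "11"],
]
patches = split_into_patches(sheet, 2, 2)
for top, left, patch in patches:
    print(top, left, patch)
```

Each patch keeps its top-left offset so a cell can be mapped back to the sheet. One obvious catch with naive chunking is that patches away from the header row lose their column labels.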
Just noticed that you have already tackled this issue in the Related Work section with this paper: arXiv:2402.12424
New readings!
I had expected exploration of modified positional encoding schemes in two dimensions for this problem. Was that considered at all?
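For readers unfamiliar with the idea, one common way to encode two-dimensional positions is to give half of the embedding to the row index and half to the column index, each using the usual sinusoidal formula. The sketch below assumes that scheme purely for illustration; the function names are hypothetical and nothing here is taken from the SpreadsheetLLM paper.

```python
import math
from typing import List


def sinusoidal_1d(position: int, dim: int) -> List[float]:
    """Standard sinusoidal encoding of one integer position into `dim` values."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    encoding = []
    for i in range(0, dim, 2):
        # i already plays the role of "2i" in the usual formula.
        angle = position / (10000 ** (i / dim))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


def sinusoidal_2d(row: int, col: int, dim: int) -> List[float]:
    """Concatenate a row encoding and a column encoding, dim/2 values each."""
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    return sinusoidal_1d(row, half) + sinusoidal_1d(col, half)


# Cells in the same row share the first half of the vector,
# cells in the same column share the second half.
print(sinusoidal_2d(2, 7, 8))
```

With this layout, two cells in the same row agree on the first half of the vector and two cells in the same column agree on the second half, which is the property a linear token position cannot express.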
MS, no code, no weights, again?
They mentioned supplementary material in the paper, but I have no idea where it is.
Yeah, I've also been looking for their supplementary material, but I haven't found it anywhere and it's not linked in the paper. I think it might still be a WIP while they clean the code and build a quick showcase page. Let's hope we'll have it in the coming days/weeks.
SpreadsheetLLM is a framework, not a model. I had hoped they would share an implementation of their SHEETCOMPRESSOR. It does sound impressive, with plenty of secondary uses!
What I read is that SHEETCOMPRESSOR is based on several models that they fine-tuned on the spreadsheet table detection task. The datasets for the fine-tuning (WebSheet10K and WebSheet400, as I've mentioned in my post below) are from their previous Microsoft Asia paper named "TableSense: Spreadsheet Table Detection with Convolutional Neural Networks".
ClosedXML (or OpenPyXL??, also mentioned in the paper) libraries were used to extract data from these datasets. Here is the (mostly empty) repo for TableSense: https://github.com/microsoft/TableSense. For some reason it contains links to and descriptions of different datasets (VEnron2, VFUSE, VEUSES), so I'm not sure it can be of any use.
This link may help you: https://aclanthology.org/2024.emnlp-main.1154/
This is an automated message from the Librarian Bot (https://huggingface.co/librarian-bots). I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs (2024): https://huggingface.co/papers/2406.02376
- CHESS: Contextual Harnessing for Efficient SQL Synthesis (2024): https://huggingface.co/papers/2405.16755
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models (2024): https://huggingface.co/papers/2405.19670
- SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation (2024): https://huggingface.co/papers/2406.14991
- QuickLLaMA: Query-aware Inference Acceleration for Large Language Models (2024): https://huggingface.co/papers/2406.07528
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Yeah, where is their model? Do they even publish models?
This link may help you: https://aclanthology.org/2024.emnlp-main.1154/
Kudos Yuzhang and team. I've featured this paper in my AI research newsletter: https://www.aitidbits.ai/p/july-18th-2024
Looking forward to more novel papers and methods.
Without supplementary materials mentioned in the paper, which are nowhere to be found, it would be hard for anyone to believe all the claims in this paper. The paper mentions that it used the same dataset as the previous TableSense paper (WebSheet10K and WebSheet400), but these datasets also cannot be found anywhere. It seems like a black hole of research.
This link may help you: https://aclanthology.org/2024.emnlp-main.1154/
https://github.com/microsoft/TableSense
Part of TableSense data that can be made public.
Has anyone found any SpreadsheetLLM implementation/code yet? Or would anyone be interested in trying to figure it out ourselves, or would that be impossible?
Yeah, I'm also waiting for an implementation.
This link may help you: https://aclanthology.org/2024.emnlp-main.1154/
I'm wondering what approaches people have taken to understand sheets deeply? I know converting each cell to JSON may certainly assist.
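As a point of reference for the "each cell to JSON" idea, a minimal sketch might look like the following. It assumes the sheet is already in memory as a list of rows; the helper names `column_letter` and `cells_to_json` and the sample data are hypothetical, and this is a naive baseline, not the encoding used in the SpreadsheetLLM paper.

```python
import json
from typing import Any, List


def column_letter(index: int) -> str:
    """Convert a 0-based column index to a spreadsheet letter (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cells_to_json(grid: List[List[Any]]) -> str:
    """Serialize non-empty cells as a JSON list of {address, value} objects."""
    cells = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is None or value == "":
                continue  # skip empty cells to save tokens
            cells.append({"address": f"{column_letter(c)}{r + 1}", "value": value})
    return json.dumps(cells)


# Hypothetical sample sheet, for illustration only.
sample = [["Name", "Age"], ["Ada", ""], [None, 36]]
print(cells_to_json(sample))
```

Keeping the cell address next to each value preserves the 2D position in a flat string, at the cost of many extra tokens on large sheets.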
I have recently started working on an agentic AI system. You might find this helpful, as it was also used by the LangChain team: https://blog.langchain.com/summarizing-and-querying-data-from-excel-spreadsheets-using-eparse-and-a-large-language-model/ and here is the open-source repo: https://github.com/ChrisPappalardo/eparse. It is not really that strong, but I played around with it and it works, though not with 100% accuracy every time.
But yeah, same as you, I'm looking forward to the SpreadsheetLLM implementation. I hope it helps.
3 months later, no code. Guess we'll just have to take their word for how awesome it is.
This link may help you: https://aclanthology.org/2024.emnlp-main.1154/
Hey Dr. Dong, when I downloaded the file, I noticed this:
# From cot_process_phase1.py, line 36-37
completion = client_phase1.chat.completions.create(
    model="va_nfs_fmt0_4k-ft-gpt4-v4",  # Custom fine-tuned model!
Where can I get the API? I was wondering if you could help us get access to the model va_nfs_fmt0_4k-ft-gpt4-v4.
Thank you.
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Authors: Yuzhang Tian, Jianbo Zhao, Haoyu Dong, Junyu Xiong, Shiyu Xia, Mengyu Zhou, Yun Lin, José Cambronero, Yeye He, Shi Han, Dongmei Zhang
Published: 2024-07-12 (paper ID 2407.09025)
Abstract
SpreadsheetLLM introduces SheetCompressor and Chain of Spreadsheet to enhance LLMs' performance on spreadsheet tasks through efficient encoding and understanding.
Spreadsheets, with their extensive two-dimensional grids, various layouts, and diverse formatting options, present notable challenges for large language models (LLMs). In response, we introduce SpreadsheetLLM, pioneering an efficient encoding method designed to unleash and optimize LLMs' powerful understanding and reasoning capability on spreadsheets. Initially, we propose a vanilla serialization approach that incorporates cell addresses, values, and formats. However, this approach was limited by LLMs' token constraints, making it impractical for most applications. To tackle this challenge, we develop SheetCompressor, an innovative encoding framework that compresses spreadsheets effectively for LLMs. It comprises three modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation. It significantly improves performance in the spreadsheet table detection task, outperforming the vanilla approach by 25.6% in GPT4's in-context learning setting. Moreover, a fine-tuned LLM with SheetCompressor has an average compression ratio of 25 times, yet achieves a state-of-the-art 78.9% F1 score, surpassing the best existing models by 12.3%. Finally, we propose Chain of Spreadsheet for downstream tasks of spreadsheet understanding and validate it in a new and demanding spreadsheet QA task. We methodically leverage the inherent layout and structure of spreadsheets, demonstrating that SpreadsheetLLM is highly effective across a variety of spreadsheet tasks.
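To make two of the ideas in the abstract concrete, here is a minimal Python sketch of a vanilla address-and-value serialization and of an inverted index that groups cell addresses by value. It is only an illustration under stated assumptions, not the authors' implementation: the function names `col_letter`, `vanilla_serialize`, and `inverse_index`, the `|` and `,` separators, and the toy grid are all hypothetical, and cell formats are left out for brevity.

```python
from collections import defaultdict


def col_letter(idx):
    """Convert a 0-based column index to a column label (0 -> A, 26 -> AA)."""
    label = ""
    idx += 1
    while idx > 0:
        idx, rem = divmod(idx - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def vanilla_serialize(grid):
    """Emit one 'address,value' pair per cell, row by row (hypothetical format)."""
    parts = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            parts.append(f"{col_letter(c)}{r + 1},{value}")
    return "|".join(parts)


def inverse_index(grid):
    """Map each distinct non-empty value to the list of addresses holding it."""
    index = defaultdict(list)
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value != "":
                index[value].append(f"{col_letter(c)}{r + 1}")
    return dict(index)


# Toy grid (made up for illustration): one empty column and a repeated value.
grid = [
    ["Year", "Sales", ""],
    ["2023", "100", ""],
    ["2024", "100", ""],
]
print(vanilla_serialize(grid))
print(inverse_index(grid))
```

In this toy case the inverted form skips the three empty cells and stores the repeated value once, which hints at why such an encoding can need fewer tokens than listing every cell.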
Community
[Disclaimer: I don't claim to be an expert, I just want to have an insightful discussion with domain experts]
Formidable work! I learned a lot reading this article! As I was reading it, a question came to mind.
In the introduction you have said that "However, spreadsheets pose unique challenges for LLMs due to their expansive grids that usually exceed the token limitations of popular LLMs, as well as their inherent two-dimensional layouts and structures, which are poorly suited to linear and sequential input."
This sentence sparked the following idea: yes, LLMs struggle to comprehend the 2D architecture of tabular data, but is it possible to chunk our data into "sub-arrays" the same way that Dosovitskiy et al. did in their ViT paper (arXiv:2010.11929)? I remember that they chunked their input matrices into smaller matrices to reduce the cost of self-attention.
So I was wondering whether it's possible to carry this idea over from matrices as images to matrices as spreadsheets. Is it relevant to adapt this technique to enhance tabular comprehension for LLMs?
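The chunking idea in the question above can be sketched in a few lines of Python. This is a rough illustration only, assuming a spreadsheet held as a list of rows and fixed-size, non-overlapping patches; `split_into_patches` and the patch sizes are hypothetical names and values, and nothing here comes from the paper.

```python
def split_into_patches(grid, patch_rows, patch_cols):
    """Split a 2D list into non-overlapping patches keyed by their top-left (row, col)."""
    patches = {}
    n_rows = len(grid)
    n_cols = max((len(row) for row in grid), default=0)
    for r0 in range(0, n_rows, patch_rows):
        for c0 in range(0, n_cols, patch_cols):
            # Slicing past the edge simply yields a smaller patch.
            patch = [row[c0:c0 + patch_cols] for row in grid[r0:r0 + patch_rows]]
            patches[(r0, c0)] = patch
    return patches


# Toy 3x4 grid split into 2x2 patches.
grid = [
    ["a1", "b1", "c1", "d1"],
    ["a2", "b2", "c2", "d2"],
    ["a3", "b3", "c3", "d3"],
]
for origin, patch in split_into_patches(grid, 2, 2).items():
    print(origin, patch)
```

One caveat with fixed patches is that a table can straddle a patch boundary, so a header may end up in a different patch from the values it labels.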
I had expected exploration of modified positional encoding schemes in two dimensions for this problem. Was that considered at all?
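For readers unfamiliar with the idea raised in this comment, one common way to build a two-dimensional positional encoding is to encode the row and the column separately with the standard sinusoidal scheme and concatenate the two halves. The sketch below assumes exactly that; the function names and the dimension of 8 are hypothetical, and the paper does not describe using this scheme.

```python
import math


def sinusoidal(pos, dim):
    """Standard 1D sinusoidal encoding of an integer position into `dim` values (dim even)."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(pos * freq))
        enc.append(math.cos(pos * freq))
    return enc


def encode_cell(row, col, dim=8):
    """Concatenate a row encoding and a column encoding (dim must be divisible by 4)."""
    half = dim // 2
    return sinusoidal(row, half) + sinusoidal(col, half)


# Cells (1, 2) and (2, 1) get different vectors, unlike a flattened 1D index
# that only records how far along the sequence a cell is.
print(encode_cell(1, 2))
print(encode_cell(2, 1))
```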
They mention supplementary material in the paper, but I have no idea where it is.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs (2024)
- CHESS: Contextual Harnessing for Efficient SQL Synthesis (2024)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models (2024)
- SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation (2024)
- QuickLLaMA: Query-aware Inference Acceleration for Large Language Models (2024)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space.
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Kudos Yuzhang and team. I've featured this paper in my AI research newsletter https://www.aitidbits.ai/p/july-18th-2024
Looking forward to more novel papers and methods.
Without supplementary materials mentioned in the paper, which are nowhere to be found, it would be hard for anyone to believe all the claims in this paper. The paper mentions that it used the same dataset as the previous TableSense paper (WebSheet10K and WebSheet400), but these datasets also cannot be found anywhere. It seems like a black hole of research.
Has anyone found any SpreadsheetLLM implementation or code yet? Or would anyone be interested in trying to figure it out ourselves, or would that be impossible?
Yeah, I'm also waiting for an implementation.
I'm wondering what approaches people have taken to understand sheets deeply. I know that converting each cell to JSON may help.
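As a rough illustration of the cell-to-JSON idea mentioned above, the sketch below turns every non-empty data cell into a small JSON record. It assumes the first row holds column headers, which real sheets often violate; `cells_to_json` and the record fields are hypothetical choices, not something taken from the paper or from any library.

```python
import json


def cells_to_json(grid):
    """Turn each non-empty data cell into a record with its position, header, and value."""
    if not grid:
        return "[]"
    headers = grid[0]  # Assumption: the first row contains the column headers.
    records = []
    for r, row in enumerate(grid[1:], start=2):  # 1-based sheet rows; data starts at row 2.
        for c, value in enumerate(row):
            if value in ("", None):
                continue  # Skip empty cells to keep the output small.
            header = headers[c] if c < len(headers) else None
            records.append({"row": r, "col": c + 1, "header": header, "value": value})
    return json.dumps(records)


# Toy sheet with one empty cell.
sheet = [
    ["Year", "Sales"],
    [2023, 100],
    [2024, ""],
]
print(cells_to_json(sheet))
```

Attaching the header to every record repeats text, but it lets each record be read on its own without the surrounding grid.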
I recently started working on an agentic AI system. You might find this helpful, as it was also used by the LangChain team: "https://blog.langchain.com/summarizing-and-querying-data-from-excel-spreadsheets-using-eparse-and-a-large-language-model/", and here is the open-source repo: "https://github.com/ChrisPappalardo/eparse". It is not really that strong; I played around with it and it works, but not with 100% accuracy every time. Like you, I'm looking forward to the SpreadsheetLLM implementation. I hope it helps.
A new public dataset for spreadsheet-centric financial and accounting workflows to evaluate frontier agents: https://huggingface.co/papers/2512.13168