Everything for high-quality filtering of HPLT3
JQL-AI
community
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
Tokenizer Choice For LLM Training: Negligible or Crucial?
Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
Do Multilingual Large Language Models Mitigate Stereotype Bias?
datasets
12
:39:53.000Z","private":false,"repoType":"dataset","likes":0,"isLikedByUser":false,"isBenchmark":false},{"author":"JQL-AI","downloads":4742,"gated":false,"id":"JQL-AI/hplt2_edu_scores","lastModified":"2025-08-11T19:24:05.000Z","datasetsServerInfo":{"viewer":"viewer-partial","numRows":3361052897,"libraries":["datasets","dask","mlcroissant"],"formats":["json"],"modalities":["tabular","text"]},"private":false,"repoType":"dataset","likes":1,"isLikedByUser":false,"isBenchmark":false},{"author":"JQL-AI","downloads":3499,"gated":false,"id":"JQL-AI/fw2_edu_scores","lastModified":"2025-08-07T14:31:33.000Z","datasetsServerInfo":{"viewer":"viewer-partial","numRows":4920676532,"libraries":["datasets","dask","mlcroissant"],"formats":["json"],"modalities":["tabular","text"]},"private":false,"repoType":"dataset","likes":5,"isLikedByUser":false,"isBenchmark":false},{"author":"JQL-AI","downloads":239,"gated":false,"id":"JQL-AI/curated_edu_scores","lastModified":"2025-07-30T12:45:13.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":475,"libraries":["datasets","dask","mlcroissant"],"formats":["json"],"modalities":["tabular","text"]},"private":false,"repoType":"dataset","likes":0,"isLikedByUser":false,"isBenchmark":false},{"author":"JQL-AI","downloads":298,"gated":false,"id":"JQL-AI/JQL-LLM-Edu-Annotations","lastModified":"2025-05-29T09:05:09.000Z","datasetsServerInfo":{"viewer":"viewer-partial","numRows":11374793,"libraries":["datasets","dask","mlcroissant"],"formats":["json"],"modalities":["tabular","text"]},"private":false,"repoType":"dataset","likes":2,"isLikedByUser":false,"isBenchmark":false},{"author":"JQL-AI","downloads":18,"gated":false,"id":"JQL-AI/JQL-Human-Edu-Annotations","lastModified":"2025-05-29T09:04:35.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":20400,"libraries":["datasets","dask","mlcroissant"],"formats":["json"],"modalities":["text"]},"private":false,"repoType":"dataset","likes":5,"isLikedByUser":false,"isBenchmark":false},{"author":"JQL-AI","downl
oads":295,"gated":false,"id":"JQL-AI/Fineweb_2_500k_removed","lastModified":"2025-01-22T19:16:22.000Z","datasetsServerInfo":{"viewer":"viewer-partial","numRows":11700443,"libraries":["datasets","dask","mlcroissant"],"formats":["json"],"modalities":["tabular","text"]},"private":false,"repoType":"dataset","likes":0,"isLikedByUser":false,"isBenchmark":false}],"models":[{"author":"JQL-AI","authorData":{"_id":"682da695748a2716a0890d9b","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","fullname":"JQL-AI","name":"JQL-AI","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":18,"isUserFollowing":false},"downloads":0,"gated":false,"id":"JQL-AI/fw2_edu_scores","availableInferenceProviders":[],"lastModified":"2025-08-01T09:34:47.000Z","likes":0,"private":false,"repoType":"model","isLikedByUser":false},{"author":"JQL-AI","authorData":{"_id":"682da695748a2716a0890d9b","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","fullname":"JQL-AI","name":"JQL-AI","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":18,"isUserFollowing":false},"downloads":0,"gated":false,"id":"JQL-AI/JQL-Edu-Heads","availableInferenceProviders":[],"lastModified":"2025-06-01T20:45:34.000Z","likes":2,"pipeline_tag":"text-ranking","private":false,"repoType":"model","isLikedByUser":false}],"paperPreviews":[],"spaces":[{"author":"JQL-AI","authorData":{"_id":"682da695748a2716a0890d9b","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","fullname":"JQL-AI","name":"JQL-AI","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":18,"isUserFollowing":false},"colorFrom":"yellow","colorTo":"indigo","createdAt":"2025-05-26T11:09:12.000Z","emoji":"🦊","id":"JQL-AI/JQL","lastModified":"2025-05-31T05:33:23.000Z","likes":6,"pinned":false,"private":false,"s
dk":"static","repoType":"space","runtime":{"stage":"RUNNING","hardware":{"current":null,"requested":null},"storage":null,"replicas":{"requested":1,"current":1}},"title":"JQL: Judging Quality Across Languages","isLikedByUser":false,"ai_short_description":"Filter multilingual data for high-quality language models","ai_category":"Data Visualization","trendingScore":0,"tags":["static","region:us"],"featured":false}],"buckets":[],"numBuckets":0,"numDatasets":12,"numModels":2,"numSpaces":2,"lastOrgActivities":[{"time":"2026-01-08T16:19:58.743Z","user":"mfromm","userAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64243ad773f771f6630871e4/Ds4g_5RyBKBDWAC32BqsH.jpeg","orgAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","type":"collection","collection":{"id":"69087037cc62fbdfc55817bc","slug":"JQL-AI/snowflake-hplt3-69087037cc62fbdfc55817bc","title":"Snowflake-HPLT3","description":"everything for high quality filtering of HPLT3","lastUpdated":"2026-01-08T16:26:33.862Z","numberItems":5,"owner":{"_id":"682da695748a2716a0890d9b","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","fullname":"JQL-AI","name":"JQL-AI","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":18,"isUserFollowing":false},"theme":"purple","shareUrl":"https://hf.co/collections/JQL-AI/snowflake-hplt3","upvotes":1,"isUpvotedByUser":false},"org":"JQL-AI"},{"time":"2026-01-08T16:19:37.919Z","user":"mfromm","userAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64243ad773f771f6630871e4/Ds4g_5RyBKBDWAC32BqsH.jpeg","orgAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","type":"collection","collection":{"id":"69087037cc62fbdfc55817bc","slug":"JQL-AI/snowflake-hplt3-69087037cc62fbdfc55817bc","title":"Snowflake-HPLT3","description":"everything for 
high quality filtering of HPLT3","lastUpdated":"2026-01-08T16:26:33.862Z","numberItems":5,"owner":{"_id":"682da695748a2716a0890d9b","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","fullname":"JQL-AI","name":"JQL-AI","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":18,"isUserFollowing":false},"theme":"purple","shareUrl":"https://hf.co/collections/JQL-AI/snowflake-hplt3","upvotes":1,"isUpvotedByUser":false},"org":"JQL-AI"},{"time":"2026-01-08T16:19:15.235Z","user":"mfromm","userAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64243ad773f771f6630871e4/Ds4g_5RyBKBDWAC32BqsH.jpeg","orgAvatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","type":"collection","collection":{"id":"69087037cc62fbdfc55817bc","slug":"JQL-AI/snowflake-hplt3-69087037cc62fbdfc55817bc","title":"Snowflake-HPLT3","description":"everything for high quality filtering of HPLT3","lastUpdated":"2026-01-08T16:26:33.862Z","numberItems":5,"owner":{"_id":"682da695748a2716a0890d9b","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64bfc4d55ce3d382c05c0f9a/bYJdyIf3jcjJcAi4Bpdei.png","fullname":"JQL-AI","name":"JQL-AI","type":"org","isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":18,"isUserFollowing":false},"theme":"purple","shareUrl":"https://hf.co/collections/JQL-AI/snowflake-hplt3","upvotes":1,"isUpvotedByUser":false},"org":"JQL-AI"}],"acceptLanguages":["*"],"canReadRepos":false,"canReadSpaces":false,"blogPosts":[],"currentRepoPage":0,"filters":{},"paperView":false}">
Organization Card
JQL-AI (pronounced Jackal-AI) is a community of machine learning researchers committed to advancing the development of multilingual foundation models.
Latest Research
- Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
- Tokenizer Choice For LLM Training: Negligible or Crucial?
- Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
- Do Multilingual Large Language Models Mitigate Stereotype Bias?
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
- JQL: Judging Quality Across Languages 🦊 • Space • 6 likes • Filter multilingual data for high-quality language models
- Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models • Paper • 2505.22232 • 18
- JQL-AI/JQL-Edu-Heads • Text Ranking • 2 likes
- JQL-AI/JQL-LLM-Edu-Annotations • 11.4M rows • 298 downloads • 2 likes
Snowflake-HPLT3: everything for high-quality filtering of HPLT3
datasets (12)
- JQL-AI/HPLT3-198-500k • 10 downloads
- JQL-AI/curated_embeddings • 1.33k downloads
- JQL-AI/fw2_embeddings • 3.03k downloads • 2 likes
- JQL-AI/hplt2_embeddings • 7.59k downloads
- JQL-AI/hplt2_edu_scores • 3.36B rows • 4.74k downloads • 1 like
- JQL-AI/fw2_edu_scores • 4.92B rows • 3.5k downloads • 5 likes
- JQL-AI/curated_edu_scores • 475 rows • 239 downloads
- JQL-AI/JQL-LLM-Edu-Annotations • 11.4M rows • 298 downloads • 2 likes
- JQL-AI/JQL-Human-Edu-Annotations • 20.4k rows • 18 downloads • 5 likes
- JQL-AI/Fineweb_2_500k_removed • 11.7M rows • 295 downloads