Inference - a Stalin16 Collection
Inference updated Nov 1, 2025
The Impact of Hyperparameters on Large Language Model Inference
Performance: An Evaluation of vLLM and HuggingFace Pipelines Paper
• 2408.01050
• Published Aug 2, 2024 • 9
Scaling LLM Test-Time Compute Optimally can be More Effective than
Scaling Model Parameters Paper
• 2408.03314
• Published Aug 6, 2024 • 63
Towards a Unified View of Preference Learning for Large Language Models:
A Survey Paper
• 2409.02795
• Published Sep 4, 2024 • 72
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized
Academic Assistance Paper
• 2409.04593
• Published Sep 6, 2024 • 26
From MOOC to MAIC: Reshaping Online Teaching and Learning through
LLM-driven Agents Paper
• 2409.03512
• Published Sep 5, 2024 • 29
Political DEBATE: Efficient Zero-shot and Few-shot Classifiers for
Political Text Paper
• 2409.02078
• Published Sep 3, 2024 • 11
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper
• 2408.16725
• Published Aug 29, 2024 • 53
TextBoost: Towards One-Shot Personalization of Text-to-Image Models via
Fine-tuning Text Encoder Paper
• 2409.08248
• Published Sep 12, 2024 • 16
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question
Answering Paper
• 2409.06595
• Published Sep 10, 2024 • 38
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic
reasoning Paper
• 2409.12183
• Published Sep 18, 2024 • 39
Preference Tuning with Human Feedback on Language, Speech, and Vision
Tasks: A Survey Paper
• 2409.11564
• Published Sep 17, 2024 • 20
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case
Study Paper
• 2409.17580
• Published Sep 26, 2024 • 8
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper
• 2409.19951
• Published Sep 30, 2024 • 54
Illustrious: an Open Advanced Illustration Model Paper
• 2409.19946
• Published Sep 30, 2024 • 15
Ruler: A Model-Agnostic Method to Control Generated Length for Large
Language Models Paper
• 2409.18943
• Published Sep 27, 2024 • 27
SLM: Bridge the thin gap between speech and text foundation models Paper
• 2310.00230
• Published Sep 30, 2023 • 1
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation Paper
• 2410.01731
• Published Oct 2, 2024 • 16
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper
• 2411.04905
• Published Nov 7, 2024 • 127
Agent-as-a-Judge: Evaluate Agents with Agents Paper
• 2410.10934
• Published Oct 14, 2024 • 23
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large
Multimodal Models Paper
• 2410.09732
• Published Oct 13, 2024 • 54
Analyzing The Language of Visual Tokens Paper
• 2411.05001
• Published Nov 7, 2024 • 24
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content Paper
• 2410.10783
• Published Oct 14, 2024 • 26
Intriguing Properties of Large Language and Vision Models Paper
• 2410.04751
• Published Oct 7, 2024 • 16
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple
Tasks in Superposition Paper
• 2410.05603
• Published Oct 8, 2024 • 11
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle
Grandmaster Level Paper
• 2411.03562
• Published Nov 5, 2024 • 69
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM
Quantization Paper
• 2411.02355
• Published Nov 4, 2024 • 52
Survey of Cultural Awareness in Language Models: Text and Beyond Paper
• 2411.00860
• Published Oct 30, 2024 • 24
Multi-expert Prompting Improves Reliability, Safety, and Usefulness of
Large Language Models Paper
• 2411.00492
• Published Nov 1, 2024 • 6
Personalization of Large Language Models: A Survey Paper
• 2411.00027
• Published Oct 29, 2024 • 33
Survey of User Interface Design and Interaction Techniques in Generative
AI Applications Paper
• 2410.22370
• Published Oct 28, 2024 • 12
Navigating the Unknown: A Chat-Based Collaborative Interface for
Personalized Exploratory Tasks Paper
• 2410.24032
• Published Oct 31, 2024 • 10
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM
Inference Paper
• 2410.21465
• Published Oct 28, 2024 • 11
Document Parsing Unveiled: Techniques, Challenges, and Prospects for
Structured Information Extraction Paper
• 2410.21169
• Published Oct 28, 2024 • 30
Are LLMs Better than Reported? Detecting Label Errors and Mitigating
Their Effect on Model Performance Paper
• 2410.18889
• Published Oct 24, 2024 • 15
Counting Ability of Large Language Models and Impact of Tokenization Paper
• 2410.19730
• Published Oct 25, 2024 • 11
Can Knowledge Editing Really Correct Hallucinations? Paper
• 2410.16251
• Published Oct 21, 2024 • 55
Looking Inward: Language Models Can Learn About Themselves by
Introspection Paper
• 2410.13787
• Published Oct 17, 2024 • 8
JudgeBench: A Benchmark for Evaluating LLM-based Judges Paper
• 2410.12784
• Published Oct 16, 2024 • 47
WorldCuisines: A Massive-Scale Benchmark for Multilingual and
Multicultural Visual Question Answering on Global Cuisines Paper
• 2410.12705
• Published Oct 16, 2024 • 32
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model Paper
• 2410.13639
• Published Oct 17, 2024 • 19
Remember, Retrieve and Generate: Understanding Infinite Visual Concepts
as Your Personalized Assistant Paper
• 2410.13360
• Published Oct 17, 2024 • 9
The Curse of Multi-Modalities: Evaluating Hallucinations of Large
Multimodal Models across Language, Visual, and Audio Paper
• 2410.12787
• Published Oct 16, 2024 • 30
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse
Synthetic Data and Global-to-Local Adaptive Perception Paper
• 2410.12628
• Published Oct 16, 2024 • 41
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale
Haystacks? Paper
• 2411.05000
• Published Nov 7, 2024 • 22
Cut Your Losses in Large-Vocabulary Language Models Paper
• 2411.09009
• Published Nov 13, 2024 • 49
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large
Language Models on Mobile Devices Paper
• 2411.10640
• Published Nov 16, 2024 • 46
Generative World Explorer Paper
• 2411.11844
• Published Nov 18, 2024 • 77
VideoAutoArena: An Automated Arena for Evaluating Large Multimodal
Models in Video Analysis through User Simulation Paper
• 2411.13281
• Published Nov 20, 2024 • 21
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context
Training Paper
• 2411.13476
• Published Nov 20, 2024 • 16
Evaluating Tokenizer Performance of Large Language Models Across
Official Indian Languages Paper
• 2411.12240
• Published Nov 19, 2024 • 7
SageAttention2 Technical Report: Accurate 4 Bit Attention for
Plug-and-play Inference Acceleration Paper
• 2411.10958
• Published Nov 17, 2024 • 57
Personalized Multimodal Large Language Models: A Survey Paper
• 2412.02142
• Published Dec 3, 2024 • 13
Towards Universal Soccer Video Understanding Paper
• 2412.01820
• Published Dec 2, 2024 • 11
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS Paper
• 2411.19655
• Published Nov 29, 2024 • 20
Training Large Language Models to Reason in a Continuous Latent Space Paper
• 2412.06769
• Published Dec 9, 2024 • 94
Evaluating Language Models as Synthetic Data Generators Paper
• 2412.03679
• Published Dec 4, 2024 • 47
Paper
• 2412.04315
• Published Dec 5, 2024 • 19
Paper
• 2412.07724
• Published Dec 10, 2024 • 18
Token-Budget-Aware LLM Reasoning Paper
• 2412.18547
• Published Dec 24, 2024 • 46
In Case You Missed It: ARC 'Challenge' Is Not That Challenging Paper
• 2412.17758
• Published Dec 23, 2024 • 17
Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning Paper
• 2412.09078
• Published Dec 12, 2024
B-STaR: Monitoring and Balancing Exploration and Exploitation in
Self-Taught Reasoners Paper
• 2412.17256
• Published Dec 23, 2024 • 47
Outcome-Refining Process Supervision for Code Generation Paper
• 2412.15118
• Published Dec 19, 2024 • 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper
• 2412.17498
• Published Dec 23, 2024 • 22
MLLM-as-a-Judge for Image Safety without Human Labeling Paper
• 2501.00192
• Published Dec 31, 2024 • 32
Enhancing Human-Like Responses in Large Language Models Paper
• 2501.05032
• Published Jan 9, 2025 • 61
LServe: Efficient Long-sequence LLM Serving with Unified Sparse
Attention Paper
• 2502.14866
• Published Feb 20, 2025 • 13
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper
• 2510.26697
• Published Oct 30, 2025 • 119