
# Qwen3.5-9B-Sculpt-Experimental

18% FFN compression with live teacher distillation. Drop-in replacement — no custom kernels, no runtime changes.

Dystrio Sculpt structurally compresses transformer FFN layers, producing dense models that load with standard transformers.

This is the Experimental tier of Qwen3.5-9B.

Use case: Local — maximum compression (1.27x prefill)

## Benchmark Results (lm_eval)

| Model | MMLU | HellaSwag | ARC-C | TruthfulQA | Winogrande | GSM8K |
|---|---|---|---|---|---|---|
| Qwen3.5-9B (baseline) | 78.7 | 78.1 | 55.6 | 53.7 | 73.0 | 87.3 |
| Sculpt Default (kf=0.95) | 76.2 (↓2.5) | 75.8 (↓2.3) | 56.4 (↑0.8) | 52.6 (↓1.1) | 68.7 (↓4.3) | 81.5 (↓5.8) |
| Sculpt Production (kf=0.9) | 73.9 (↓4.8) | 75.1 (↓3.0) | 56.8 (↑1.2) | 47.3 (↓6.4) | 69.8 (↓3.2) | 74.5 (↓12.8) |
| Sculpt Throughput (kf=0.88) | 70.8 (↓7.9) | 74.0 (↓4.1) | 57.2 (↑1.6) | 52.0 (↓1.7) | 70.7 (↓2.3) | 69.6 (↓17.7) |
| Sculpt Experimental (kf=0.82) | 70.2 (↓8.5) | 70.7 (↓7.4) | 53.6 (↓2.0) | 47.6 (↓6.1) | 66.6 (↓6.4) | 54.7 (↓32.6) |
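The per-benchmark deltas in the table can be recomputed directly from the raw scores. A minimal sketch for the Experimental tier:

```python
# Headline scores copied from the table above (Experimental tier vs. baseline).
baseline = {"mmlu": 78.7, "hellaswag": 78.1, "arc_c": 55.6,
            "truthfulqa": 53.7, "winogrande": 73.0, "gsm8k": 87.3}
experimental = {"mmlu": 70.2, "hellaswag": 70.7, "arc_c": 53.6,
                "truthfulqa": 47.6, "winogrande": 66.6, "gsm8k": 54.7}

# Delta = compressed score minus baseline score, rounded to one decimal.
deltas = {k: round(experimental[k] - baseline[k], 1) for k in baseline}
print(deltas["gsm8k"])  # -32.6, the largest regression at kf=0.82

# Mean absolute drop across the six headline benchmarks.
mean_drop = round(sum(abs(v) for v in deltas.values()) / len(deltas), 1)
print(mean_drop)        # 10.5
```

GSM8K dominates the regression; the other five benchmarks average a mid-single-digit drop.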

## This Model vs Baseline

| Benchmark | Experimental | Baseline | Delta |
|---|---|---|---|
| arc_challenge | 53.6 | 55.6 | -2.0 |
| gsm8k | 54.7 | 87.3 | -32.6 |
| hellaswag | 70.7 | 78.1 | -7.4 |
| mmlu | 70.2 | 78.7 | -8.5 |
| mmlu_abstract_algebra | 48.0 | 66.0 | -18.0 |
| mmlu_anatomy | 57.8 | 77.8 | -20.0 |
| mmlu_astronomy | 79.6 | 92.8 | -13.2 |
| mmlu_business_ethics | 70.0 | 82.0 | -12.0 |
| mmlu_clinical_knowledge | 74.0 | 86.8 | -12.8 |
| mmlu_college_biology | 80.6 | 93.1 | -12.5 |
| mmlu_college_chemistry | 53.0 | 59.0 | -6.0 |
| mmlu_college_computer_science | 64.0 | 82.0 | -18.0 |
| mmlu_college_mathematics | 46.0 | 64.0 | -18.0 |
| mmlu_college_medicine | 70.5 | 81.5 | -11.0 |
| mmlu_college_physics | 50.0 | 64.7 | -14.7 |
| mmlu_computer_security | 78.0 | 83.0 | -5.0 |
| mmlu_conceptual_physics | 80.0 | 90.2 | -10.2 |
| mmlu_econometrics | 64.0 | 73.7 | -9.7 |
| mmlu_electrical_engineering | 68.3 | 82.1 | -13.8 |
| mmlu_elementary_mathematics | 66.4 | 80.7 | -14.3 |
| mmlu_formal_logic | 59.5 | 65.9 | -6.4 |
| mmlu_global_facts | 38.0 | 50.0 | -12.0 |
| mmlu_high_school_biology | 85.5 | 93.5 | -8.0 |
| mmlu_high_school_chemistry | 69.0 | 77.8 | -8.8 |
| mmlu_high_school_computer_science | 77.0 | 88.0 | -11.0 |
| mmlu_high_school_european_history | 78.2 | 87.3 | -9.1 |
| mmlu_high_school_geography | 89.4 | 92.4 | -3.0 |
| mmlu_high_school_government_and_politics | 90.7 | 96.9 | -6.2 |
| mmlu_high_school_macroeconomics | 74.6 | 85.9 | -11.3 |
| mmlu_high_school_mathematics | 42.2 | 53.3 | -11.1 |
| mmlu_high_school_microeconomics | 82.4 | 93.3 | -10.9 |
| mmlu_high_school_physics | 55.6 | 72.8 | -17.2 |
| mmlu_high_school_psychology | 89.0 | 93.2 | -4.2 |
| mmlu_high_school_statistics | 73.1 | 78.7 | -5.6 |
| mmlu_high_school_us_history | 82.8 | 90.2 | -7.4 |
| mmlu_high_school_world_history | 85.7 | 89.9 | -4.2 |
| mmlu_human_aging | 69.5 | 78.9 | -9.4 |
| mmlu_human_sexuality | 78.6 | 86.3 | -7.7 |
| mmlu_humanities | 64.4 | 70.5 | -6.1 |
| mmlu_international_law | 83.5 | 90.1 | -6.6 |
| mmlu_jurisprudence | 77.8 | 84.3 | -6.5 |
| mmlu_logical_fallacies | 76.1 | 84.7 | -8.6 |
| mmlu_machine_learning | 59.8 | 66.1 | -6.3 |
| mmlu_management | 80.6 | 86.4 | -5.8 |
| mmlu_marketing | 89.7 | 95.7 | -6.0 |
| mmlu_medical_genetics | 78.0 | 91.0 | -13.0 |
| mmlu_miscellaneous | 80.5 | 90.3 | -9.8 |
| mmlu_moral_disputes | 74.3 | 81.2 | -6.9 |
| mmlu_moral_scenarios | 48.6 | 53.3 | -4.7 |
| mmlu_nutrition | 75.2 | 86.3 | -11.1 |
| mmlu_other | 72.9 | 83.1 | -10.2 |
| mmlu_philosophy | 76.2 | 80.4 | -4.2 |
| mmlu_prehistory | 75.0 | 84.3 | -9.3 |
| mmlu_professional_accounting | 56.0 | 65.6 | -9.6 |
| mmlu_professional_law | 54.9 | 60.3 | -5.4 |
| mmlu_professional_medicine | 76.5 | 91.5 | -15.0 |
| mmlu_professional_psychology | 72.7 | 82.8 | -10.1 |
| mmlu_public_relations | 68.2 | 73.6 | -5.4 |
| mmlu_security_studies | 75.5 | 76.7 | -1.2 |
| mmlu_social_sciences | 79.9 | 87.0 | -7.1 |
| mmlu_sociology | 85.1 | 89.1 | -4.0 |
| mmlu_stem | 66.5 | 78.3 | -11.8 |
| mmlu_us_foreign_policy | 84.0 | 90.0 | -6.0 |
| mmlu_virology | 51.8 | 56.6 | -4.8 |
| mmlu_world_religions | 77.8 | 86.5 | -8.7 |
| truthfulqa_mc2 | 47.6 | 53.7 | -6.1 |
| winogrande | 66.6 | 73.0 | -6.4 |

## Performance

| Metric | Sculpt | Baseline | Change |
|---|---|---|---|
| Model size | 15.1 GB | 16.7 GB | -9.6% |
| Parameters | 8,098,165,248 | | |
| Prefill throughput | 5,803 tok/s | 4,566 tok/s | +27% |
| Decode throughput | 36 tok/s | 37 tok/s | -4% |

KV-cache footprint is unchanged — Sculpt only compresses FFN layers, not attention.
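The headline 1.27x prefill figure and the size reduction follow directly from the measured numbers above. A quick sanity check:

```python
# Values copied from the Performance table.
prefill_sculpt, prefill_base = 5803, 4566  # tokens/s
size_sculpt, size_base = 15.1, 16.7        # GB

speedup = prefill_sculpt / prefill_base
print(f"{speedup:.2f}x")   # 1.27x prefill, matching the tier description

reduction = 1 - size_sculpt / size_base
print(f"{reduction:.1%}")  # 9.6% smaller on disk
```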

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "dystrio/Qwen3.5-9B-Sculpt-Experimental",
    torch_dtype="bfloat16",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("dystrio/Qwen3.5-9B-Sculpt-Experimental")

inputs = tokenizer("The future of AI inference is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## All Sculpt Tiers

| Tier | HuggingFace | Config | Use Case |
|---|---|---|---|
| Default | dystrio/Qwen3.5-9B-Sculpt-Default | kf=0.95 | Enterprise — maximum quality preservation |
| Production | dystrio/Qwen3.5-9B-Sculpt-Production | kf=0.9 | Enterprise — balanced quality and efficiency |
| Throughput | dystrio/Qwen3.5-9B-Sculpt-Throughput | kf=0.88 | Local/throughput — speed sweet spot (1.25x prefill) |
| Experimental | dystrio/Qwen3.5-9B-Sculpt-Experimental | kf=0.82 | Local — maximum compression (1.27x prefill) |

## Technical Details

- Method: Structural FFN pruning with importance-aware block selection + live teacher distillation (alpha=0.5)
- Keep fraction: 0.82 (18% of FFN neurons removed)
- Repair: 8-stage cosine-LR fine-tuning with best-checkpoint restore
- Training data: general_v2 mixture (WikiText, OpenHermes 2.5, MMLU, HellaSwag, GSM8K, OpenOrca)
- Hardware: 1x NVIDIA H200 141GB
- Output: Standard dense transformer — loads with any HuggingFace-compatible framework
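A back-of-the-envelope consistency check (an estimate, not a figure published with Sculpt): removing 18% of FFN neurons while total size drops only 9.6% implies the FFN layers hold roughly half the model's weights.

```python
# Estimate the FFN share of total parameters implied by the numbers above.
# Rough estimate only; the real split depends on the architecture details.
keep_fraction = 0.82     # kf for the Experimental tier
total_reduction = 0.096  # 16.7 GB -> 15.1 GB

# total_reduction = (1 - keep_fraction) * ffn_share  =>  solve for ffn_share
ffn_share = total_reduction / (1 - keep_fraction)
print(f"{ffn_share:.0%}")  # ~53% of weights sit in the pruned FFN layers
```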

## Compatibility

- HuggingFace Transformers
- vLLM
- TGI (Text Generation Inference)
- llama.cpp / GGUF conversion
- AWQ / GPTQ quantization
- Any framework that loads standard safetensors
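Because the output is a standard dense safetensors checkpoint, serving it should work with the usual vLLM entry point; a minimal deployment sketch (untested against this checkpoint, flags illustrative):

```shell
# Serve the Sculpt checkpoint with vLLM's OpenAI-compatible server.
# --dtype bfloat16 matches the published tensor type.
vllm serve dystrio/Qwen3.5-9B-Sculpt-Experimental --dtype bfloat16
```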

## Citation

```bibtex
@misc{dystrio_sculpt_2026,
  title={Dystrio Sculpt: Structural Compilation for Transformer LLMs},
  author={Dystrio},
  year={2026},
  url={https://huggingface.co/dystrio}
}
```