Content-Aware Sparsity
Status: Active
Dates:
Stealth — details after publication
The work behind the work.
Drafting, training, debugging. Each one is trying to surface.
Dates:
Stealth — details after publication
Stalled, withdrawn, or didn't survive review. Each one fed the next.
Dates:
We argue that layer redundancy in deep Transformers is caused by a structural information bottleneck along the gradient's path, not by the gradient's raw magnitude. Causal interventions validate this mechanism, which we then leverage to build a superior pruning method and a more efficient, tapered architecture.
Dates:
A self-guided reinforcement-learning framework that improves chain-of-thought arithmetic reasoning in LLMs by treating self-logicality across sampled rationales as the reward signal — no human-graded supervision required.
Dates:
A multiple-pivoting method that improves the quality of LLM output on low-resource languages, paired with a thorough analysis to better understand how LLMs work.