Pruning and Distilling Mixture-of-Experts into Dense Language Models
Submitted by Junhyuck Kim
KRAFTON