Sangwoo Park

Sangsang

swgger

AI & ML interests

I do LLM post-training research (KAIST AI)

Recent Activity

upvoted a paper 5 days ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

upvoted a paper 5 days ago

Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents

updated a model 6 days ago

Sangsang/feedback_asymmetric_kl_fixed_ema_Qwen3-14B_bw0p5_fw0p5_ema0p999_ep30

Organizations

None yet

upvoted 2 papers 5 days ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published 7 days ago • 82

Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents

Paper • 2604.14004 • Published 6 days ago • 29

updated a model 6 days ago

Sangsang/feedback_asymmetric_kl_fixed_ema_Qwen3-14B_bw0p5_fw0p5_ema0p999_ep30

Text Generation • Updated 6 days ago • 13

published a model 6 days ago

Sangsang/feedback_asymmetric_kl_fixed_ema_Qwen3-14B_bw0p5_fw0p5_ema0p999_ep30

Text Generation • Updated 6 days ago • 13

updated a model 8 days ago

Sangsang/grpo_Qwen3-0.6B_bs16_g16_mb128_lr1e-6_b1e-3_clip0p2_temp0p7_ep30

Text Generation • Updated 8 days ago • 11

published a model 8 days ago

Sangsang/grpo_Qwen3-0.6B_bs16_g16_mb128_lr1e-6_b1e-3_clip0p2_temp0p7_ep30

Text Generation • Updated 8 days ago • 11

updated a model 9 days ago

Sangsang/feedback_asymmetric_fixed_ema_Llama-3.1-8B-Instruct_bw0p5_fw0p5_ema0p999_ep30_v2

Text Generation • Updated 9 days ago • 18

published a model 9 days ago

Sangsang/feedback_asymmetric_fixed_ema_Llama-3.1-8B-Instruct_bw0p5_fw0p5_ema0p999_ep30_v2

Text Generation • Updated 9 days ago • 18

updated a model 10 days ago

Sangsang/ci-sft_Qwen3-4B

Text Generation • Updated 10 days ago • 17

published a model 10 days ago

Sangsang/ci-sft_Qwen3-4B

Text Generation • Updated 10 days ago • 17

updated a dataset 10 days ago

Sangsang/CI-Qwen3-32B-Instruct-Augmented-Responses

Viewer • Updated 10 days ago • 729 • 30

published a dataset 10 days ago

Sangsang/CI-Qwen3-32B-Instruct-Augmented-Responses

Viewer • Updated 10 days ago • 729 • 30

updated a model 11 days ago

Sangsang/ci-sft_Qwen3-4B-Instruct-2507_lr1e-6_ep30

Text Generation • Updated 11 days ago • 14

published a model 11 days ago

Sangsang/ci-sft_Qwen3-4B-Instruct-2507_lr1e-6_ep30

Text Generation • Updated 11 days ago • 14

updated 2 datasets 11 days ago

Sangsang/CI-Qwen3-32B-Augmented-Responses

Viewer • Updated 10 days ago • 729 • 56

Sangsang/CI-Qwen2.5-32B-Instruct-Augmented-Responses

Viewer • Updated 10 days ago • 729 • 37

published a dataset 13 days ago

Sangsang/CI-Qwen3-32B-Augmented-Responses

Viewer • Updated 10 days ago • 729 • 56

updated a dataset 13 days ago

Sangsang/ContextualIntegritySyntheticDataset_Qwen3-32B_all

Viewer • Updated 13 days ago • 729 • 59

published a dataset 13 days ago

Sangsang/ContextualIntegritySyntheticDataset_Qwen3-32B_all

Viewer • Updated 13 days ago • 729 • 59

updated a model 13 days ago

Sangsang/qwen3_onpolicy

Text Generation • Updated 13 days ago • 34
