Pull requests: THUDM/slime

Pull requests list

Add rollout sampling-mask support
#1795 opened Apr 2, 2026 by yitianlian
Hook proposal
#1774 opened Mar 27, 2026 by andrija-s Draft
[Fix] Initialize grad_norm before found_inf skip path
#1762 opened Mar 24, 2026 by kaysonyu
feat: add npu patch for qwen3-vl-8b grpo & ppo
#1750 opened Mar 23, 2026 by cjy0x
[docker] fix qwen3_vl visual module loading
#1727 opened Mar 15, 2026 by ZHZisZZ
Add Mooncake Backend for Rollout Data Transfer (label: run-ci-megatron)
#1709 opened Mar 11, 2026 by zxpdemonio
6 tasks done
fix: make ray actor gpu fractions configurable
#1699 opened Mar 10, 2026 by ailuntz
fix: accept unboxed math answers
#1698 opened Mar 10, 2026 by ailuntz
fix: default reward for aborted samples
#1697 opened Mar 10, 2026 by ailuntz
fix: handle missing sglang cuda-graph constant
#1696 opened Mar 10, 2026 by ailuntz
PipelineRL -- keep cache on weight update
#1694 opened Mar 9, 2026 by hari-hm
fix: normalize rewards per-group when sample counts are unequal
#1655 opened Mar 2, 2026 by dubin555
2 of 3 tasks
feat: Add knowledge distillation example with offline support
#1654 opened Mar 2, 2026 by tourzhao
3 tasks
Refactor code safety checks by removing patterns
#1643 opened Feb 28, 2026 by Rohan5commit