Pull requests: huggingface/trl


Pull requests list

[GRPO VLM] Update split sizes to generalize
#4032 opened Sep 8, 2025 by zucchini-nlp
Enable XPU for vllm client
#4031 opened Sep 8, 2025 by jiqing-feng Draft
Add missing trainer docstrings
#4030 opened Sep 8, 2025 by albertvillanova
vllm sleep mode support
#4028 opened Sep 8, 2025 by ved1beta
2 of 5 tasks
[doc] Group paper index by trainer
#4027 opened Sep 7, 2025 by LeonEricsson
1 of 5 tasks
Fix passing model kwargs
#4019 opened Sep 5, 2025 by qgallouedec
Fix: undefined current_gradient_accumulation_steps
#4014 opened Sep 5, 2025 by ysjprojects
2 of 5 tasks
Improve typing of SFT trainer
#4007 opened Sep 4, 2025 by cyyever
✨ Improve SFT doc
#4005 opened Sep 4, 2025 by qgallouedec
5 tasks
Fix: Make sft script work when chat template is None
#3995 opened Sep 2, 2025 by rabinadk1
1 of 5 tasks
[docs] add CP docs
#3994 opened Sep 2, 2025 by kashif
[GFPO]: implement GFPO in GRPOTrainer
#3989 opened Sep 1, 2025 by Peter-Chou
3 of 5 tasks
Dft
#3960 opened Aug 27, 2025 by 1485840691
5 tasks
fix bug when using dataset streaming by accelerate
#3950 opened Aug 25, 2025 by kaixuanliu
Docker update
#3931 opened Aug 20, 2025 by qgallouedec
5 tasks
[SFTTrainer]: Check for assistant mask up to max_length
#3930 opened Aug 20, 2025 by pramodith
3 of 5 tasks
[DRAFT] Refactor DPO
#3906 opened Aug 15, 2025 by qgallouedec Draft
5 tasks
Test in distributed setting
#3902 opened Aug 15, 2025 by qgallouedec
5 tasks
BEMA for ref model
#3898 opened Aug 14, 2025 by qgallouedec
5 tasks