huggingface / trl Public

generated from fastai/nbdev_template

Pull requests: huggingface/trl

74 Open 1,901 Closed

Pull requests list

[GRPO VLM] Update split sizes to generalize

#4032 opened Sep 8, 2025 by zucchini-nlp

Enable XPU for vllm client

#4031 opened Sep 8, 2025 by jiqing-feng • Draft

Add missing trainer docstrings

#4030 opened Sep 8, 2025 by albertvillanova

vllm sleep mode support

#4028 opened Sep 8, 2025 by ved1beta

2 of 5 tasks

[doc] Group paper index by trainer

#4027 opened Sep 7, 2025 by LeonEricsson

1 of 5 tasks

Made ref_model as None in PPO trainer for refined args

#4024 opened Sep 7, 2025 by complete-dope

Fix #3982: Fix DPO Trainer support for Gemma 3 vision models

#4022 opened Sep 6, 2025 by akshay-babbar

Fix passing model kwargs

#4019 opened Sep 5, 2025 by qgallouedec

Fix: undefined current_gradient_accumulation_steps

#4014 opened Sep 5, 2025 by ysjprojects

2 of 5 tasks

Fix: ignore precompute_ref_log_probs when use_liger_loss=True

#4008 opened Sep 4, 2025 by ginkyenglee

5 tasks

Improve typing of SFT trainer

#4007 opened Sep 4, 2025 by cyyever

⚖️ Align SFT and DPO for model creation and deprecate DPOConfig.padding_value in favour of pad_token_id

#4006 opened Sep 4, 2025 by qgallouedec

5 tasks

✨ Improve SFT doc

#4005 opened Sep 4, 2025 by qgallouedec

5 tasks

Remove attention mask when position ids is returned

#3997 opened Sep 2, 2025 by qgallouedec • Draft

Fix: Make sft script work when chat template is None

#3995 opened Sep 2, 2025 by rabinadk1

1 of 5 tasks

[docs] add CP docs

#3994 opened Sep 2, 2025 by kashif

[GFPO]: implement GFPO in GRPOTrainer

#3989 opened Sep 1, 2025 by Peter-Chou

3 of 5 tasks

Enable saving and loading precomputed reference log probabilities in …

#3986 opened Sep 1, 2025 by ginkyenglee

3 tasks

Dft

#3960 opened Aug 27, 2025 by 1485840691

5 tasks

fix bug when using dataset streaming by accelerate

#3950 opened Aug 25, 2025 by kaixuanliu

Docker update

#3931 opened Aug 20, 2025 by qgallouedec

5 tasks

[SFTTrainer]: Check for assistant mask up to max_length

#3930 opened Aug 20, 2025 by pramodith

3 of 5 tasks

[DRAFT] Refactor DPO

#3906 opened Aug 15, 2025 by qgallouedec • Draft

5 tasks

Test in distributed setting

#3902 opened Aug 15, 2025 by qgallouedec

5 tasks

BEMA for ref model

#3898 opened Aug 14, 2025 by qgallouedec

5 tasks
