Feature request
Save/Load Precomputed Ref Log-Probabilities in DPOTrainer
Motivation
Currently, when precompute_ref_log_probs=True, DPOTrainer always recomputes the reference model log-probs (ref_chosen_logps, ref_rejected_logps) for the training and evaluation datasets.
This process can be very time-consuming, and for repeated experiments on the same dataset/model setup, it leads to redundant computation.
It would be highly beneficial to have an option to cache precomputed reference log-probs to disk and reload them later, avoiding unnecessary recomputation.
Your contribution
Introduce two new arguments:
- save_ref_logps_dir: (optional) directory path where the precomputed log-probs will be stored
- load_ref_logps_dir: (optional) directory path to load precomputed log-probs from
Proposed behavior (a rough sketch follows this list):
- When load_ref_logps_dir is provided, the trainer checks the dataset fingerprint, number of rows, and model/tokenizer info to ensure compatibility
- If they match, the cached values are loaded instead of being recomputed
- If they differ, a log message is printed, the cache is ignored, and recomputation proceeds as usual
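To make the idea concrete, here is a minimal sketch of what the save/load helpers could look like, assuming the cached columns are written with the datasets library's save_to_disk/load_from_disk plus a small JSON metadata file for the compatibility check. The column names ref_chosen_logps/ref_rejected_logps mirror the ones DPOTrainer already adds during precomputation; the helper names, the metadata layout, and the plain print logging here are purely illustrative, not a proposed final API.

```python
import json
import os
from typing import Optional

from datasets import Dataset, load_from_disk


def save_ref_logps(dataset: Dataset, save_dir: str, metadata: dict) -> None:
    """Persist the precomputed ref log-prob columns plus a metadata file.

    `metadata` would carry whatever the compatibility check needs, e.g. the
    dataset fingerprint, the number of rows, and model/tokenizer identifiers.
    """
    os.makedirs(save_dir, exist_ok=True)
    # Only the two precomputed columns need to be cached; everything else can
    # be rebuilt from the original dataset.
    cache = dataset.select_columns(["ref_chosen_logps", "ref_rejected_logps"])
    cache.save_to_disk(os.path.join(save_dir, "ref_logps"))
    with open(os.path.join(save_dir, "metadata.json"), "w") as f:
        json.dump(metadata, f)


def load_ref_logps(dataset: Dataset, load_dir: str, metadata: dict) -> Optional[Dataset]:
    """Return `dataset` with the cached columns attached, or None on mismatch."""
    meta_path = os.path.join(load_dir, "metadata.json")
    if not os.path.isfile(meta_path):
        return None
    with open(meta_path) as f:
        cached_meta = json.load(f)
    if cached_meta != metadata:
        print("Cached ref log-probs do not match the current setup; recomputing.")
        return None
    cache = load_from_disk(os.path.join(load_dir, "ref_logps"))
    if len(cache) != len(dataset):
        print("Row count mismatch; ignoring the cached ref log-probs.")
        return None
    dataset = dataset.add_column("ref_chosen_logps", cache["ref_chosen_logps"])
    dataset = dataset.add_column("ref_rejected_logps", cache["ref_rejected_logps"])
    return dataset
```

Inside the trainer, load_ref_logps_dir would be consulted before the existing precomputation pass and save_ref_logps_dir right after it, so the default behavior stays unchanged when neither argument is set.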
If the idea sounds reasonable, I’d be happy to open a PR to implement it. Any feedback or suggestions would be greatly appreciated!