
Feature Request: Save/Load Precomputed Ref Log-Probabilities in DPOTrainer #3985

@ginkyenglee

Feature request

Save/Load Precomputed Ref Log-Probabilities in DPOTrainer

Motivation

Currently, when precompute_ref_log_probs=True, DPOTrainer recomputes the reference model log-probs (ref_chosen_logps, ref_rejected_logps) for the training and evaluation datasets on every run.
This step can be very time-consuming, and repeated experiments on the same dataset/model setup redo exactly the same computation.

It would be highly beneficial to have an option to cache precomputed reference log-probs to disk and reload them later, avoiding unnecessary recomputation.

Your contribution

Introduce two new arguments:

  • save_ref_logps_dir: (optional) directory path where precomputed log-probs will be stored
  • load_ref_logps_dir: (optional) directory path to load precomputed log-probs from
    • When provided, the trainer checks dataset fingerprint, number of rows, and model/tokenizer info to ensure compatibility
    • If they match, cached values are loaded instead of recomputing
    • If they differ, a message is logged, the cache is ignored, and recomputation proceeds as usual
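
To make the proposed behavior concrete, here is a minimal sketch of the save/load logic, written independently of the trainer. Everything in it is an assumption for illustration: the helper names `save_ref_logps` and `load_ref_logps`, the `ref_logps.json` file name, the metadata keys, and the JSON storage format are hypothetical and not part of TRL. A real implementation would likely store tensors or dataset columns instead of JSON lists.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

CACHE_FILE = "ref_logps.json"  # hypothetical file name, not a TRL convention


def save_ref_logps(cache_dir, meta, ref_chosen_logps, ref_rejected_logps):
    """Write the log-probs together with the metadata used to validate them."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    payload = {
        "meta": meta,
        "ref_chosen_logps": list(ref_chosen_logps),
        "ref_rejected_logps": list(ref_rejected_logps),
    }
    (path / CACHE_FILE).write_text(json.dumps(payload))


def load_ref_logps(cache_dir, meta):
    """Return (chosen, rejected) if the cache matches `meta`, otherwise None."""
    file = Path(cache_dir) / CACHE_FILE
    if not file.is_file():
        logger.info("No reference log-prob cache at %s; recomputing.", file)
        return None
    payload = json.loads(file.read_text())
    if payload.get("meta") != meta:
        # Mismatch in fingerprint, row count, or model/tokenizer info:
        # log it, ignore the cache, and let the caller recompute.
        logger.info("Cache metadata mismatch in %s; recomputing.", file)
        return None
    return payload["ref_chosen_logps"], payload["ref_rejected_logps"]


if __name__ == "__main__":
    import tempfile

    # Assumed metadata layout; the values here are placeholders.
    meta = {
        "dataset_fingerprint": "abc123",
        "num_rows": 2,
        "model": "example-model",
        "tokenizer": "example-tokenizer",
    }
    cache_dir = tempfile.mkdtemp()
    save_ref_logps(cache_dir, meta, [-1.5, -2.0], [-3.0, -4.25])
    print(load_ref_logps(cache_dir, meta))                      # cache hit
    print(load_ref_logps(cache_dir, {**meta, "num_rows": 3}))   # mismatch
```

In the trainer, a `None` result would fall through to the existing precomputation path, so a stale or missing cache never changes training results, only startup time.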

If the idea sounds reasonable, I’d be happy to open a PR to implement it. Any feedback or suggestions would be greatly appreciated!
