DPOTrainer uses tokenizer instead of processor for Gemma3 vision #3982

@supreme-gg-gg

Description

Reproduction

I am trying to fine-tune Gemma3 4B with DPOTrainer. My dataset is properly formatted with "prompt", "chosen", "rejected", and "images" columns. Since Gemma3 4B is a vision model and I passed in a processor instead of a tokenizer, the trainer should call process_row instead of tokenize_row, which would avoid the error below.

However, I saw in the source code of dpo_trainer.py that the tokenize-vs-process choice is made based on this check:

self.is_vision_model = model.config.model_type in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES.keys()

However, MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES does not contain gemma3 (it has paligemma), so DPOTrainer treats the processor I provided as a tokenizer instead of calling the proper process_row function to prepare the dataset internally.

I'm not sure whether this is intentional or a bug. Perhaps the check could use MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES instead, since it contains more recent VLMs such as gemma3?
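To make the proposal concrete, here is a minimal sketch of the suggested check. The two mapping dicts below are hypothetical stand-ins (the real ones live in transformers.models.auto.modeling_auto and are much larger); only the membership logic is the point.

```python
# Hypothetical stand-ins for the two transformers auto-model mappings;
# the real dicts are in transformers.models.auto.modeling_auto.
MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES = {
    "paligemma": "PaliGemmaForConditionalGeneration",
}
MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES = {
    "paligemma": "PaliGemmaForConditionalGeneration",
    "gemma3": "Gemma3ForConditionalGeneration",
}

def is_vision_model(model_type: str) -> bool:
    # Proposed: also consult the image-text-to-text mapping, so newer VLMs
    # such as gemma3 get routed to process_row instead of tokenize_row.
    return (
        model_type in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES
        or model_type in MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES
    )

# The current check misses gemma3; the extended one catches it.
print("gemma3" in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES)  # False
print(is_vision_model("gemma3"))                         # True
```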

Apologies if I'm misusing DPOTrainer with Gemma3; I could not find any notebook example showing how to use it with a vision dataset and Gemma.

from unsloth import FastVisionModel
from transformers import AutoProcessor
from trl import DPOConfig, DPOTrainer

model, _ = FastVisionModel.from_pretrained(
    "unsloth/gemma-3-4b-it-unsloth-bnb-4bit",
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)
processor = AutoProcessor.from_pretrained(
    "unsloth/gemma-3-4b-it-unsloth-bnb-4bit", trust_remote_code=True
)
...
dpo_args = DPOConfig(...)
trainer = DPOTrainer(
    model=model,
    args=dpo_args,
    train_dataset=formatted_dataset,
    processing_class=processor,
)

outputs:

AttributeError                            Traceback (most recent call last)
[/tmp/ipython-input-2298813859.py](https://localhost:8080/#) in <cell line: 0>()
      1 # Unsloth's version of DPOTrainer accepts a `processing_class` argument
      2 # which we pass the main processor to.
----> 3 trainer = DPOTrainer(
      4     model=model,
      5     args=dpo_args,

13 frames
[/content/unsloth_compiled_cache/UnslothDPOTrainer.py](https://localhost:8080/#) in tokenize_row()
   1028             if tokenizer.eos_token_id is not None:
   1029                 prompt_input_ids = prompt_input_ids + [tokenizer.eos_token_id]
-> 1030         chosen_input_ids = chosen_input_ids + [tokenizer.eos_token_id]
   1031         rejected_input_ids = rejected_input_ids + [tokenizer.eos_token_id]
   1032 

AttributeError: 'Gemma3Processor' object has no attribute 'eos_token_id'
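The AttributeError follows from the processor wrapping a tokenizer rather than exposing its attributes: eos_token_id lives on the inner tokenizer object. A minimal stand-in illustrating the failing lookup (FakeProcessor/FakeTokenizer are hypothetical, not real transformers classes):

```python
# Hypothetical stand-ins showing why tokenize_row fails when handed a
# processor instead of a tokenizer.
class FakeTokenizer:
    eos_token_id = 1  # tokenizers expose this attribute directly

class FakeProcessor:
    def __init__(self):
        self.tokenizer = FakeTokenizer()  # processors wrap a tokenizer

proc = FakeProcessor()
try:
    proc.eos_token_id  # what tokenize_row effectively attempts
except AttributeError as e:
    print(e)  # 'FakeProcessor' object has no attribute 'eos_token_id'

# The id is reachable, but only via the inner tokenizer:
print(proc.tokenizer.eos_token_id)  # 1
```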

System Info

  • Platform: Linux-6.1.123+-x86_64-with-glibc2.35
  • Python version: 3.12.11
  • TRL version: 0.22.1
  • PyTorch version: 2.8.0+cu126
  • accelerator(s): NVIDIA L4
  • Transformers version: 4.55.4
  • Accelerate version: 1.10.1
  • Accelerate config: not found
  • Datasets version: 3.6.0
  • HF Hub version: 0.34.4
  • bitsandbytes version: 0.45.3
  • DeepSpeed version: not installed
  • Diffusers version: 0.35.1
  • Liger-Kernel version: not installed
  • LLM-Blender version: not installed
  • OpenAI version: 1.101.0
  • PEFT version: 0.17.1
  • vLLM version: not installed

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete

Labels

🏋 DPO (Related to DPO), 🐛 bug (Something isn't working)
