Description
Reproduction
I am trying to fine-tune Gemma3 4B with DPOTrainer. My dataset is properly formatted with "prompt", "chosen", "rejected", and "images" columns. Since Gemma3 4B is a vision model and I passed in a processor instead of a tokenizer, the trainer should call process_row instead of tokenize_row to avoid the issue below.
However, the source code in dpo_trainer.py shows that the choice between tokenizing and processing is made by this check:
self.is_vision_model = model.config.model_type in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES.keys()
MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES does not contain gemma3 (it does contain paligemma), so DPOTrainer treats the processor I provided as a tokenizer instead of calling process_row to prepare the dataset internally.
I'm not sure if this is intentional or a bug. Perhaps the check could use MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES instead, since it contains more recent VLMs like gemma3?
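To make the suggestion concrete, here is a minimal sketch of the membership check, assuming the fix is simply to consult both mappings. The helper name is_vision_model and the two small dictionaries are hypothetical stand-ins for illustration; they are not TRL code and not the full transformers mappings, and they only mirror what is described above (gemma3 missing from the first mapping, present in the second).

```python
from typing import Mapping


def is_vision_model(model_type: str, *mappings: Mapping[str, str]) -> bool:
    """Return True if model_type is a key in any of the given mappings."""
    return any(model_type in mapping for mapping in mappings)


# Abbreviated stand-ins (assumption): the real dictionaries live in
# transformers.models.auto.modeling_auto and contain many more entries.
VISION_2_SEQ_NAMES = {
    "paligemma": "PaliGemmaForConditionalGeneration",
}
IMAGE_TEXT_TO_TEXT_NAMES = {
    "paligemma": "PaliGemmaForConditionalGeneration",
    "gemma3": "Gemma3ForConditionalGeneration",
}

# Check against the first mapping only, as in the line quoted above.
current_check = is_vision_model("gemma3", VISION_2_SEQ_NAMES)

# Proposed check: also consult the image-text-to-text mapping.
proposed_check = is_vision_model(
    "gemma3", VISION_2_SEQ_NAMES, IMAGE_TEXT_TO_TEXT_NAMES
)

print(current_check, proposed_check)
```

With these stand-ins, the first check misses gemma3 and the second one finds it. Whether the trainer should check one mapping, the other, or both is a design decision for the maintainers.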
Apologies if I'm using DPOTrainer incorrectly with Gemma3; I could not find any notebook example showing how to use it with a vision dataset and Gemma.
from unsloth import FastVisionModel
from transformers import AutoProcessor
from trl import DPOConfig, DPOTrainer

model, _ = FastVisionModel.from_pretrained(
    "unsloth/gemma-3-4b-it-unsloth-bnb-4bit",
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)
processor = AutoProcessor.from_pretrained(
    "unsloth/gemma-3-4b-it-unsloth-bnb-4bit", trust_remote_code=True
)
...
dpo_args = DPOConfig(...)
trainer = DPOTrainer(
    model=model,
    args=dpo_args,
    train_dataset=formatted_dataset,
    processing_class=processor,
)
Output:
AttributeError Traceback (most recent call last)
/tmp/ipython-input-2298813859.py in <cell line: 0>()
1 # Unsloth's version of DPOTrainer accepts a `processing_class` argument
2 # which we pass the main processor to.
----> 3 trainer = DPOTrainer(
4 model=model,
5 args=dpo_args,
13 frames
/content/unsloth_compiled_cache/UnslothDPOTrainer.py in tokenize_row()
1028 if tokenizer.eos_token_id is not None:
1029 prompt_input_ids = prompt_input_ids + [tokenizer.eos_token_id]
-> 1030 chosen_input_ids = chosen_input_ids + [tokenizer.eos_token_id]
1031 rejected_input_ids = rejected_input_ids + [tokenizer.eos_token_id]
1032
AttributeError: 'Gemma3Processor' object has no attribute 'eos_token_id'
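The traceback shows tokenize_row reading eos_token_id directly from the object passed as processing_class. As a rough illustration of why this fails for a processor, here is a sketch that assumes a processor exposes its wrapped tokenizer as a .tokenizer attribute. The helper names and the dummy classes are hypothetical and exist only for this example; this is not how TRL or Unsloth resolve the tokenizer, and it would not replace calling process_row, since the images would still go unprocessed.

```python
from typing import Any, Optional


def resolve_tokenizer(processing_class: Any) -> Any:
    """Return the wrapped tokenizer for a processor, else the object itself."""
    # Assumption: processors keep their tokenizer in a `.tokenizer` attribute.
    return getattr(processing_class, "tokenizer", processing_class)


def get_eos_token_id(processing_class: Any) -> Optional[int]:
    """Look up eos_token_id on the resolved tokenizer, or None if absent."""
    return getattr(resolve_tokenizer(processing_class), "eos_token_id", None)


class DummyTokenizer:
    """Stand-in tokenizer with a made-up EOS id for the example."""

    eos_token_id = 1


class DummyProcessor:
    """Stand-in processor: no eos_token_id of its own, only a tokenizer."""

    def __init__(self) -> None:
        self.tokenizer = DummyTokenizer()


processor = DummyProcessor()

# Direct attribute access is what raises AttributeError in the traceback.
print(hasattr(processor, "eos_token_id"))

# Going through the wrapped tokenizer finds the id instead.
print(get_eos_token_id(processor))
```

This only helps confirm that the processor is being handled as a plain tokenizer; the underlying question of which code path DPOTrainer selects remains.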
System Info
- Platform: Linux-6.1.123+-x86_64-with-glibc2.35
- Python version: 3.12.11
- TRL version: 0.22.1
- PyTorch version: 2.8.0+cu126
- accelerator(s): NVIDIA L4
- Transformers version: 4.55.4
- Accelerate version: 1.10.1
- Accelerate config: not found
- Datasets version: 3.6.0
- HF Hub version: 0.34.4
- bitsandbytes version: 0.45.3
- DeepSpeed version: not installed
- Diffusers version: 0.35.1
- Liger-Kernel version: not installed
- LLM-Blender version: not installed
- OpenAI version: 1.101.0
- PEFT version: 0.17.1
- vLLM version: not installed
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
- Any traceback provided is complete