[Bug]: Qwen2.5: Sliding window for some but all layers is not supported. This model uses sliding window but max_window_layers = 28 is less than num_hidden_layers = 28. Please open an issue to discuss this feature. #15705

@Martin7-1

Description

Your current environment

For some reason, I can't run collect_env.py in my env. Sorry about that. :) But I'm sure this problem has nothing to do with the environment.

My environment:

vllm: 0.7.3
cuda: 12.4
transformers: 4.50.1
trl: 0.15.2

🐛 Describe the bug

My code using vllm:

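    from vllm import LLM, SamplingParams

    # model_name_or_path, tensor_parallel_size, batch_size, max_model_len,
    # temperature, max_tokens and prompts are defined elsewhere in my script.
    llm = LLM(model=model_name_or_path,
              dtype="float16",
              tensor_parallel_size=tensor_parallel_size,
              max_num_seqs=batch_size,
              # -1 means "use the model's default context length"
              max_model_len=None if max_model_len == -1 else max_model_len,
              gpu_memory_utilization=0.9)
    sampling_params = SamplingParams(temperature=temperature, max_tokens=max_tokens)

    outputs = llm.generate(prompts, sampling_params)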

When I try to generate a response from Qwen2.5-7B-Instruct, I get a ValueError raised by this line:

Sliding window for some but all layers is not supported. This model uses sliding window but `max_window_layers` = 28 is less than `num_hidden_layers` = 28. Please open an issue to discuss this feature.

The model I used was fine-tuned with the trl library and flash-attention 2, with sliding window enabled.
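For anyone trying to reproduce this, here is a minimal sketch for printing the sliding-window fields that the error message refers to. It assumes the fine-tuned checkpoint is a local directory containing the usual Hugging Face `config.json`. The helper name `read_sliding_window_settings` is made up for illustration, and the four keys are the ones Qwen2 configs normally carry; a key missing from the file is reported as `None`.

```python
import json
from pathlib import Path

# Keys in a Qwen2-style config.json that relate to sliding-window attention.
SLIDING_WINDOW_KEYS = (
    "use_sliding_window",
    "sliding_window",
    "max_window_layers",
    "num_hidden_layers",
)


def read_sliding_window_settings(model_dir):
    """Return the sliding-window related fields from <model_dir>/config.json.

    Hypothetical helper for illustration only. Keys that are absent from
    the file map to None.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return {key: config.get(key) for key in SLIDING_WINDOW_KEYS}


# Example usage (the path is a placeholder for a local checkpoint directory):
# print(read_sliding_window_settings("/path/to/fine-tuned-model"))
```

Comparing this output for the base model and the fine-tuned checkpoint should show whether fine-tuning changed any of these fields.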

It looks like there is a TODO tag on this line. Does it make sense?

I'm curious why everything works fine when I train the model with trl and vllm, but vllm throws this ValueError when I try to run prediction with the fine-tuned model.
