
Conversation


@shuminghu shuminghu commented Sep 5, 2025

`@auto_docstring` on PerceptionLMForConditionalGeneration's forward would pin the method's doc in the modeling file to a stale version, even if the doc was manually updated in the modular file.

Removing it and then running

 python utils/modular_model_converter.py --files_to_parse src/transformers/models/perception_lm/modular_perception_lm.py

updated the example in the modeling file's doc.

@zucchini-nlp

@@ -317,7 +317,6 @@ def prepare_inputs_for_generation(
return model_inputs

@can_return_tuple
@auto_docstring
zucchini-nlp (Member):
we still need to keep auto docstring, it will add docs about forward args

shuminghu (Contributor, Author) commented Sep 8, 2025:

`@auto_docstring` makes the modeling file's doc stale (it doesn't pick up the doc change in the modular file).
Maybe a simple workaround is to remove it, convert, then add it back.

zucchini-nlp (Member) commented Sep 8, 2025:

What do you mean? Does modular not copy the example snippet if we keep it? We can't delete it from the modeling file either, so it is not about modular, and we can't add it back :)

"content": [
    {
        "type": "image",
        "url": test_image_file,
zucchini-nlp (Member):

No need to download; we can put the image link here (https://huggingface.co/datasets/shumingh/perception_lm_test_images/resolve/main/14496_0.PNG) and it will also work.
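A hedged sketch of what this suggestion might look like in the test: the URL is the one cited above, while the variable names and prompt text are illustrative, not taken from the PR.

```python
# Sketch of the suggested change: pass the image URL directly in the chat
# message instead of downloading the file first; the processor's chat
# template can fetch the image from the URL itself.
test_image_url = (
    "https://huggingface.co/datasets/shumingh/perception_lm_test_images"
    "/resolve/main/14496_0.PNG"
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": test_image_url},  # URL, not a local file
            {"type": "text", "text": "Describe this image."},  # illustrative prompt
        ],
    }
]

# With a real processor this would then go through apply_chat_template, e.g.:
#   inputs = processor.apply_chat_template(messages, tokenize=True, return_tensors="pt")
print(messages[0]["content"][0]["url"])
```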

shuminghu (Contributor, Author):

OK, let me test this out. I've already left Meta, so let me find a GPU somewhere lol.


shuminghu commented Sep 8, 2025 via email

zucchini-nlp (Member):

That is weird; a similar pattern works fine with other models, e.g.:

@can_return_tuple
@auto_docstring
def forward(
    self,
    input_ids: Optional[torch.LongTensor] = None,
    pixel_values: Optional[torch.FloatTensor] = None,
    pixel_mask: Optional[torch.LongTensor] = None,
    attention_mask: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    past_key_values: Optional[Cache] = None,
    inputs_embeds: Optional[torch.FloatTensor] = None,
    labels: Optional[torch.LongTensor] = None,
    use_cache: Optional[bool] = None,
    logits_to_keep: Union[int, torch.Tensor] = 0,
    cache_position: Optional[torch.LongTensor] = None,
    **kwargs: Unpack[TransformersKwargs],
) -> Union[tuple, AriaCausalLMOutputWithPast]:
    r"""
    labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
        Labels for computing the masked language modeling loss. Indices should either be in `[0, ...,
        config.vocab_size]` or `model.image_token_id` (where `model` is your instance of `AriaForConditionalGeneration`).
        Tokens with indices set to `model.image_token_id` are ignored (masked), the loss is only
        computed for the tokens with labels in `[0, ..., config.vocab_size]`.
    """
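To illustrate why the decorator and a hand-written `labels` section coexist, here is a toy sketch. This is not the actual transformers `auto_docstring` implementation, only an assumed simplification: a decorator that appends generated per-argument docs to whatever docstring the function already carries.

```python
# Toy illustration only; NOT the real transformers auto_docstring.
# It appends auto-generated per-argument docs to an existing docstring,
# so a manual `labels` section and generated arg docs can coexist.
import inspect


def toy_auto_docstring(func):
    generated = "\n".join(
        f"    {name}: auto-generated description."
        for name in inspect.signature(func).parameters
        if name != "self"
    )
    func.__doc__ = (func.__doc__ or "") + "\nArgs:\n" + generated
    return func


@toy_auto_docstring
def forward(self, input_ids=None, labels=None):
    """labels (`torch.LongTensor`, *optional*): hand-written description."""


print("input_ids" in forward.__doc__)  # True: generated section is present
print("hand-written" in forward.__doc__)  # True: manual section survives
```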

Could it be caused by version mismatch with one of the packages? 🤔


shuminghu commented Sep 8, 2025 via email

zucchini-nlp (Member):

Ah yeah, modular copied from Llava because Perception itself had no example. Sorry, missed that one. As long as we add the docstring it should be fine, together with the decorator.


shuminghu commented Sep 8, 2025 via email

zucchini-nlp (Member):

Both, same as in Aria, and it should work.


shuminghu commented Sep 8, 2025 via email

zucchini-nlp (Member):

Ahhh OK, misunderstood then. In any case, we will need to have both files updated with identical docs to pass CI :)


zucchini-nlp commented Sep 10, 2025

Let's merge it; it only had the decorator missing :)

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) September 10, 2025 09:28

[For maintainers] Suggested jobs to run (before merge)

run-slow: perception_lm

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp (Member) left a comment:

Ah, have to approve first

@zucchini-nlp zucchini-nlp merged commit 0997c2f into huggingface:main Sep 10, 2025
14 checks passed

shuminghu commented Sep 10, 2025 via email

zucchini-nlp (Member):

Congrats, hope you got a GPU as well 😆
