
Conversation

@Aravind-11 (Contributor)

What does this PR do?

This PR adds a skip_post_init parameter to PreTrainedModel.from_pretrained() to allow users to skip the post-initialization step that reinitializes model weights. This is essential for users who subclass models with custom parameters and want to preserve their initialization.

When users subclass Hugging Face models (e.g., Qwen2VLForConditionalGeneration) and add custom parameters initialized in their __init__(), that initialization gets silently overwritten by post_init(), which calls _init_weights().
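For illustration, here is a minimal sketch of the failure mode (the GatedQwen2VL subclass and its gate module are hypothetical; only Qwen2VLForConditionalGeneration and the checkpoint name are real):

```python
import torch.nn as nn
from transformers import Qwen2VLForConditionalGeneration

class GatedQwen2VL(Qwen2VLForConditionalGeneration):
    def __init__(self, config):
        super().__init__(config)
        # Hypothetical custom parameter with deliberate initialization
        self.gate = nn.Linear(config.hidden_size, 1)
        nn.init.zeros_(self.gate.weight)
        nn.init.ones_(self.gate.bias)

model = GatedQwen2VL.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
# `gate` is missing from the checkpoint, so from_pretrained() initializes it
# via _init_weights(), silently discarding the zeros/ones set in __init__().
```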

Fixes #42418

Before submitting

Who can review?

@ArthurZucker @Cyrilvallez

@ArthurZucker (Collaborator) left a comment


Ty tho I think we want to be a bit more careful in designing this!
We need tests, that's for sure, but we also need a good API to make sure users are not surprised.

For example -> base it on the classes that are in the public init only.
cc @Cyrilvallez wdyt?

@Cyrilvallez (Member)

Hey @Aravind-11! The thing is that post_init is responsible for more than weight initialization, so we cannot simply skip it. Also, adding yet another flag is not a good idea IMO. Custom models should instead do

def _init_weights(self, module): ...

to skip weight initialization, instead of

self.skip_post_init = True

It's not more verbose, and it avoids both adding a flag and missing the rest of the post_init operations.
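Concretely, something like this (a minimal sketch; the gate module is just for illustration):

```python
import torch.nn as nn
from transformers import Qwen2VLForConditionalGeneration

class GatedQwen2VL(Qwen2VLForConditionalGeneration):
    def __init__(self, config):
        super().__init__(config)
        self.gate = nn.Linear(config.hidden_size, 1)
        nn.init.zeros_(self.gate.weight)

    def _init_weights(self, module):
        # No-op: parameters missing from the checkpoint keep whatever
        # __init__ set, and the rest of post_init() still runs as usual.
        pass
```

A more selective variant could check the module and fall back to super()._init_weights(module) for everything except the custom parameters.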

However, we could indeed think of something so that it's the default for remote code models. But in general we cannot be sure whether people rely on it or not, so it's a bit hard to decide. What's sure is that this used to be the default, so it could make sense to go back to it for remote code.

@Aravind-11 (Contributor, Author)

That makes a lot of sense @Cyrilvallez! Thank you for clarifying!

