[WIP] Fix weights initialization of several vision models #19449
amyeroberts left a comment
Looking good! Just a few comments on updating the TF models and initialization calls in inits.
How come it's only the ViT model that has these tests skipped?
Can you make the equivalent initialization updates in the TF model?
Can you make the equivalent initialization updates in the TF model?
Can you make the equivalent updates for the TF model?
I think you might need the post_init() call here
Suggested change:
    self.config = config
    # Initialize weights and apply final processing
    self.post_init()
And possibly here
Suggested change:
    self.num_patches = num_patches
    # Initialize weights and apply final processing
    self.post_init()
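For context on why a missing `post_init()` call matters, here is a minimal sketch in plain Python. It assumes, as in the library's PyTorch models, that `post_init()` applies the model's `_init_weights` hook to every submodule. The `Module`, `Linear`, `PreTrainedSketch`, `WithPostInit`, and `WithoutPostInit` classes are hypothetical stand-ins written for this illustration, not the real `torch.nn` or `transformers` classes.

```python
import random


class Module:
    """Tiny stand-in for a framework module (hypothetical, not torch.nn.Module)."""

    def children(self):
        # Submodules are any attributes that are themselves Module instances.
        return [v for v in vars(self).values() if isinstance(v, Module)]

    def apply(self, fn):
        # Visit children first, then the module itself.
        for child in self.children():
            child.apply(fn)
        fn(self)
        return self


class Linear(Module):
    def __init__(self, in_features, out_features):
        # Deliberately odd defaults so it is visible whether init ran.
        self.weight = [[0.0] * in_features for _ in range(out_features)]
        self.bias = [1.0] * out_features


class PreTrainedSketch(Module):
    initializer_range = 0.02  # assumed value, for illustration only

    def _init_weights(self, module):
        # Model-specific initialization hook.
        if isinstance(module, Linear):
            std = self.initializer_range
            module.weight = [
                [random.gauss(0.0, std) for _ in row] for row in module.weight
            ]
            module.bias = [0.0] * len(module.bias)

    def post_init(self):
        # Apply the hook to every submodule once construction is finished.
        self.apply(self._init_weights)


class WithPostInit(PreTrainedSketch):
    def __init__(self):
        self.proj = Linear(4, 3)
        # Initialize weights and apply final processing
        self.post_init()


class WithoutPostInit(PreTrainedSketch):
    def __init__(self):
        self.proj = Linear(4, 3)
        # post_init() is missing, so _init_weights never runs


if __name__ == "__main__":
    print(WithPostInit().proj.bias)     # biases reset by _init_weights
    print(WithoutPostInit().proj.bias)  # biases keep the layer's defaults
```

In this sketch the model without `post_init()` silently keeps each layer's default values, which is the kind of problem that only shows up when training from scratch rather than loading pretrained weights.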
I'd move this up out of the for-loop as it doesn't need to be redefined for each parameter
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Any further updates? @NielsRogge @amyeroberts
What does this PR do?
This PR is a follow-up of #19341, to make sure weights are properly initialized when training vision models from scratch.