Call configure_module before freeze_before_training #20428

chualanagit · 2024-11-18T06:53:17Z

What does this PR do?

Fixes #19658

Before submitting

Was this discussed/agreed via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

codecov · 2024-11-18T07:15:34Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79%. Comparing base (333d1cf) to head (11c8be4).

❗ There is a different number of reports uploaded between BASE (333d1cf) and HEAD (11c8be4). Click for more details.

HEAD has 223 uploads less than BASE

Flag               BASE (333d1cf)   HEAD (11c8be4)
cpu                77               24
lightning_fabric   13               0
pytest             50               0
python3.9          21               6
lightning          59               18
python3.11         19               6
gpu                2                0
python3.10         10               3
python3.12         27               9
pytorch2.1         13               9
pytest-full        29               24
pytorch2.2.2       4                3
pytorch_lightning  7                6

Additional details and impacted files

@@            Coverage Diff            @@
##           master   #20428     +/-   ##
=========================================
- Coverage      88%      79%     -9%     
=========================================
  Files         267      264      -3     
  Lines       23274    23219     -55     
=========================================
- Hits        20381    18270   -2111     
- Misses       2893     4949   +2056

lantiga · 2024-11-18T23:16:10Z

Thank you @chualanagit

We need to make sure configure_model is not called twice:
https://github.com/Lightning-AI/pytorch-lightning/blob/master/src/lightning/pytorch/trainer/trainer.py#L945

is that the case?
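One way to make a second call harmless, regardless of where the hook ends up being invoked, is to write configure_model so that it is idempotent. A minimal sketch in plain Python follows; ToyModule, backbone and build_count are hypothetical stand-ins, and none of this is Lightning's actual trainer code:

```python
class ToyModule:
    """Hypothetical stand-in for a LightningModule (not the real class)."""

    def __init__(self):
        self.backbone = None   # layers are created lazily
        self.build_count = 0   # counts how often layers were actually built

    def configure_model(self):
        # Guard: if the layers already exist, a repeated call does nothing.
        if self.backbone is not None:
            return
        self.backbone = ["layer0", "layer1"]  # placeholder for real layers
        self.build_count += 1


module = ToyModule()
module.configure_model()  # e.g. an early call made on behalf of a callback
module.configure_model()  # e.g. the trainer's own call later on
print(module.build_count)
```

With the guard in place, the layers are built only on the first call, so an extra invocation does not replace parameters that a callback may already have frozen.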

lantiga · 2024-11-25T10:51:26Z

The current change interferes with assumptions about how hooks are called internally and how progress is tracked, so I don't think this is going to work.

I think we should introduce a different hook instead (configure_model) that gets called after configure_model; we can then change fine-tuning to use that if the module overrides it.
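A rough sketch of how such a hook could be dispatched, under stated assumptions: the hook name on_after_configure_model is purely illustrative (it is not taken from this thread or from Lightning), "if the module overrides it" is read here as "if the module overrides configure_model", and ToyModule, LazyModule, ToyFinetuning and toy_fit are hypothetical stand-ins rather than the library's classes:

```python
class ToyModule:
    """Hypothetical stand-in for a LightningModule (not the real class)."""

    def __init__(self):
        self.layers = ["eager_layer"]  # built eagerly in __init__
        self.frozen = False

    def configure_model(self):
        """Default implementation: nothing is built lazily."""


class LazyModule(ToyModule):
    """Module that defers layer creation to configure_model."""

    def __init__(self):
        super().__init__()
        self.layers = None

    def configure_model(self):
        if self.layers is None:
            self.layers = ["lazy_layer"]


def overrides(module, name):
    """True if the module's class replaces ToyModule's default method."""
    return getattr(type(module), name) is not getattr(ToyModule, name)


class ToyFinetuning:
    """Hypothetical fine-tuning callback that freezes layers exactly once."""

    def freeze(self, module):
        if module.layers is None:
            raise RuntimeError("layers do not exist yet")
        module.frozen = True

    def setup(self, module):
        # Existing behaviour: freeze in setup when layers are built eagerly.
        if not overrides(module, "configure_model"):
            self.freeze(module)

    def on_after_configure_model(self, module):  # hypothetical hook name
        # Assumed new behaviour: freeze only after the lazy layers exist.
        if overrides(module, "configure_model"):
            self.freeze(module)


def toy_fit(module, callback):
    """Simplified call order; the real trainer does far more than this."""
    callback.setup(module)
    module.configure_model()
    callback.on_after_configure_model(module)
    return module


for m in (ToyModule(), LazyModule()):
    toy_fit(m, ToyFinetuning())
    print(type(m).__name__, m.layers, m.frozen)
```

The point of the sketch is that the existing order of setup and configure_model stays untouched: only modules that build layers lazily take the new path.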

chualanagit · 2024-11-25T18:31:44Z

configure_model

I double-checked all the failing tests. The failures appear to be identical: each one fails on the assertion that setup() must be called before configure_model(). I wonder whether the assertions themselves need to change? No other test case seems to catch an error indicating that this change causes a logical failure in the library. I'm pushing a change that swaps the assertion order in the test_hooks.py files to see whether any other tests fail.
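For context, an ordering assertion of this kind can be sketched as follows. This is a simplified illustration with hypothetical names (RecordingModule, run_hooks, assert_before), not the actual code in test_hooks.py:

```python
class RecordingModule:
    """Hypothetical module that records the order in which hooks run."""

    def __init__(self):
        self.called = []

    def setup(self):
        self.called.append("setup")

    def configure_model(self):
        self.called.append("configure_model")


def run_hooks(module, order):
    """Call the named hooks in the given order and return the recording."""
    for name in order:
        getattr(module, name)()
    return module.called


def assert_before(called, first, second):
    """Raise AssertionError unless `first` was recorded before `second`."""
    if not called.index(first) < called.index(second):
        raise AssertionError(f"{first} must run before {second}: {called}")


old = run_hooks(RecordingModule(), ["setup", "configure_model"])
new = run_hooks(RecordingModule(), ["configure_model", "setup"])

assert_before(old, "setup", "configure_model")  # order the tests expected
assert_before(new, "configure_model", "setup")  # order after this change
```

Swapping the expected order makes such a test pass again, but it only records the new behaviour; it does not show that code relying on setup() running first still works.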

chualanagit · 2024-11-26T01:51:16Z

if the assertions themselves need to be changed? There seem to be no other errors caught by any other test cases that indicate this change leads to any logical failures in the library. Pushing a change to modify the assertion order in the test_hooks.py files and see if any other tests are failing

It seems no other errors are raised. Does anyone know of a reason why the configure_model hook should not be called before the setup hook? cc @lantiga

chualanagit · 2024-11-26T20:30:26Z

@lantiga quick question: how reliable are the CI tests in general? I have definitely seen transient errors that show up on one commit and disappear on the next. I have also seen PRs merged while some CI tests were still failing. Would love to learn more for future contributions, thanks!

call configure_module before freeze_before_training

5fd675c

chualanagit requested review from lantiga, Borda, tchaton and justusschock as code owners November 18, 2024 06:53

github-actions bot added the pl Generic label for PyTorch Lightning package label Nov 18, 2024

chualanagit added 2 commits November 18, 2024 02:19

Merge branch 'master' into chualan/fix-19658

91775f7

Merge branch 'master' into chualan/fix-19658

9da9e7d

lantiga added the waiting on author Waiting on user action, correction, or update label Nov 18, 2024

lantiga and others added 3 commits November 19, 2024 21:34

Merge branch 'master' into chualan/fix-19658

faef707

remove bad fix

90ff8f0

second fix and test case

a205c4a

chualanagit requested a review from ethanwharris as a code owner November 22, 2024 10:38

pre-commit-ci bot and others added 4 commits November 22, 2024 10:40

[pre-commit.ci] auto fixes from pre-commit.com hooks

ef35dca

Merge branch 'master' into chualan/fix-19658

0e570a8

remove print statement

56d05a3

[pre-commit.ci] auto fixes from pre-commit.com hooks

1c040d7

chualanagit changed the title ~~[wip] Call configure_module before freeze_before_training~~ Call configure_module before freeze_before_training Nov 22, 2024

chualanagit changed the title ~~Call configure_module before freeze_before_training~~ [wip] Call configure_module before freeze_before_training Nov 22, 2024

Merge branch 'master' into chualan/fix-19658

bfa0fd4

Alan Chu and others added 2 commits November 25, 2024 18:32

change assertion order for setup() and configure_model() in test_hooks.py

8ba644a

[pre-commit.ci] auto fixes from pre-commit.com hooks

1d8ef66

lantiga changed the title ~~[wip] Call configure_module before freeze_before_training~~ Call configure_module before freeze_before_training Nov 25, 2024

Merge branch 'master' into chualan/fix-19658

11c8be4

Merge branch 'master' into chualan/fix-19658

9e53990