strong_basline: late in training, acc suddenly drops to 0 and loss becomes NaN #43

Open
Tidewww opened this issue Aug 13, 2021 · 3 comments

Comments

@Tidewww

Tidewww commented Aug 13, 2021

Epoch: [12][ 380/1161] Time 0.496 (0.498) Acc@1 54.69% (60.06%) cross_entropy 3.617 (4.314) softmax_triplet 2.303 (3.466)
Epoch: [12][ 390/1161] Time 0.493 (0.498) Acc@1 57.03% (60.02%) cross_entropy 3.469 (4.292) softmax_triplet 1.859 (3.471)
Epoch: [12][ 400/1161] Time 0.501 (0.498) Acc@1 56.25% (59.95%) cross_entropy 3.451 (4.274) softmax_triplet 4.916 (3.471)
Epoch: [12][ 410/1161] Time 0.487 (0.498) Acc@1 53.12% (59.85%) cross_entropy 3.627 (4.257) softmax_triplet 3.050 (3.438)
Epoch: [12][ 420/1161] Time 0.489 (0.498) Acc@1 54.69% (59.79%) cross_entropy 3.693 (4.240) softmax_triplet 5.153 (3.429)
Epoch: [12][ 430/1161] Time 0.505 (0.498) Acc@1 58.59% (59.72%) cross_entropy 3.444 (4.225) softmax_triplet 1.498 (3.448)
Epoch: [12][ 440/1161] Time 0.488 (0.498) Acc@1 57.03% (59.68%) cross_entropy 3.482 (4.208) softmax_triplet 5.507 (3.431)
Epoch: [12][ 450/1161] Time 0.478 (0.498) Acc@1 60.94% (59.59%) cross_entropy 3.388 (4.195) softmax_triplet 0.360 (3.432)
Epoch: [12][ 460/1161] Time 0.487 (0.498) Acc@1 51.56% (59.52%) cross_entropy 3.739 (4.181) softmax_triplet 2.203 (3.410)
Epoch: [12][ 470/1161] Time 0.185 (0.493) Acc@1 0.00% (58.61%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 480/1161] Time 0.182 (0.487) Acc@1 0.00% (57.40%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 490/1161] Time 0.191 (0.481) Acc@1 0.00% (56.23%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 500/1161] Time 1.477 (0.477) Acc@1 0.00% (55.11%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 510/1161] Time 0.186 (0.472) Acc@1 0.00% (54.03%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 520/1161] Time 0.192 (0.466) Acc@1 0.00% (52.99%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 530/1161] Time 0.181 (0.461) Acc@1 0.00% (52.00%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 540/1161] Time 0.191 (0.456) Acc@1 0.00% (51.03%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 550/1161] Time 0.196 (0.451) Acc@1 0.00% (50.11%) cross_entropy nan (nan) softmax_triplet nan (nan)

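  • As shown above, when training strong_basline on a single GPU, acc suddenly drops to 0 and every loss becomes NaN.

  • The same thing happened earlier when training market 2 duke: on a single GPU, around epoch 49, acc would suddenly drop to 0 and the loss would become NaN for a few iterations.

  • With four-GPU training this did not happen. Why is that?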
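
To pin down exactly where a run like this diverges, a small scan over the training log may help. The sketch below assumes the log format shown above; `LINE_RE` and `first_non_finite` are hypothetical names used for illustration and are not part of the repository.

```python
import math
import re

# Assumed line format (taken from the log in this issue):
# Epoch: [12][ 470/1161] Time ... cross_entropy nan (nan) softmax_triplet nan (nan)
LINE_RE = re.compile(
    r"Epoch: \[(\d+)\]\[\s*(\d+)/(\d+)\].*?"
    r"cross_entropy (\S+) \(\S+\)\s+softmax_triplet (\S+) \(\S+\)"
)


def first_non_finite(log_text):
    """Return (epoch, iteration) of the first line whose current loss is NaN/inf, else None."""
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a training progress line
        epoch, iteration = int(m.group(1)), int(m.group(2))
        # float() parses "nan" and "inf" as well as ordinary numbers
        losses = (float(m.group(4)), float(m.group(5)))
        if not all(math.isfinite(x) for x in losses):
            return epoch, iteration
    return None


sample = (
    "Epoch: [12][ 460/1161] Time 0.487 (0.498) Acc@1 51.56% (59.52%) "
    "cross_entropy 3.739 (4.181) softmax_triplet 2.203 (3.410)\n"
    "Epoch: [12][ 470/1161] Time 0.185 (0.493) Acc@1 0.00% (58.61%) "
    "cross_entropy nan (nan) softmax_triplet nan (nan)\n"
)
print(first_non_finite(sample))  # (12, 470)
```

Knowing the first bad iteration makes it easier to resume from the last checkpoint and inspect the batches just before it.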

@chongyangwang-song

@Tidewww Hi, I have been studying this code recently. Could we get in touch to discuss it?

@Jacobxz

Jacobxz commented Mar 3, 2022

Hello, has the single-GPU NaN problem been solved?

@Jacobxz

Jacobxz commented Mar 3, 2022

Solved. Lowering the learning rate fixes it.
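
As a rough illustration of lowering the learning rate, the sketch below scales the `lr` entry of every parameter group. It assumes an optimizer that exposes `param_groups` as a list of dicts with an `"lr"` key, which is the layout `torch.optim` optimizers use. `scale_learning_rate`, `FakeOptimizer`, the starting value, and the factor are all hypothetical and not taken from the repository.

```python
def scale_learning_rate(optimizer, factor):
    """Multiply the learning rate of every parameter group by `factor`."""
    for group in optimizer.param_groups:
        group["lr"] *= factor


class FakeOptimizer:
    """Stand-in with the same `param_groups` layout, so the sketch runs without PyTorch."""

    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]


# Hypothetical values for illustration only; the thread does not say
# which learning rate was used or how much it was reduced.
opt = FakeOptimizer(lr=1e-3)
scale_learning_rate(opt, 0.5)
print(opt.param_groups[0]["lr"])
```

If a learning-rate scheduler is in use, it may overwrite values set this way, so lowering the base learning rate in the training configuration is probably the simpler route.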
