mAP nearly zero? #841

Open
Chenyaoyi1998 opened this issue Aug 17, 2023 · 3 comments

@Chenyaoyi1998

What I'm trying to do

I tried to train on the VOC2007 dataset, but it failed: even after many epochs the mAP is still zero.

What I've tried

  1. Loaded the pretrained Darknet backbone weights (weights/darknet53.conv.74).
  2. Converted the VOC annotations to YOLO-format txt labels (filename.xml -> filename.txt), transforming each box as follows (see the conversion sketch after this list):
    x = (x1 + x2) / 2
    y = (y1 + y2) / 2
    w = x2 - x1
    h = y2 - y1
    x, y, w, h = x/W, y/H, w/W, h/H
  3. Modified yolov3.cfg (see the excerpt after this list):
    Kept the default hyperparameters
    Used anchors computed by k-means
    Changed the number of classes from 80 to 20
    Changed the detection-head filters from 3*(5+80) = 255 to 3*(5+20) = 75
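
For concreteness, here is a minimal sketch of the conversion in step 2. The function name is my own, the surrounding file loop and error handling are omitted, and `class_names` is assumed to be the 20-entry VOC class list:

```python
import xml.etree.ElementTree as ET

def voc_xml_to_yolo_lines(xml_path, class_names):
    """Convert one VOC annotation file to YOLO label lines (sketch)."""
    root = ET.parse(xml_path).getroot()
    W = float(root.find("size/width").text)
    H = float(root.find("size/height").text)

    lines = []
    for obj in root.iter("object"):
        cls_id = class_names.index(obj.find("name").text)
        bb = obj.find("bndbox")
        x1, y1 = float(bb.find("xmin").text), float(bb.find("ymin").text)
        x2, y2 = float(bb.find("xmax").text), float(bb.find("ymax").text)
        # Corner coordinates -> normalised center/size, as in the formulas above.
        x = (x1 + x2) / 2 / W
        y = (y1 + y2) / 2 / H
        w = (x2 - x1) / W
        h = (y2 - y1) / H
        lines.append(f"{cls_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
    return lines
```

And the change in step 3 looks roughly like this on each of the three detection heads (excerpt only; my actual k-means anchor values are elided):

```
[convolutional]
size=1
stride=1
pad=1
# 3*(5+20) = 75 for 20 VOC classes (was 255 = 3*(5+80))
filters=75
activation=linear

[yolo]
mask = 0,1,2
# k-means anchors go here
anchors = ...
classes=20
num=9
```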

Additional context

I tried raising the learning rate from 0.0001 to 0.001; it did not help.
I also tried SGD instead of Adam; that did not help either.

@Chenyaoyi1998 (Author)

Log after 10 epochs:
+-------+-------------+---------+
| Index | Class       | AP      |
+-------+-------------+---------+
| 0     | aeroplane   | 0.00000 |
| 1     | bicycle     | 0.00000 |
| 2     | bird        | 0.00000 |
| 3     | boat        | 0.00000 |
| 4     | bottle      | 0.00000 |
| 5     | bus         | 0.00000 |
| 6     | car         | 0.00000 |
| 7     | cat         | 0.00000 |
| 8     | chair       | 0.00000 |
| 9     | cow         | 0.00000 |
| 10    | diningtable | 0.00000 |
| 11    | dog         | 0.00000 |
| 12    | horse       | 0.00000 |
| 13    | motorbike   | 0.00000 |
| 14    | person      | 0.00000 |
| 15    | pottedplant | 0.00000 |
| 16    | sheep       | 0.00000 |
| 17    | sofa        | 0.00000 |
| 18    | train       | 0.00000 |
| 19    | tvmonitor   | 0.00000 |
+-------+-------------+---------+
---- mAP 0.00000 ----

@Chenyaoyi1998 (Author)

[image attachment]

@Chenyaoyi1998 (Author)

I think I found a possible reason.

I read the `compute_loss` function carefully; it implements the loss as follows:
For each ground-truth box, the code finds the anchor responsible for it on each of the three feature maps; the boxes predicted by these responsible anchors are the positive examples, and all remaining predictions are treated as negative examples. Positive examples produce three loss terms: classification, confidence (objectness), and bbox regression. Negative examples produce only a confidence loss. When computing the confidence loss, positive examples are labelled with their IoU against the ground truth, while negative examples are labelled with 0.
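
To make that concrete, here is a rough sketch of the structure I am describing (not the repo's exact code; the target-building step and tensor shapes are simplified, and all the names are mine):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()

def yolo_loss(pred_box, pred_obj, pred_cls, tgt_box, tgt_cls, pos_mask, iou):
    """pos_mask marks the responsible anchors; iou is prediction-vs-GT IoU."""
    # Positive examples: box regression + classification + confidence.
    loss_box = mse(pred_box[pos_mask], tgt_box)
    loss_cls = bce(pred_cls[pos_mask], tgt_cls)

    # Confidence targets: IoU for positives, 0 for negatives --
    # every anchor on every feature map contributes to this term.
    tgt_obj = torch.zeros_like(pred_obj)
    tgt_obj[pos_mask] = iou.detach()
    loss_obj = bce(pred_obj, tgt_obj)

    return loss_box + loss_obj + loss_cls
```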

The problem is that there is a severe imbalance between positive and negative examples, and the loss calculation does not seem to match what the paper describes. During my training the network quickly learns to predict confidence 0 for every box: the confidence logits turn negative after only a few batches, so after the sigmoid the confidence goes to 0.
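
If that is the cause, one common mitigation (not something I have verified in this repo) is to split the confidence loss and down-weight the much larger negative set, e.g.:

```python
import torch

# Illustrative only: the function name and scale values are made up.
def balanced_obj_loss(pred_obj, tgt_obj, pos_mask,
                      obj_scale=1.0, noobj_scale=0.5):
    bce = torch.nn.BCEWithLogitsLoss()
    loss_pos = bce(pred_obj[pos_mask], tgt_obj[pos_mask])
    loss_neg = bce(pred_obj[~pos_mask], tgt_obj[~pos_mask])
    return obj_scale * loss_pos + noobj_scale * loss_neg
```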

I'm a newcomer to object detection, so the above is just my personal understanding; I may be misreading details of the loss calculation in the code or in the paper. Discussion is welcome.
