How to use model for Inference #18

Open
ohernpaul opened this issue Jun 7, 2018 · 7 comments

Comments

@ohernpaul

Hello,

I have trained your model on omniglot (train and validation combined) and am achieving similar results to what you're presenting. I am interested in running a single image through the network for prediction, but am having trouble wrapping my head around how to do it.

The network takes in samples, and each sample contains a number of sequences. The default setting is a batch size of 16 with a seq_length of 50, so 50 * 16 = 800 images per batch. After the model is trained and saved, the memory block is still sized for sequences of length 50.

How do I perform inference on a single image with a trained model whose memory module expects sequences of length 50? I am also aware that the labels are assigned arbitrarily once a batch is collected. How should I handle this when running inference on a single image?

Thank you so much,
Paul

@snowkylin
Owner

Well, it would be odd to do inference on a single image with this model. The sequence acts as the "learning process" for meta-learning, which the task proposed in the paper requires.

@ohernpaul
Author

So, say I am trying to do binary classification on cats and dogs. The model has been trained and is ready for inference. I go out and collect 25 real photos of cats and 25 of dogs. For the model to be able to predict the classes correctly, it needs to fill the memory module first, correct? So I should run inference using multiple images of each class to first fill the memory module?

Sorry for being so confused! I am having a hard time understanding how this would be used in a real world application.

@snowkylin
Owner

Yes, you need to send at least 1 image of each class into the sequence (you can call it the "train phase") so the model has a chance to learn what a cat and a dog look like (and store them in the memory). The more images you send to the model, the more accurate it will be. For the remaining part of the sequence (the "test phase"), you can send one or more images and see whether the model gives the right answer. To keep things simple, you can just send one image.
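
A minimal sketch of how such a priming-then-test sequence could be assembled, assuming the model takes the previous step's label as an extra input at each step (the offset-label setup used in memory-augmented meta-learning). The function name `build_episode`, the flattened-image layout, and the all-zero label for steps with no known previous label are illustrative assumptions, not this repository's API.

```python
import numpy as np

def build_episode(support_images, support_labels, query_images, num_classes):
    """Build one sequence: labelled support images first, then query images.

    Returns (x_image, x_label):
      x_image has shape (T, D), one flattened image per step.
      x_label has shape (T, num_classes); row t is the one-hot label of the
      image shown at step t-1, or all zeros when that label is unknown
      (step 0, and any step that follows a query image).
    """
    support_images = np.asarray(support_images, dtype=np.float32)
    query_images = np.asarray(query_images, dtype=np.float32)
    x_image = np.concatenate([support_images, query_images], axis=0)
    num_steps = x_image.shape[0]
    x_label = np.zeros((num_steps, num_classes), dtype=np.float32)
    for t, label in enumerate(support_labels):
        if t + 1 < num_steps:
            # The label of step t is only revealed at step t + 1.
            x_label[t + 1, label] = 1.0
    return x_image, x_label

# Usage: one cat (class 0), one dog (class 1), then one unseen test image.
# The 400-dimensional vectors stand in for flattened images (hypothetical size).
support = np.random.rand(2, 400)
query = np.random.rand(1, 400)
x_image, x_label = build_episode(support, [0, 1], query, num_classes=2)
```

The class indices are arbitrary per sequence, which matches how labels are assigned at training time: what matters is that the same index is used consistently for the same class within one sequence.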

@ohernpaul
Author

Ok, so there is a preliminary phase (after truly training the network) before inference that I should do to prime the model.

Thank you for your time and quick responses, you rock!

@ohernpaul
Author

My final question is:

If I train the entire network from scratch on a sequence length of 50 and save the model, does that mean I will need to fill memory up (prime) with 50 sequences before inference mode?

Or do I fill it up with one instance per class, then the rest of the images in the sequence are unseen test images for inference?

@snowkylin
Owner

No need for 50 sequences. After training, the model should be able to classify images in the latter part of a sequence based on the preceding part of that sequence. One or more instances per class (with their labels) should be placed in the sequence before the (real) test image. Details can be found in my slides.
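
As a rough illustration of reading results from such a sequence, the sketch below keeps only the outputs for the steps that come after the labelled instances. It assumes the model returns one score vector per step with shape (T, num_classes); the function name `predictions_after_priming` and the example scores are hypothetical.

```python
import numpy as np

def predictions_after_priming(outputs, num_priming_steps):
    """Return the predicted class index for every step after the priming part.

    outputs: array-like of shape (T, num_classes), one score vector per step.
    num_priming_steps: number of leading steps used to show labelled instances.
    """
    outputs = np.asarray(outputs)
    return outputs[num_priming_steps:].argmax(axis=-1)

# Usage: two priming steps (one instance per class), then one test image.
# These scores are made up for illustration, not produced by a real model.
scores = [[0.5, 0.5], [0.6, 0.4], [0.1, 0.9]]
predicted = predictions_after_priming(scores, num_priming_steps=2)
```

The outputs at the priming steps are ignored here because the model has not yet seen every class at that point in the sequence.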

@fschiro

fschiro commented Apr 17, 2023

Is this really true? When you train the model you are passing images through and updating the weights of the memory-matrix, read-heads, and write-heads. So every time you pass an image through the model in the training phase the memory-matrix will be updated.

So when you save the model weights after the training phase, the memory-matrix weights will also be saved. Why, then, would you need to prime the model?

