Nirbhay Modhe, Vikas Jain
Recent Advances in Computer Vision, Prof. Gaurav Sharma, Autumn 2016
This project is based on the Google DeepMind paper "DRAW: A Recurrent Neural Network For Image Generation" by K. Gregor et al. Its purpose was to implement and understand the paper, and possibly to enhance the model for better image generation.
- Implemented the model proposed in the paper. The code is adapted from the implementation by Eric Jang (link here).
- Analyzed the network parameters of the DRAW model.
- Analyzed the latent space of the encoded images. This is an important step that was previously missing, and it produced interesting observations.
- Proposed and trained three models that incorporate CNNs into the original DRAW model:
  - Model I - Each Step Convolution
  - Model II - Supervised Encoder
  - Model III - Convolutional Encoder
Each model shows better performance than the original DRAW model.
- Implemented the stochastic data generation part of the paper.
- Added an interface for the SVHN dataset to be given as input to the DRAW network.
- Added convolution and deconvolution wrappers for training the proposed models I, II and III.
- Implemented the evaluation phase to calculate the negative log-likelihood of the generated images.
- Added new sampling functionalities to visualize the latent space.
- Visualized and plotted the kernels of the learned CNNs.
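
The stochastic data generation step listed above can be sketched roughly as follows. In the DRAW paper, generation draws a latent sample from a standard Gaussian prior at each time step, feeds it to the decoder, adds the decoder's write output to a canvas, and passes the final canvas through a sigmoid. This is only a minimal sketch under stated assumptions, not the code in this repository: `make_toy_decoder`, `decoder_step`, `write`, and all sizes are hypothetical stand-ins for a trained decoder RNN and write operation.

```python
import math
import random


def sigmoid(a):
    # Numerically stable logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def make_toy_decoder(z_dim, h_dim, n_pixels, rng):
    """Hypothetical stand-in for a trained decoder and write operation."""
    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    w_z = rand_matrix(h_dim, z_dim)
    w_h = rand_matrix(h_dim, h_dim)
    w_out = rand_matrix(n_pixels, h_dim)

    def dot(row, vec):
        return sum(r * v for r, v in zip(row, vec))

    def decoder_step(h_prev, z):
        # Simple tanh recurrence (assumption; the paper uses an LSTM).
        return [math.tanh(dot(w_z[j], z) + dot(w_h[j], h_prev)) for j in range(h_dim)]

    def write(h):
        # Linear write without attention (assumption).
        return [dot(w_out[p], h) for p in range(n_pixels)]

    return decoder_step, write


def generate(decoder_step, write, steps, z_dim, h_dim, n_pixels, rng):
    h_dec = [0.0] * h_dim
    canvas = [0.0] * n_pixels
    for _ in range(steps):
        # Sample the latent from the prior N(0, I) instead of the encoder.
        z = [rng.gauss(0.0, 1.0) for _ in range(z_dim)]
        h_dec = decoder_step(h_dec, z)
        # The canvas accumulates the written output over time steps.
        canvas = [c + w for c, w in zip(canvas, write(h_dec))]
    # Per-pixel Bernoulli means of the generated image.
    return [sigmoid(c) for c in canvas]


rng = random.Random(0)
decoder_step, write = make_toy_decoder(z_dim=4, h_dim=8, n_pixels=16, rng=rng)
image = generate(decoder_step, write, steps=10, z_dim=4, h_dim=8, n_pixels=16, rng=rng)
```

With a trained model, the toy decoder would be replaced by the learned decoder and write function, and `image` would hold the pixel means of one generated sample.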
- DRAW-replication: implementation of the original paper on two datasets, MNIST and SVHN.
- models: code for the three proposed models.
- error-calculation: code for calculating the negative log-likelihood of the images generated by the different models.
- plotting: code for plotting kernels and generated images.
- report: final report and presentation of the project.
For the performance measures of the three proposed models and the analysis of the DRAW model, see the final report and presentation in the 'report' folder.
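
As an illustration of the kind of quantity the error-calculation code computes, the sketch below evaluates a per-image Bernoulli negative log-likelihood. It is a minimal sketch, not the repository's implementation, and it assumes binarized images (as with MNIST) and per-pixel predicted means. The function names `bernoulli_nll` and `mean_nll` and the `eps` clipping value are assumptions made for this example. Note that the DRAW paper reports a variational upper bound on the negative log-likelihood, which also includes a KL term that is not shown here.

```python
import math


def bernoulli_nll(x, p, eps=1e-8):
    """Negative log-likelihood of binary pixels x under Bernoulli means p."""
    total = 0.0
    for xi, pi in zip(x, p):
        # Clip the mean to avoid log(0).
        pi = min(max(pi, eps), 1.0 - eps)
        total -= xi * math.log(pi) + (1.0 - xi) * math.log(1.0 - pi)
    return total


def mean_nll(images, predictions):
    """Average the per-image NLL (in nats) over a set of images."""
    per_image = [bernoulli_nll(x, p) for x, p in zip(images, predictions)]
    return sum(per_image) / len(per_image)


# Two tiny 2-pixel "images" with uninformative predictions of 0.5.
images = [[1, 0], [0, 1]]
predictions = [[0.5, 0.5], [0.5, 0.5]]
score = mean_nll(images, predictions)
```

Here each pixel contributes ln 2 nats, so `score` equals 2 ln 2. Lower values indicate that the predicted pixel means explain the images better.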