This repository is the Python implementation of the RLOSF-20221 Coding Exercise.
The exercise is to modify the second scenario in "Simulating Content Personalization with Contextual Bandits" in the following ways:
- Add multiple changes to the reward distribution over time
- Introduce varying noise in the reward distribution
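
The two modifications above can be sketched as a reward function that depends on the simulation step. This is a minimal illustration, not the repository's actual code: the preference tables, `CHANGE_POINTS`, `NOISE_LEVELS`, and the function names are all hypothetical, and the reward is assumed to be 1.0 for a liked article and 0.0 otherwise.

```python
import random

# Hypothetical preference tables: (user, time_of_day) -> liked article.
# Each entry in PHASES is one reward distribution; the simulator moves
# from one phase to the next at the steps listed in CHANGE_POINTS.
PHASES = [
    {("Tom", "morning"): "politics", ("Tom", "afternoon"): "music",
     ("Anna", "morning"): "sports", ("Anna", "afternoon"): "politics"},
    {("Tom", "morning"): "politics", ("Tom", "afternoon"): "sports",
     ("Anna", "morning"): "sports", ("Anna", "afternoon"): "sports"},
    {("Tom", "morning"): "finance", ("Tom", "afternoon"): "music",
     ("Anna", "morning"): "health", ("Anna", "afternoon"): "food"},
]
CHANGE_POINTS = [2000, 4000]    # assumed steps at which the phase switches
NOISE_LEVELS = [0.0, 0.1, 0.3]  # assumed per-phase probability of flipping the reward


def phase_index(step):
    """Return the index of the phase that is active at the given step."""
    return sum(step >= change_point for change_point in CHANGE_POINTS)


def get_reward(step, user, time_of_day, action, rng=random):
    """Return 1.0 for a liked article and 0.0 otherwise.

    The result is flipped with the noise probability of the active phase,
    so the amount of noise varies over time along with the preferences.
    """
    idx = phase_index(step)
    reward = 1.0 if PHASES[idx][(user, time_of_day)] == action else 0.0
    if rng.random() < NOISE_LEVELS[idx]:
        reward = 1.0 - reward
    return reward
```

Keeping the change points and noise levels in plain lists makes it easy to add further distribution changes or a different noise schedule without touching the reward logic.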
Run this new simulator with different exploration algorithms, then visualize and compare their performance.
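
One way such a comparison could look is sketched below. It uses a small tabular epsilon-greedy learner in plain Python as a stand-in for whichever exploration algorithms the exercise actually runs, and a deliberately simplified drifting environment. Every name here (`liked_action`, `run_epsilon_greedy`, `compare`, `plot_results`) and every parameter value is an assumption made for illustration. Plotting assumes `matplotlib` is installed.

```python
import random

ACTIONS = ["politics", "sports", "music"]
CONTEXTS = ["morning", "afternoon"]


def liked_action(step, context):
    """Hypothetical drifting preference: the liked article rotates every 1000 steps."""
    shift = step // 1000
    return ACTIONS[(CONTEXTS.index(context) + shift) % len(ACTIONS)]


def run_epsilon_greedy(epsilon, num_steps=3000, noise=0.1, step_size=0.1, seed=0):
    """Run tabular epsilon-greedy and return the running average reward per step."""
    rng = random.Random(seed)
    values = {(c, a): 0.0 for c in CONTEXTS for a in ACTIONS}
    total = 0.0
    running_avg = []
    for step in range(num_steps):
        context = rng.choice(CONTEXTS)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore uniformly at random
        else:
            # exploit the current estimates (ties go to the first action)
            action = max(ACTIONS, key=lambda a: values[(context, a)])
        reward = 1.0 if action == liked_action(step, context) else 0.0
        if rng.random() < noise:
            reward = 1.0 - reward  # noisy feedback: flip the reward
        # A constant step size weights recent rewards more heavily,
        # which lets the estimates follow a changing distribution.
        values[(context, action)] += step_size * (reward - values[(context, action)])
        total += reward
        running_avg.append(total / (step + 1))
    return running_avg


def compare(epsilons=(0.0, 0.1, 0.3)):
    """Return a dict mapping each epsilon to its running-average reward curve."""
    return {eps: run_epsilon_greedy(eps) for eps in epsilons}


def plot_results(results):
    """Plot one curve per exploration setting (requires matplotlib)."""
    import matplotlib.pyplot as plt
    for eps, curve in results.items():
        plt.plot(curve, label=f"epsilon={eps}")
    plt.xlabel("step")
    plt.ylabel("running average reward")
    plt.legend()
    plt.show()
```

Calling `plot_results(compare())` draws the curves on a single figure, so the effect of each change in the reward distribution on each exploration setting can be read off side by side.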