Training plot for boxing giving values much greater than Evaluation #3

Ashutosh-Adhikari · 2018-05-14T09:52:14Z

While training, the summed clipped reward that the agent gets is much higher than the evaluation score for boxing, and much lower than the evaluation score for games like qbert and spaceinvaders. Any idea why?

Ashutosh-Adhikari · 2018-05-14T10:01:39Z

Hi, is it because of clipping the rewards?

hengyuan-hu · 2018-05-15T18:45:20Z

I am not sure I understand your question correctly. Different games have different reward mechanisms. Some games have a dense reward signal while others have a sparse one (for example, in Space Invaders the agent only gets a reward when it hits an enemy).
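
To illustrate how reward clipping can change the scale of a summed-reward plot, here is a minimal sketch. It assumes per-step rewards are clipped to [-1, 1]; whether this repository clips in exactly that way is not confirmed in this thread. The function name `summed_rewards` and all reward sequences are hypothetical and made up for illustration; they are not taken from any actual game.

```python
import numpy as np


def summed_rewards(raw_rewards, low=-1.0, high=1.0):
    """Return (raw_sum, clipped_sum) for one episode's per-step rewards.

    Assumption: clipping is applied per step to the range [low, high].
    """
    raw = np.asarray(raw_rewards, dtype=np.float64)
    clipped = np.clip(raw, low, high)
    return float(raw.sum()), float(clipped.sum())


# Hypothetical episode with large per-event rewards:
# clipping shrinks the summed value a lot.
large = [0, 25, 0, 0, 25, 100, 0]

# Hypothetical episode whose rewards already lie in [-1, 1]:
# clipping changes nothing.
small = [0, 1, -1, 1, 0, 1, 1]

# Hypothetical episode where penalties are larger than gains:
# clipping shrinks the penalties more, so the clipped sum is higher.
mixed = [1, -3, 1, -2, 1]

for name, rewards in [("large", large), ("small", small), ("mixed", mixed)]:
    raw_sum, clipped_sum = summed_rewards(rewards)
    print(f"{name}: raw sum = {raw_sum}, clipped sum = {clipped_sum}")
```

Under these assumptions, a plot of summed clipped rewards and a plot of unclipped evaluation scores are not on the same scale, and the clipped sum can end up either lower or higher than the raw sum depending on how the game distributes positive and negative rewards.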

Ashutosh-Adhikari changed the title ~~Training plot for boxing giving values much greater than~~ Training plot for boxing giving values much greater than Evaluation May 14, 2018
