Many approximation algorithms converge better when the data is normalized (scaled to the range zero to one) and centered (mean == 0). Could you please investigate whether this is a possible issue that would be interesting to show?
If positive, we can easily run the algorithms with four different subsets of the utility-increase data, combining small and large kurtosis and skewness. We would be looking at how quickly (number of episodes) each run achieves a certain level of exploitation (reduces exploration) and how quickly it reaches the maximum reward (within a determined margin of error). These are charts that Nico has already developed.
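As a rough starting point, splitting the utility-increase data by skewness and kurtosis could look like the sketch below (Python with pandas/scipy; the file name, the column names and the median-split thresholds are assumptions, not taken from the actual data):

```python
# Sketch only: "utility_increases.csv", the column names and the median split
# are placeholders/assumptions, not taken from the real data set.
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis

df = pd.read_csv("utility_increases.csv")

# Distribution shape per <component, failure> combination.
stats = (
    df.groupby(["component", "failure"])["utility_increase"]
      .agg(skewness=skew, kurt=kurtosis)
      .reset_index()
)

# Four subsets: {small, large} skewness x {small, large} kurtosis via a median split.
stats["skew_group"] = np.where(stats["skewness"] > stats["skewness"].median(), "large", "small")
stats["kurt_group"] = np.where(stats["kurt"] > stats["kurt"].median(), "large", "small")

print(stats.groupby(["skew_group", "kurt_group"]).size())
```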
I thought about this one again. Normalization is used in ML algorithms to equalize the impact of different predictor variables on the target variable. However, we only have a single input variable, the reward. Therefore, normalization will probably have no effect, because no part of the current RL algorithms is sensitive to the absolute size of the rewards.
Regarding the transformation of the raw utilities: I don't think this is useful either, because we want the agent to maximize the total utility/reward

r_1 + r_2 + ... + r_n

However, if we use some function f to alter all the rewards r_1, ..., r_n, the agent instead maximizes

f(r_1) + f(r_2) + ... + f(r_n)

which in general is not maximized by the same behavior unless f is a positive linear transformation.
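A tiny illustration of that point (the reward values are made up, just to show that a nonlinear transform can flip which total is larger):

```python
# Two hypothetical reward sequences produced by two different behaviours.
rewards_a = [3, 3]   # steady, moderate utility increases
rewards_b = [5, 0]   # one large increase, then nothing

f = lambda r: r ** 2  # example of a nonlinear (but monotone on r >= 0) transform

print(sum(rewards_a), sum(rewards_b))
# 6 vs 5 -> behaviour A is better on the raw rewards
print(sum(f(r) for r in rewards_a), sum(f(r) for r in rewards_b))
# 18 vs 25 -> behaviour B looks better after the transform
```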
Therefore, I propose we drop this part about modifying the input data. An interesting task, however, would be to analyze the data for possible faults/interesting characteristics, which will help us understand the results later on.
Well, I did some research about this too. Currently our aim is to better predict / distinguish between the <component, failure> combinations. Some important points about normalization:
- the type of distribution doesn't change (a quick numerical check of this is sketched after the list)
- it allows the agent to distinguish good and bad actions more effectively
- it reduces training time
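For the first point, here is a small sanity check (with made-up, skewed stand-in data) showing that min-max scaling and z-scoring are linear transforms and therefore leave skewness and kurtosis, i.e. the shape of the distribution, unchanged:

```python
# Min-max normalization and standardization are positive affine transforms,
# so they do not change skewness or (excess) kurtosis. Sample data is made up.
import numpy as np
from scipy.stats import skew, kurtosis

rewards = np.random.gamma(shape=2.0, scale=3.0, size=10_000)  # skewed stand-in reward data

minmax = (rewards - rewards.min()) / (rewards.max() - rewards.min())  # scaled to [0, 1]
zscore = (rewards - rewards.mean()) / rewards.std()                   # mean 0, std 1

for name, x in [("raw", rewards), ("min-max", minmax), ("z-score", zscore)]:
    print(f"{name:8s} skew={skew(x):.3f}  kurtosis={kurtosis(x):.3f}")
```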
Then I read a little bit more about it, and normalization could also help to deal with a non-stationary environment: since in reinforcement learning the behavior policy can change during learning, the distribution and magnitude of the values change as well. An approach to deal with that is presented here: https://arxiv.org/pdf/1602.07714.pdf
Currently I don't have a proven approach for how to implement normalisation (since it isn't always just about scaling to [-1; 1]), but I would prefer not to drop this idea right now.
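One simple option in that direction would be online normalization with running statistics. The sketch below is just that baseline, not the adaptive target-normalization (Pop-Art) method from the linked paper, and all names in it are placeholders:

```python
# Minimal sketch: normalize rewards with running mean/std so their scale stays
# comparable even if the reward distribution drifts during learning.
# This is a simple baseline, not the algorithm from arXiv:1602.07714.
import math


class RunningRewardNormalizer:
    """Tracks mean/variance of observed rewards with Welford's online algorithm."""

    def __init__(self, epsilon: float = 1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0           # sum of squared deviations from the running mean
        self.epsilon = epsilon  # avoids division by zero early on

    def update(self, reward: float) -> None:
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    def normalize(self, reward: float) -> float:
        std = math.sqrt(self.m2 / max(self.count - 1, 1))
        return (reward - self.mean) / (std + self.epsilon)


# Hypothetical usage inside a training loop:
# normalizer = RunningRewardNormalizer()
# normalizer.update(reward)
# agent.learn(state, action, normalizer.normalize(reward), next_state)
```

Note that this centering/scaling is itself a (time-varying) affine transformation of the rewards, so it ties back to the discussion above about which transformations leave the objective intact.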
@brrrachel and @2start (Nico)