Hello, I tried to replicate the Crawler example, but my agent isn't learning. Maybe I didn't set up its rewards and observations correctly, or I picked the wrong config? Here's the code:
CONFIG:
behaviors:
  Agent1:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2048
      buffer_size: 20480
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
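For anyone sanity-checking these values, here is a rough sketch of the arithmetic they imply. It assumes the usual PPO behaviour of making `num_epoch` passes over the buffer in minibatches of `batch_size`, and uses `1 / (1 - gamma)` as a common rule of thumb for how far ahead rewards matter. The function name `derived_quantities` is hypothetical and not part of ML-Agents; the numbers are copied from the config above.

```python
# Values copied from the config above; everything derived is a rough heuristic.
config = {
    "batch_size": 2048,
    "buffer_size": 20480,
    "num_epoch": 3,
    "gamma": 0.995,
    "max_steps": 10_000_000,
    "summary_freq": 30_000,
}


def derived_quantities(cfg):
    """Hypothetical helper: derive a few quantities from the trainer config."""
    minibatches = cfg["buffer_size"] // cfg["batch_size"]
    return {
        # The buffer should split evenly into minibatches.
        "buffer_is_multiple_of_batch": cfg["buffer_size"] % cfg["batch_size"] == 0,
        "minibatches_per_epoch": minibatches,
        # Assumes num_epoch full passes over the buffer per policy update.
        "gradient_steps_per_update": minibatches * cfg["num_epoch"],
        "policy_updates_in_run": cfg["max_steps"] // cfg["buffer_size"],
        "summaries_in_run": cfg["max_steps"] // cfg["summary_freq"],
        # Rule-of-thumb reward horizon in steps for the discount factor.
        "reward_horizon_steps": round(1.0 / (1.0 - cfg["gamma"])),
    }


if __name__ == "__main__":
    for name, value in derived_quantities(config).items():
        print(f"{name}: {value}")
```

Nothing here looks inconsistent on its own, so if the agent still doesn't learn, the rewards and observations in the agent script are probably the next place to look.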