Updates to the JumperHard env #31

Open · wants to merge 2 commits into main

Conversation

@Ivan-267 (Collaborator) commented on Apr 3, 2024

  • Fixes some issues with initial high rewards due to static platforms detecting collisions with goal areas,
  • adjusts rewards,
  • other slight modifications.

I was able to train this with relatively good results using rllib, but I didn't initially have the trained onnx (now added, see below), as I found out that the rllib script exports a different output shape than our sb3 export.
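To compare the two exports, the output names and shapes of an onnx file can be checked with the onnx Python package. A minimal sketch (the file names below are placeholders, not the actual exported files):

import onnx

# Print each model's output names and shapes so the rllib and sb3 exports can be compared.
for path in ("model_rllib.onnx", "model_sb3.onnx"):  # placeholder file names
    model = onnx.load(path)
    for out in model.graph.output:
        dims = [d.dim_value for d in out.type.tensor_type.shape.dim]
        print(path, out.name, dims)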

Ivan-267 added 2 commits April 3, 2024 23:30
- Fixes some issues with initial high rewards due to static platforms detecting collisions with goal areas,
- adjusts rewards,
- other slight modifications.
- Updates plugin to support rllib export and sb3 export
- Updates onnx file, trained with rllib
@Ivan-267 (Collaborator, Author) commented on Apr 5, 2024

Updated to the latest experimental multi-agent plugin (so this should not be merged before that), and the onnx was re-trained using rllib.

Continuous actions (included onnx is from this session):

Tensorboard stats (smoothing: 0): [TensorBoard screenshots]

Hyperparams used (training stopped manually with CTRL + C); a sketch of how this config can be launched follows the config:

algorithm: PPO

# Multi-agent-env setting:
# If true:
# - Any AIController with done = true will receive zeroes as action values until all AIControllers are done; the episode ends at that point.
# - ai_controller.needs_reset will also be set to true every time a new episode begins (but you can ignore it in your env if needed).
# If false:
# - AIControllers auto-reset in Godot and will receive actions after setting done = true.
# - Each AIController has its own episodes that can end/reset at any point.
# Set to false if you have a single policy name for all agents set in AIControllers
env_is_multiagent: false

checkpoint_frequency: 20

# You can set one or more stopping criteria
stop:
    #episode_reward_mean: 0
    #training_iteration: 1000
    #timesteps_total: 10000
    time_total_s: 10000000

config:
    env: godot
    env_config:
        env_path: 'JumperHard.console.exe' # Set your env path here (exported executable from Godot) - e.g. 'env_path.exe' on Windows
        action_repeat: null # Doesn't need to be set here, you can set this in sync node in Godot editor as well
        show_window: true # Displays game window while training. Might be faster when false in some cases, turning off also reduces GPU usage if you don't need rendering.
        speedup: 30 # Speeds up Godot physics

    framework: torch # ONNX models exported with torch are compatible with the current Godot RL Agents Plugin

    lr: 0.0003
    lambda: 0.95
    gamma: 0.99

    vf_loss_coeff: 0.5
    vf_clip_param: .inf
    #clip_param: 0.2
    entropy_coeff: 0.0001
    entropy_coeff_schedule: null
    #grad_clip: 0.5

    normalize_actions: False
    clip_actions: True # During onnx inference we simply clip the actions to [-1.0, 1.0] range, set here to match

    rollout_fragment_length: 32
    sgd_minibatch_size: 128
    num_workers: 4
    num_envs_per_worker: 16
    train_batch_size: 2048

    num_sgd_iter: 4
    batch_mode: truncate_episodes

    num_gpus: 0
    model:
        vf_share_layers: False
        fcnet_hiddens: [64, 64]
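
For context, a config like the one above can be loaded and handed to ray.tune; a rough sketch (not the project's actual training script, and it assumes the Godot environment has already been registered with Ray under the name 'godot', as the training script that consumes this config is expected to do):

import yaml
import ray
from ray import tune

# Load the experiment description (placeholder file name for the YAML shown above).
with open("rllib_config.yaml") as f:
    experiment = yaml.safe_load(f)

ray.init()
tune.run(
    experiment["algorithm"],                      # "PPO"
    stop=experiment["stop"],                      # e.g. {"time_total_s": 10000000}
    checkpoint_freq=experiment["checkpoint_frequency"],
    config=experiment["config"],                  # env, env_config and PPO hyperparams
)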

Discrete actions:

A smaller training session with discrete actions (just for testing; this onnx is not included):
Relevant env code changes (AIController3D.gd):

func set_action(action):
	# Remap the 3-way discrete move/turn actions from {0, 1, 2} to {-1, 0, 1}
	_player.move_action = action.move - 1
	_player.turn_action = action.turn - 1
	# Jump is a binary action (size 2), passed through directly
	_player.jump_action = action.jump

func get_action_space():
	return {
		"jump": {"size": 2, "action_type": "discrete"},
		"move": {"size": 3, "action_type": "discrete"},
		"turn": {"size": 3, "action_type": "discrete"}
	}
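
For reference, on the Python side this action space roughly corresponds to a Gymnasium Dict space like the one below (an illustration of the mapping, not the plugin's actual conversion code):

from gymnasium import spaces

# Rough Python-side equivalent of the GDScript action space above (illustrative only).
action_space = spaces.Dict({
    "jump": spaces.Discrete(2),  # 0 or 1
    "move": spaces.Discrete(3),  # 0, 1, 2 (shifted to -1/0/+1 in set_action)
    "turn": spaces.Discrete(3),  # 0, 1, 2 (shifted to -1/0/+1 in set_action)
})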

Results: [TensorBoard screenshots]

@@ -1,4 +1,4 @@
-<Project Sdk="Godot.NET.Sdk/4.0.3">
+<Project Sdk="Godot.NET.Sdk/4.3.0-dev.5">
Owner commented:

If possible it would probably be best not to have a dev version here.

Collaborator (Author) replied:

I think this can be reverted and the project should still work (it also works with the dev version). Will have to try it out later.
