Updates to the JumperHard env #31

Open · wants to merge 2 commits into main

Conversation

@Ivan-267 (Collaborator) commented on Apr 3, 2024

  • Fixes some issues with initial high rewards due to static platforms detecting collisions with goal areas,
  • adjusts rewards,
  • other slight modifications.

I was able to train this with relatively good results using rllib, but I didn't initially have the trained onnx (now added, see below), as I found out that the rllib script exports a different output shape than our sb3 export.
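To compare the two exports, the output names and shapes of an onnx file can be checked with the onnx Python package. A minimal sketch (the file names below are placeholders, not the actual exported files):

import onnx

# Print each model's output names and shapes so the rllib and sb3 exports can be compared.
for path in ("model_rllib.onnx", "model_sb3.onnx"):  # placeholder file names
    model = onnx.load(path)
    for out in model.graph.output:
        dims = [d.dim_value for d in out.type.tensor_type.shape.dim]
        print(path, out.name, dims)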

Ivan-267 added 2 commits April 3, 2024 23:30
- Fixes some issues with initial high rewards due to static platforms detecting collisions with goal areas,
- adjusts rewards,
- other slight modifications.
- Updates plugin to support rllib export and sb3 export
- Updates onnx file, trained with rllib
@Ivan-267 (Collaborator, Author) commented on Apr 5, 2024

Updated to the latest experimental multi-agent plugin (so this should not be merged before that), and the onnx was re-trained using rllib.

Continuous actions (included onnx is from this session):

Tensorboard stats (smoothing: 0): [TensorBoard screenshots]

Hyperparams used (training stopped manually with CTRL + C); a sketch of how this config can be launched follows the config:

algorithm: PPO

# Multi-agent-env setting:
# If true:
# - Any AIController with done = true will receive zeroes as action values until all AIControllers are done; the episode ends at that point.
# - ai_controller.needs_reset will also be set to true every time a new episode begins (but you can ignore it in your env if needed).
# If false:
# - AIControllers auto-reset in Godot and will receive actions after setting done = true.
# - Each AIController has its own episodes that can end/reset at any point.
# Set to false if you have a single policy name for all agents set in AIControllers
env_is_multiagent: false

checkpoint_frequency: 20

# You can set one or more stopping criteria
stop:
    #episode_reward_mean: 0
    #training_iteration: 1000
    #timesteps_total: 10000
    time_total_s: 10000000

config:
    env: godot
    env_config:
        env_path: 'JumperHard.console.exe' # Set your env path here (exported executable from Godot) - e.g. 'env_path.exe' on Windows
        action_repeat: null # Doesn't need to be set here, you can set this in sync node in Godot editor as well
        show_window: true # Displays game window while training. Might be faster when false in some cases, turning off also reduces GPU usage if you don't need rendering.
        speedup: 30 # Speeds up Godot physics

    framework: torch # ONNX models exported with torch are compatible with the current Godot RL Agents Plugin

    lr: 0.0003
    lambda: 0.95
    gamma: 0.99

    vf_loss_coeff: 0.5
    vf_clip_param: .inf
    #clip_param: 0.2
    entropy_coeff: 0.0001
    entropy_coeff_schedule: null
    #grad_clip: 0.5

    normalize_actions: False
    clip_actions: True # During onnx inference we simply clip the actions to [-1.0, 1.0] range, set here to match

    rollout_fragment_length: 32
    sgd_minibatch_size: 128
    num_workers: 4
    num_envs_per_worker: 16
    train_batch_size: 2048

    num_sgd_iter: 4
    batch_mode: truncate_episodes

    num_gpus: 0
    model:
        vf_share_layers: False
        fcnet_hiddens: [64, 64]
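
For context, a config like the one above can be loaded and handed to ray.tune; a rough sketch (not the project's actual training script, and it assumes the Godot environment has already been registered with Ray under the name 'godot', as the training script that consumes this config is expected to do):

import yaml
import ray
from ray import tune

# Load the experiment description (placeholder file name for the YAML shown above).
with open("rllib_config.yaml") as f:
    experiment = yaml.safe_load(f)

ray.init()
tune.run(
    experiment["algorithm"],                      # "PPO"
    stop=experiment["stop"],                      # e.g. {"time_total_s": 10000000}
    checkpoint_freq=experiment["checkpoint_frequency"],
    config=experiment["config"],                  # env, env_config and PPO hyperparams
)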

Discrete actions:

A smaller training session with discrete actions (just for testing; this onnx is not included):
Relevant env code changes (AIController3D.gd):

func set_action(action):
	# Remap the 3-way discrete move/turn actions from {0, 1, 2} to {-1, 0, 1}
	_player.move_action = action.move - 1
	_player.turn_action = action.turn - 1
	# Jump is a binary action (size 2), passed through directly
	_player.jump_action = action.jump

func get_action_space():
	return {
		"jump": {"size": 2, "action_type": "discrete"},
		"move": {"size": 3, "action_type": "discrete"},
		"turn": {"size": 3, "action_type": "discrete"}
	}
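
For reference, on the Python side this action space roughly corresponds to a Gymnasium Dict space like the one below (an illustration of the mapping, not the plugin's actual conversion code):

from gymnasium import spaces

# Rough Python-side equivalent of the GDScript action space above (illustrative only).
action_space = spaces.Dict({
    "jump": spaces.Discrete(2),  # 0 or 1
    "move": spaces.Discrete(3),  # 0, 1, 2 (shifted to -1/0/+1 in set_action)
    "turn": spaces.Discrete(3),  # 0, 1, 2 (shifted to -1/0/+1 in set_action)
})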

Results: [TensorBoard screenshots]

@@ -1,4 +1,4 @@
-<Project Sdk="Godot.NET.Sdk/4.0.3">
+<Project Sdk="Godot.NET.Sdk/4.3.0-dev.5">
Owner commented:

If possible it would probably be best not to have a dev version here.

Collaborator (Author) replied:

I think this can be reverted and the project should still work (it also works with the dev version). Will have to try it out later.
