How can I get a trained policy for the Isaac-Repose-Cube-Allegro-v0 task? #1442

cold-young · 2024-11-20T08:07:31Z

Hi there,

I have been trying to train the Isaac-Repose-Cube-Allegro-v0 task with the skrl PPO algorithm
(num_envs 2048, <1 hour).
Even though the reward curve has converged, I cannot get an expert policy.
I think I need better-shaped rewards or another learning approach, such as imitation learning.
By the way, I found a good demonstration of in-hand manipulation with the Shadow Hand on this page.
What should I do to obtain an expert policy?

Thanks

RandomOakForest · 2024-11-20T20:21:48Z

Maintainer

Thanks for posting this. By expert policy, do you mean the final trained policy? You can find trained policies in this directory:

./IsaacLab/logs/<learning_library_name>/<task_name>
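As a minimal sketch of how to pick the most recent checkpoint out of that directory: the helper below walks a run folder and returns the newest matching file. The function name `latest_checkpoint` and the `*.pt` pattern are assumptions made for illustration; the actual checkpoint file names and subfolder layout depend on the learning library, so adjust the pattern to whatever your run wrote.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(run_root: Path, pattern: str = "*.pt") -> Optional[Path]:
    """Return the most recently modified file under run_root matching pattern.

    The "*.pt" default is an assumption, not a documented naming scheme.
    Returns None when nothing matches.
    """
    # Search recursively, since runs are usually nested in timestamped folders.
    candidates = [p for p in run_root.rglob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Newest by modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Call it with the log folder for your library and task, for example `latest_checkpoint(Path("logs") / "<learning_library_name>" / "<task_name>")` with the placeholders filled in.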

cold-young Nov 21, 2024
Author

Thank you for your answer.

I mean that I cannot get a well-trained policy with the "manager_based" Isaac-Repose-Cube-Allegro-v0 task.

I was able to obtain an expert policy with the "Isaac-Repose-Cube-Allegro-Direct-v0" task.

Although the "manager_based" task makes it easy to adjust many parameters (in rewards and states), I cannot get an expert policy from it. (Maybe I need to change many parameters in the Isaac-Repose-Cube-Allegro-v0 task.)
How can I get a well-trained policy with the "Isaac-Repose-Cube-Allegro-v0" task?
The default settings do not train successfully.

StrainFlow · 2024-11-21T15:53:30Z

Maintainer

I think the first step is for us to reproduce what you're seeing. Can you give me reproduction steps for your two policies so I can see if I get the same results? Are you using the samples found here directly:

https://isaac-sim.github.io/IsaacLab/main/source/overview/environments.html

Are these also the shadow hand examples you're looking for?

cold-young Nov 25, 2024
Author

Hi,

I used exactly these examples without any changes:
"./isaaclab.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Repose-Cube-Allegro-v0 --num_envs 4096 --headless" and
"./isaaclab.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Repose-Cube-Allegro-Direct-v0 --num_envs 4096 --headless"

I think "Isaac-Repose-Cube-Allegro-Direct-v0" has well-defined parameters (rewards), but "Isaac-Repose-Cube-Allegro-v0" does not.
I also found that the randomization in "Isaac-Repose-Cube-Allegro-v0" is more varied, including mass and texture.

Thanks for your help.
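One way to make the comparison between the two tasks concrete is to dump each task's reward and randomization settings into nested dictionaries and list the entries that differ. The sketch below is a generic dictionary diff; `diff_configs` is a hypothetical helper, and the example keys and values are placeholders, not the real settings of either task.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # marker for a key present in only one config


def diff_configs(
    a: Dict[str, Any], b: Dict[str, Any], prefix: str = ""
) -> Dict[str, Tuple[Any, Any]]:
    """Return {dotted_key: (value_in_a, value_in_b)} for every differing leaf."""
    out: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        va, vb = a.get(key, MISSING), b.get(key, MISSING)
        if isinstance(va, dict) and isinstance(vb, dict):
            # Recurse into nested sections such as rewards or events.
            out.update(diff_configs(va, vb, prefix=path + "."))
        elif va != vb:
            out[path] = (va, vb)
    return out


# Placeholder values for illustration only; fill these in from the
# configs you actually trained with.
manager_based = {"rewards": {"example_term": 1.0}, "events": {"randomize_mass": True}}
direct = {"rewards": {"example_term": 2.0}}

for name, (left, right) in diff_configs(manager_based, direct).items():
    print(f"{name}: manager-based={left!r}, direct={right!r}")
```

A listing like this narrows down which reward weights or randomization terms to align first when one variant trains and the other does not.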
