Implement AIRL #36

Open
keiohta opened this issue Jun 2, 2019 · 32 comments

Owner

keiohta commented Jun 2, 2019

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

keiohta added this to the IRL milestone Jun 2, 2019
Owner Author

keiohta commented Dec 19, 2019

Test code

# Generate trajectories
$ python examples/run_sac.py --env-name HalfCheetah-v2 --save-test-path --test-interval 50000 --gpu -1
$ ls results
20191220T185529.974847_SAC_

$ python examples/run_airl_sac.py --env-name HalfCheetah-v2 --test-interval 10000 --gpu -1 --expert-path-dir results/20191220T185529.974847_SAC_


haoyu-x commented Jun 27, 2020

Hi @keiohta, when I run the following command, I get the error shown below it:
$ python ~/tf2rl-master/examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir ~/GAIL/results/20200619T013740.036943_SAC_ --gpu -1 --dir-suffix GAIfO

run_gaifo_ddpg.py: error: unrecognized arguments: --gpu -1

Can you help me? Thank you!

Owner Author

keiohta commented Jun 27, 2020

@haoyu-x Hi! Thanks for reporting the bug. I fixed the error in this commit, so can you try again on the latest master branch?


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

Yeah, did you update the code?


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

At least the --gpu error is resolved.
Let me check whether the full script runs.


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

I confirmed the script runs on my machine. Can you provide me with the full error message?

$ python examples/run_sac.py --env-name=HalfCheetah-v2 --save-test-path --test-interval=50000 --max-steps 300000
$ ls results
20200627T221712.423081_SAC_
$ find results/20200627T221712.423081_SAC_/ -name '*.pkl'
results/20200627T221712.423081_SAC_/step_00050000_epi_02_return_02744.1677.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_04_return_02701.9388.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_00_return_03121.5797.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_01_return_02784.6256.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_03_return_02752.4279.pkl

$ python examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20200627T221712.423081_SAC_/ --gpu -1
...
22:23:48.107 [INFO] (irl_trainer.py:74) Total Epi:    19 Steps:   19000 Episode Steps:  1000 Return:  1174.4017 FPS: 118.79
22:23:56.162 [INFO] (irl_trainer.py:74) Total Epi:    20 Steps:   20000 Episode Steps:  1000 Return:  1889.9691 FPS: 124.15
22:23:57.861 [INFO] (irl_trainer.py:118) Evaluation Total Steps:   20000 Average Reward  2278.0820 over  5 episodes
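
As a quick sanity check on an expert directory, the file names listed above can be parsed to see which step, episode, and return each saved trajectory corresponds to. The following is only a sketch: the pattern is inferred from the five file names in this comment, and `parse_trajectory_name` and `summarize_expert_dir` are hypothetical helpers, not part of tf2rl.

```python
import re
from pathlib import Path

# Pattern inferred from names such as
# step_00050000_epi_02_return_02744.1677.pkl (an assumption, not a tf2rl spec).
PKL_NAME = re.compile(r"step_(\d+)_epi_(\d+)_return_(-?\d+(?:\.\d+)?)\.pkl$")


def parse_trajectory_name(name):
    """Return (step, episode, episode_return), or None if the name does not match."""
    match = PKL_NAME.search(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def summarize_expert_dir(expert_dir):
    """List (path, step, episode, episode_return) for every matching .pkl file."""
    rows = []
    for path in sorted(Path(expert_dir).rglob("*.pkl")):
        parsed = parse_trajectory_name(path.name)
        if parsed is not None:
            rows.append((path, *parsed))
    return rows
```

Calling `summarize_expert_dir("results/20200627T221712.423081_SAC_")` would then give one row per saved episode, which makes it easy to confirm that the directory passed to --expert-path-dir actually contains trajectories.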


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

Oh, I assumed you had installed tf2rl in development mode...
I have not yet published my change to PyPI, so I will do that now.


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

You can now get the latest code from PyPI. Can you try the following?

# Update tf2rl
$ pip install -U tf2rl
# Make sure the version is 0.1.14
$ pip list | grep tf2rl

# Run your script
$ python ~/tf2rl-master/examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir ~/GAIL/results/20200619T013740.036943_SAC_ --gpu -1 --dir-suffix GAIfO

By the way, your path ~/tf2rl-master suggests that you did not install tf2rl with git clone but downloaded the zip file instead, is that right?
In any case, the pip list command above shows the installed version, so please let me know if you still encounter the same problem.
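
The same version check can be done from Python, which is convenient when the example scripts come from a downloaded zip while the library itself comes from pip. This is a minimal sketch using only the standard library; `version_tuple` and `installed_at_least` are hypothetical helper names, and the comparison only looks at the leading number of each dotted component.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn a dotted version such as '0.1.14' into (0, 1, 14)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        # Components without a leading number count as 0 in this sketch.
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def installed_at_least(package, minimum):
    """Return True if `package` is installed with version >= `minimum`."""
    try:
        found = version(package)
    except PackageNotFoundError:
        return False
    return version_tuple(found) >= version_tuple(minimum)
```

For example, `installed_at_least("tf2rl", "0.1.14")` should be true once the upgrade above has succeeded.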


haoyu-x commented Jun 27, 2020 via email


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

I cannot see your screenshot.
Can you copy the message or retry uploading the picture?


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

I guess you collected the expert transitions in a different environment (such as Pendulum-v0, whose state dimension is 3).
Are you sure the expert data were collected on HalfCheetah-v2?
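
A mismatch like this can be caught before training by comparing the width of the expert observations with the environment's observation dimension. The sketch below assumes the expert observations have already been loaded into a list of rows; how they are stored in the .pkl files is not shown in this thread, and `check_obs_dim` is a hypothetical helper, not a tf2rl function.

```python
def check_obs_dim(expert_obs, env_obs_dim):
    """Raise ValueError unless every expert observation has length env_obs_dim.

    expert_obs: sequence of observation vectors (assumed to be loaded already).
    env_obs_dim: e.g. env.observation_space.shape[0] for the target environment.
    """
    dims = {len(row) for row in expert_obs}
    if dims != {env_obs_dim}:
        raise ValueError(
            f"expert observation dims {sorted(dims)} do not match "
            f"environment observation dim {env_obs_dim}"
        )


# HalfCheetah-v2 observations have 17 entries, Pendulum-v0 observations have 3.
check_obs_dim([[0.0] * 17, [0.1] * 17], env_obs_dim=17)  # passes silently
```

Passing 3-dimensional rows with `env_obs_dim=17` raises a ValueError, which points at the data collection step rather than at the imitation learning script.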


haoyu-x commented Jun 27, 2020 via email


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

It's great that your script runs successfully!
I still cannot see your picture... To view the results, I just run:

$ tensorboard --logdir results

Does this answer your question?


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

You can add a suffix to the results directory name with the --dir-suffix option. #67 uses it as follows:

$ python examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix GAIL
$ python examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix GAIfO
$ python examples/run_vail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix VAIL


haoyu-x commented Jun 27, 2020 via email

Owner Author

keiohta commented Jun 27, 2020

My pleasure! Please don't hesitate to open an issue if you run into any difficulty or have a question.
I'll close this issue. Thanks for the report!

keiohta closed this as completed Jun 27, 2020
Owner Author

keiohta commented Jun 27, 2020

OMG, this issue is not related to your question, so I have to reopen it.
It would be better to open a new issue when a question is not related to the original one ;)

keiohta reopened this Jun 27, 2020

haoyu-x commented Jun 27, 2020 via email


haoyu-x commented Jul 5, 2020 via email

Contributor

ymd-h commented Jul 5, 2020

Hi @haoyu-x,

Could you open a new issue?

This issue is where developers track and discuss the AIRL implementation.

It seems to me that your problem is not related to the main topic of this issue.

Owner Author

keiohta commented Jul 6, 2020

Thanks @yamada-github-account and @haoyu-x. Yes, I also think it would be better to open a new issue for this.


Aadit-Ambadkar commented Apr 20, 2022

@keiohta I can't seem to find the run-airl-****.py files anywhere. Is this a commit issue? Am I missing something?

Owner Author

keiohta commented Apr 20, 2022

Hi @Aadit-Ambadkar, we haven't fully tested AIRL yet, but you can try it on a different branch: https://github.com/keiohta/tf2rl/blob/airl/examples/run_airl_sac.py


4 participants