Implement AIRL #36
Comments
Test code
# Generate trajectories
$ python examples/run_sac.py --env-name HalfCheetah-v2 --save-test-path --test-interval 50000 --gpu -1
$ ls results
20191220T185529.974847_SAC_
$ python examples/run_airl_sac.py --env-name HalfCheetah-v2 --test-interval 10000 --gpu -1 --expert-path-dir results/20191220T185529.974847_SAC_ |
Hi @keiohta, when I run run_gaifo_ddpg.py I get the error: unrecognized arguments: --gpu -1. Can you help me? Thank you! |
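For readers hitting the same message: this is the text Python's argparse prints when a script's parser has no definition for a flag it is given. A minimal sketch that reproduces the situation follows; the parser, its program name, and the build_parser helper are hypothetical and are not tf2rl's actual argument setup.

```python
import argparse


def build_parser(with_gpu):
    """Build a toy parser; only the with_gpu=True variant knows --gpu."""
    parser = argparse.ArgumentParser(prog="run_example.py")
    parser.add_argument("--env-name", type=str, default="HalfCheetah-v2")
    if with_gpu:
        # -1 is parsed as a negative integer, not as another option.
        parser.add_argument("--gpu", type=int, default=0)
    return parser


argv = ["--env-name", "HalfCheetah-v2", "--gpu", "-1"]

# With --gpu defined, the arguments parse cleanly.
args = build_parser(with_gpu=True).parse_args(argv)

# Without it, parse_known_args hands back the tokens that parse_args
# would reject with "unrecognized arguments".
_, unknown = build_parser(with_gpu=False).parse_known_args(argv)
print(args.gpu, unknown)
```

If the installed package is older than the example script, the script can pass a flag that the installed parser does not yet define, which would produce exactly this kind of mismatch.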
@haoyu-x Hi! Thanks for reporting the bug. I fixed the error in commit ab675d0, so can you try again on the latest master branch? |
Should I still use the same command suggested in issue #67?
When I run
python ~/tf2rl-master/examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir ~/GAIL/results/20200619T013740.036943_SAC_ --gpu -1 --dir-suffix GAIL
I get the same error.
|
Yeah, did you update the code? |
Yes, I updated. Can you run GAIL and GAIfO on your computer?
|
At least I resolved the error of --gpu.
Let me check whether the full code runs. |
Is there any other way to run GAIL and GAIfO besides the command line?
|
I confirmed the script runs on my machine. Can you provide me with the full error message?
$ python examples/run_sac.py --env-name=HalfCheetah-v2 --save-test-path --test-interval=50000 --max-steps 300000
$ ls results
20200627T221712.423081_SAC_
$ find results/20200627T221712.423081_SAC_/ -name *.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_02_return_02744.1677.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_04_return_02701.9388.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_00_return_03121.5797.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_01_return_02784.6256.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_03_return_02752.4279.pkl
$ python examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20200627T221712.423081_SAC_/ --gpu -1
...
22:23:48.107 [INFO] (irl_trainer.py:74) Total Epi: 19 Steps: 19000 Episode Steps: 1000 Return: 1174.4017 FPS: 118.79
22:23:56.162 [INFO] (irl_trainer.py:74) Total Epi: 20 Steps: 20000 Episode Steps: 1000 Return: 1889.9691 FPS: 124.15
22:23:57.861 [INFO] (irl_trainer.py:118) Evaluation Total Steps: 20000 Average Reward 2278.0820 over 5 episodes
|
[image: Screenshot from 2020-06-27 21-38-20.png]
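The saved trajectory files in the listing above appear to encode the training step, the evaluation episode, and the episode return in their names. A small sketch, assuming only that naming pattern, for listing the files and picking the highest-return one before passing a directory to --expert-path-dir; the helper names parse_trajectory_name and best_trajectory are hypothetical and not part of tf2rl.

```python
import re
from pathlib import Path

# Pattern inferred from the filenames in the listing; an assumption, not a tf2rl API.
PATTERN = re.compile(r"step_(\d+)_epi_(\d+)_return_(-?\d+\.\d+)\.pkl$")


def parse_trajectory_name(name):
    """Return (step, episode, episode_return), or None if the name does not match."""
    match = PATTERN.search(name)
    if match is None:
        return None
    step, episode, episode_return = match.groups()
    return int(step), int(episode), float(episode_return)


def best_trajectory(result_dir):
    """Return the path of the highest-return .pkl under result_dir, or None."""
    scored = []
    for path in Path(result_dir).rglob("*.pkl"):
        parsed = parse_trajectory_name(path.name)
        if parsed is not None:
            scored.append((parsed[2], path))
    if not scored:
        return None
    # Compare on the return value only.
    return max(scored, key=lambda item: item[0])[1]
```

Calling best_trajectory("results/20200627T221712.423081_SAC_") would return one of the five paths listed above, or None if the directory holds no matching files.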
|
Oh, I assumed you installed tf2rl in developer mode...
I have not yet released my change to PyPI, so I will do that now. |
Sure. Please let me know what I should do after your change. Thank you a lot!
|
Now you can get the latest code through PyPI. Can you try the following?
# Update tf2rl
$ pip install -U tf2rl
# Make sure the version is 0.1.14
$ pip list | grep tf2rl
# Run your script
$ python ~/tf2rl-master/examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir ~/GAIL/results/20200619T013740.036943_SAC_ --gpu -1 --dir-suffix GAIfO
By the way, your path ~/tf2rl-master suggests that you did not install tf2rl using git clone but just downloaded the zip file, didn't you?
Anyway, the commands above show the installed version, so please let me know if you still encounter the same problem. |
Problem fixed. But I am encountering another issue. :(
|
[image: Screenshot from 2020-06-27 21-54-05.png]
|
I cannot see your screenshot.
Can you copy the message or retry uploading the picture? |
Sure.
21:56:03.468 [INFO] (irl_trainer.py:74) Total Epi: 7 Steps: 7000 Episode Steps: 1000 Return: -327.7823 FPS: 4416.74
21:56:03.713 [INFO] (irl_trainer.py:74) Total Epi: 8 Steps: 8000 Episode Steps: 1000 Return: -262.8208 FPS: 4088.41
21:56:03.955 [INFO] (irl_trainer.py:74) Total Epi: 9 Steps: 9000 Episode Steps: 1000 Return: -325.9061 FPS: 4149.77
21:56:04.268 [INFO] (irl_trainer.py:74) Total Epi: 10 Steps: 10000 Episode Steps: 1000 Return: -278.5830 FPS: 4176.82
Traceback (most recent call last):
  File "/home/haoyux/tf2rl-master/examples/run_gaifo_ddpg.py", line 43, in <module>
    trainer()
  File "/home/haoyux/venv/lib/python3.6/site-packages/tf2rl/experiments/irl_trainer.py", line 113, in __call__
    expert_next_states=self._expert_next_obs[indices])
  File "/home/haoyux/venv/lib/python3.6/site-packages/tf2rl/algos/gaifo.py", line 48, in train
    agent_states, agent_next_states, expert_states, expert_next_states)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 627, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 506, in _initialize
    *args, **kwds))
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2446, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2777, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2667, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 981, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 441, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3299, in bound_method_wrapper
    return wrapped_fn(*args, **kwargs)
  File "/home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 968, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
    /home/haoyux/venv/lib/python3.6/site-packages/tf2rl/algos/gaifo.py:58 _train_body *
        real_logits = self.disc([expert_states, expert_next_states])
    /home/haoyux/venv/lib/python3.6/site-packages/tf2rl/algos/gail.py:29 call *
        features = self.l1(features)
    /home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:886 __call__ **
        self.name)
    /home/haoyux/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_spec.py:216 assert_input_compatibility
        ' but received input with shape ' + str(shape))
    ValueError: Input 0 of layer L1 is incompatible with the layer: expected axis -1 of input shape to have value 34 but received input with shape [32, 6]
(venv) haoyux@haoyux-ThinkPad:~$
[image: Screenshot from 2020-06-27 21-54-05.png]
|
I guess you collected the expert transitions on a different environment (such as Pendulum-v0, because the state dimension of Pendulum-v0 is 3).
Are you sure the expert data were collected on HalfCheetah-v2? |
Oh! I made a stupid mistake. Thank you Kei, everything is fine now!
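The numbers in the ValueError are consistent with this diagnosis. The sketch below makes two assumptions. First, the GAIfO discriminator input is the state and next state joined along the last axis; this is inferred from the traceback, not confirmed against the tf2rl source. Second, HalfCheetah-v2 observations have 17 dimensions. Under those assumptions the layer expects 2 × 17 = 34 features, while 3-dimensional expert states give 2 × 3 = 6. The helper names are hypothetical.

```python
import numpy as np


def discriminator_input_dim(state_dim):
    # Assumption: state and next state are concatenated on the last axis.
    return 2 * state_dim


def check_expert_shapes(expert_states, expert_next_states, env_state_dim):
    """Raise ValueError early if the expert data do not match the environment."""
    pair = np.concatenate([expert_states, expert_next_states], axis=-1)
    expected = discriminator_input_dim(env_state_dim)
    if pair.shape[-1] != expected:
        raise ValueError(
            f"expert pairs have last axis {pair.shape[-1]}, expected {expected}; "
            "were the expert data collected on a different environment?"
        )
    return pair


batch = 32
pendulum_like = np.zeros((batch, 3), dtype=np.float32)   # 3-dim states, as in Pendulum-v0
cheetah_like = np.zeros((batch, 17), dtype=np.float32)   # 17-dim states, as in HalfCheetah-v2
```

Running such a check on the loaded expert arrays before training starts would report the mismatch without the long TensorFlow traceback.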
|
One last question: how can I make a TensorBoard figure like yours from the command line?
[image: Screenshot from 2020-06-27 22-28-13.png]
|
It's great that your script runs successfully!
I cannot see your picture again... I just run:
$ tensorboard --logdir results
Does this answer your question? |
I mean, how can I visualize the training process using TensorBoard?
The figure is in #67.
|
You can add a suffix to a results directory with the --dir-suffix option.
#67 uses it as:
$ python examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix GAIL
$ python examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix GAIfO
$ python examples/run_vail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix VAIL
|
Yes! Thank you!
|
My pleasure! Please don't hesitate to open an issue if you run into any difficulty or have a question. |
OMG, this issue is not related to your question. So, I have to reopen this one.
It would be better to open a new issue if it is not related to the original one ;) |
Thank you again!
|
Hi Kei,
I'm using tf2rl's GAIfO on robosuite:
https://github.com/gal-leibovich/robosuite
but I get an error: mujoco_py.builder.MujocoException: Unknown warning type Time = 1.3900.Check for NaN in simulation.
I found that my policy network outputs the action [nan nan nan nan nan nan nan nan] after several episodes of training. It happens on robosuite all the time, but works well on gym.
I'm wondering if you can offer me some help. Thank you!
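One way to narrow down a problem like this is to check each action for non-finite values before it reaches the simulator, so the failure surfaces at the policy output rather than inside MuJoCo. A minimal NumPy sketch follows; the helper name and the choice to raise at the first bad step are illustrative assumptions, not tf2rl or robosuite functionality.

```python
import numpy as np


def assert_finite_action(action, step):
    """Return the action as a float array, or raise if it contains NaN or inf."""
    action = np.asarray(action, dtype=np.float64)
    if not np.all(np.isfinite(action)):
        raise FloatingPointError(f"non-finite action at step {step}: {action}")
    return action


good = assert_finite_action([0.1, -0.2, 0.3], step=0)
```

Calling this right before the environment step pins down the first training step at which the policy diverges, which helps separate a policy problem from a simulator problem.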
|
Hi @haoyu-x, could you open a new issue? This is the issue where developers track and discuss the AIRL implementation. To me, your problem is not related to the main topic of this issue. |
Thanks @yamada-github-account. @haoyu-x, yes, I also think it would be better to open a new issue for this. |
@keiohta I can't seem to find the run-airl-****.py files anywhere. Is this a commit issue? Am I missing something? |
Hi @Aadit-Ambadkar, we haven't fully tested AIRL yet, but you can try it on a different branch: https://github.com/keiohta/tf2rl/blob/airl/examples/run_airl_sac.py |
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning