Bug: Command line argument 'vasp_gam' was not understood. #265

Open
ryotatomioka opened this issue Jun 19, 2023 · 4 comments

Comments

@ryotatomioka

System

  • Custodian version: 2023.6.5
  • Python version: 3.9.16
  • OS version: Ubuntu 20.04.6 LTS

Summary

I believe a bug was introduced in this commit around these lines

if not is_killed:
    logger.warning(f"killing vasp processes in work dir {workdir} failed. Resorting to 'killall'.")
    cmds = self.vasp_cmd
    if self.gamma_vasp_cmd:
        cmds += self.gamma_vasp_cmd
    for k in cmds:
        if "vasp" in k:
            subprocess.run(["killall", f"{k}"])

The problem is that both self.vasp_cmd and self.gamma_vasp_cmd can be lists! In that case, cmds is just another name for self.vasp_cmd, so cmds += self.gamma_vasp_cmd extends self.vasp_cmd in place, and self.gamma_vasp_cmd is appended to self.vasp_cmd every time the terminate method is called. This is the case when custodian is called from atomate2. See:
https://github.com/materialsproject/atomate2/blob/02e44c038903d2c935c82b31afd8ab82a69c039e/src/atomate2/vasp/run.py#L86-L170
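The aliasing can be reproduced without VASP or custodian. The following is a minimal sketch: FakeJob is a hypothetical stand-in class, and only the attribute names vasp_cmd and gamma_vasp_cmd and the three lines inside terminate are taken from the snippet above.

```python
class FakeJob:
    """Hypothetical stand-in for the job class; not custodian code."""

    def __init__(self, vasp_cmd, gamma_vasp_cmd):
        self.vasp_cmd = vasp_cmd
        self.gamma_vasp_cmd = gamma_vasp_cmd

    def terminate(self):
        cmds = self.vasp_cmd  # alias of the list, not a copy
        if self.gamma_vasp_cmd:
            # For lists, += calls list.__iadd__, which extends the
            # existing object, so self.vasp_cmd grows as well.
            cmds += self.gamma_vasp_cmd
        return list(cmds)  # snapshot of what would be passed to killall


job = FakeJob(["vasp_std"], ["vasp_gam"])
first = job.terminate()
second = job.terminate()
print(first)         # ['vasp_std', 'vasp_gam']
print(second)        # ['vasp_std', 'vasp_gam', 'vasp_gam']
print(job.vasp_cmd)  # ['vasp_std', 'vasp_gam', 'vasp_gam']
```

The growing list matches the growing number of "vasp_gam: no process found" lines in the log below.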

Since the extended self.vasp_cmd is then used to launch the next run, this results in the following line in vasp.out:

Command line argument 'vasp_gam' was not understood.

For some reason, custodian does not see this as an error and keeps applying the same correction until the maximum number of corrections is used up, which results in a confusing error message. It would be better if we could improve the error message as well.

Example code

from atomate2.vasp.jobs.core import RelaxMaker
from jobflow import run_locally
from pymatgen.core import Structure
structure = Structure(
    lattice=[[0, 2.13, 2.13], [2.13, 0, 2.13], [2.13, 2.13, 0]],
    species=["Ba", "O"],
    coords=[[0, 0, 0], [0.5, 0.5, 0.5]]
)
relax_job = RelaxMaker().make(structure)
run_locally(relax_job, create_folders=True)

Error message

2023-06-19 05:56:50,913 INFO Started executing jobs locally
2023-06-19 05:56:50,917 INFO Starting job - relax (d1196769-4075-47b3-8a2f-6087a8c96db5)
ERROR:custodian.custodian:LargeSigmaHandler
WARNING:custodian.vasp.jobs:killing vasp processes in work dir /scratch/job_2023-06-19-05-56-50-916083-15798 failed. Resorting to 'killall'.
vasp_std: no process found
vasp_gam: no process found
ERROR:custodian.custodian:LargeSigmaHandler
WARNING:custodian.vasp.jobs:killing vasp processes in work dir /scratch/job_2023-06-19-05-56-50-916083-15798 failed. Resorting to 'killall'.
vasp_std: no process found
vasp_gam: no process found
vasp_gam: no process found
ERROR:custodian.custodian:LargeSigmaHandler
WARNING:custodian.vasp.jobs:killing vasp processes in work dir /scratch/job_2023-06-19-05-56-50-916083-15798 failed. Resorting to 'killall'.
vasp_std: no process found
vasp_gam: no process found
vasp_gam: no process found
vasp_gam: no process found
ERROR:custodian.custodian:LargeSigmaHandler
WARNING:custodian.vasp.jobs:killing vasp processes in work dir /scratch/job_2023-06-19-05-56-50-916083-15798 failed. Resorting to 'killall'.
vasp_std: no process found
vasp_gam: no process found
vasp_gam: no process found
vasp_gam: no process found
vasp_gam: no process found
ERROR:custodian.custodian:Unrecoverable error for handler: <custodian.vasp.handlers.LargeSigmaHandler object at 0x7f9f367fbfd0>
2023-06-19 06:02:35,068 INFO relax failed with exception:
Traceback (most recent call last):
  File "/opt/miniconda/lib/python3.9/site-packages/jobflow/managers/local.py", line 98, in _run_job
    response = job.run(store=store)
  File "/opt/miniconda/lib/python3.9/site-packages/jobflow/core/job.py", line 544, in run
    response = function(*self.function_args, **self.function_kwargs)
  File "/opt/miniconda/lib/python3.9/site-packages/atomate2/vasp/jobs/base.py", line 147, in make
    run_vasp(**self.run_vasp_kwargs)
  File "/opt/miniconda/lib/python3.9/site-packages/atomate2/vasp/run.py", line 167, in run_vasp
    c.run()
  File "/opt/miniconda/lib/python3.9/site-packages/custodian/custodian.py", line 383, in run
    self._run_job(job_n, job)
  File "/opt/miniconda/lib/python3.9/site-packages/custodian/custodian.py", line 521, in _run_job
    raise NonRecoverableError(s, True, x["handler"])
custodian.custodian.NonRecoverableError: Unrecoverable error for handler: <custodian.vasp.handlers.LargeSigmaHandler object at 0x7f9f367fbfd0>

INFO:jobflow.managers.local:relax failed with exception:
Traceback (most recent call last):
  File "/opt/miniconda/lib/python3.9/site-packages/jobflow/managers/local.py", line 98, in _run_job
    response = job.run(store=store)
  File "/opt/miniconda/lib/python3.9/site-packages/jobflow/core/job.py", line 544, in run
    response = function(*self.function_args, **self.function_kwargs)
  File "/opt/miniconda/lib/python3.9/site-packages/atomate2/vasp/jobs/base.py", line 147, in make
    run_vasp(**self.run_vasp_kwargs)
  File "/opt/miniconda/lib/python3.9/site-packages/atomate2/vasp/run.py", line 167, in run_vasp
    c.run()
  File "/opt/miniconda/lib/python3.9/site-packages/custodian/custodian.py", line 383, in run
    self._run_job(job_n, job)
  File "/opt/miniconda/lib/python3.9/site-packages/custodian/custodian.py", line 521, in _run_job
    raise NonRecoverableError(s, True, x["handler"])
custodian.custodian.NonRecoverableError: Unrecoverable error for handler: <custodian.vasp.handlers.LargeSigmaHandler object at 0x7f9f367fbfd0>

2023-06-19 06:02:35,069 INFO Finished executing jobs locally
INFO:jobflow.managers.local:Finished executing jobs locally
{}

Suggested solution (if known)

  • Use an immutable data type (e.g., tuple) instead of list.
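A minimal sketch of how this could look, assuming the commands are converted to tuples before they are combined. commands_to_kill is a hypothetical helper written for illustration, not a custodian function, and the mpirun command lines are made-up inputs.

```python
def commands_to_kill(vasp_cmd, gamma_vasp_cmd=None):
    """Return the executable names to pass to killall.

    Hypothetical helper, not part of custodian. The inputs are never mutated.
    """
    cmds = tuple(vasp_cmd)
    if gamma_vasp_cmd:
        # For tuples, += builds a new tuple and rebinds the local name,
        # so the caller's vasp_cmd is left untouched.
        cmds += tuple(gamma_vasp_cmd)
    return tuple(k for k in cmds if "vasp" in k)


vasp_cmd = ["mpirun", "-np", "4", "vasp_std"]
gamma_vasp_cmd = ["mpirun", "-np", "4", "vasp_gam"]

# Repeated calls give the same result and leave the inputs unchanged.
for _ in range(3):
    targets = commands_to_kill(vasp_cmd, gamma_vasp_cmd)

print(targets)   # ('vasp_std', 'vasp_gam')
print(vasp_cmd)  # ['mpirun', '-np', '4', 'vasp_std']
```

Storing the commands as tuples on the job object would have the same effect, since cmds += ... could then no longer modify self.vasp_cmd.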

Files

@janosh
Member

janosh commented Jun 20, 2023

@ryotatomioka Thanks for the repro and great analysis!

+1 for making self.vasp_cmd and self.gamma_vasp_cmd immutable.

@MichaelWolloch
Contributor

Good catch @ryotatomioka, and sorry for messing this up. I tried to stick to the old terminate functionality, but messed up at least one indentation. Maybe more.

@janosh, I am happy to make the commands immutable in a PR, but this can probably be included in #264, since that PR also has to do with termination. What do you think?

@janosh
Member

janosh commented Jun 20, 2023

@MichaelWolloch Yes, if @fyalcin would like to include a fix for this in #264, that'd be great!

@Andrew-S-Rosen
Member

@shyuep I think this was closed in #264.
