MPMorph Flows #938

Open

BryantLi-BLI wants to merge 366 commits into materialsproject:main from BryantLi-BLI:feature/mpmorph

+2,322 −8

BryantLi-BLI commented Jul 30, 2024

This PR seeks to port the original MPMorph workflows from atomate to atomate2. These workflows are used to generate equilibrated amorphous/crystalline/glassy structures with MD runs via AIMD.

Tagging in @Tinaatucsd, @mcgalcode, @mattmcdermott, @esoteric-ephemera, @vir-k01, @sivonxay because they have used it recently or are a part of the original development team.

Completed

Volume equilibration workflow meant to equilibrate structure without NPT via a series of NVT runs (Equilibration + production volume)
Code independent equilibration and production workflow, with optional quench methods
VASP workflows used in MP papers below recreated in this PR:
There is an additional feature with this PR to incorporate MLFF MD runs with the same flow.
Multi quench options available:
Fast quench (double relax and static)
Slow quench (series of MD runs to simulate cooling with different temperature profiles)
Started to add tests for VASP
MLFF tests are completed
Pytests
Packmol conversion to pymatgen interface (used to generate the initial amorphous structure)
Slow quench temperature profiles
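As a rough illustration of what a slow-quench temperature profile could look like, the sketch below builds a stepwise linear cooling schedule, with one MD run per temperature. The function name and its parameters are hypothetical and are not taken from this PR; the actual `SlowQuenchMaker` may construct its profile differently.

```python
def linear_quench_temperatures(
    start_temp: float, end_temp: float, temp_step: float
) -> list[float]:
    """Return the temperatures of a stepwise linear cooling schedule.

    Hypothetical helper: each returned temperature would correspond to
    one short MD run in a slow quench.
    """
    if temp_step <= 0:
        raise ValueError("temp_step must be positive")
    if end_temp > start_temp:
        raise ValueError("end_temp must not exceed start_temp for a quench")

    temperatures = []
    temp = start_temp
    # Step down until the next temperature would reach or pass end_temp
    while temp > end_temp:
        temperatures.append(temp)
        temp -= temp_step
    # Always finish exactly at the requested final temperature
    temperatures.append(end_temp)
    return temperatures


schedule = linear_quench_temperatures(3000, 500, 500)
```

If the step size does not evenly divide the temperature range, the last interval is simply shorter, so the schedule still ends at `end_temp`.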

MISC

Seeking feedback for code, missing features, usability

NOTE: Original PR closed and reopened due to some git commit history issues that caused large commit changes. This PR should have resolved the issue. Many thanks to @mcgalcode.

esoteric-ephemera and others added 30 commits

March 1, 2024 15:11


          Fix failing tests

9ea1ef9


          Change MD default TaskDoc ionic step to only store mandatory info


          update pymatgen requirement because of CifParser deprecation

56edc31


          Merge branch 'materialsproject:main' into mlff_md

b52a888


          Significant refactor to all forcefield jobs. Use commmon ASE calc str…

5bbe9d7

…ucture, allow loading via MontyDecoder. Add check in relaxation for force convergence, attr in Forcefield taskdoc


          Merge branch 'main' into mlff_md

656c5d6


          Fix forcefield utils test and lint

21dd709


          Add revert_dtype env for running forcefield relax and md jobs / undo …

9ba7d29

…removal


          Merge branch 'main' into mlff_md

0e8cce1


          refactor nequip jobs, add md default for nequip

8a2451e


          Add GAP and Nequip MD tests, fix arg / kwarg passing in MD

38ad2d6


          remove calculator_args / ase_calculator args, revert phonon job change

ac4916b


          Merge branch 'materialsproject:main' into mlff_md

21af2ae


          remove todo about adding magmoms to forcefield traj observer as that'…

c19b39b

…s now implemented


          Remove comments, add option to seed rng for MB velocities, turn on id…

d95397d

…eal gas stress contribution only when MD trajectory outputs (temp / velocity) are stored


          Merge branch 'materialsproject:main' into mlff_md

770aa3a


          Merge branch 'materialsproject:main' into mlff_md

3fa7b27


          Ensure CHGNet and M3GNet relax / static makers convert stress to eV/A…

8ecc1d1

…0**3; add tests for MD NVE ensemble and specifying MolecularDynamics object as input


          fix M3GNet test related to outdated cached model

7b8f584


          Merge branch 'mlff_md' of https://github.com/esoteric-ephemera/atomate2…

9dd428f

… into feature/mpmorph


          refactor slightly

de912e3


          clean up names for files and docstrings

030dc33


          implemented SlowQuench and FastQuench and finalized version of MPMorp…

516c954

…hVASPMDMaker


          minor updates to imports in mpmorph.py in forcefields and vasp

d6c9499


          finalized format of mpmorph.py for forcefields and fixed some errors …

7362fc8

…in vasp version of mpmorph


          minor typos and errors fix in mpmorph.py of forcefields

53f6177


          modified post init procedure for forcefields mpmorph.py

5ab1ae8


          test for mpmorph.py in forcefields

c08c458


          fix field init

d39cd7e


          remove test notebook

eb8e2fd

Author

BryantLi-BLI commented Oct 1, 2024

@utf @janosh @JaGeo: Should be ready for review!

orionarcher reviewed


src/atomate2/common/flows/mpmorph.py Outdated

Comment on lines 105 to 135

+                          working_outputs["relax"]["energy"] = [
+                              sum(frame[-self.energy_average_frames :]) / self.energy_average_frames
+                              for frame in working_outputs["relax"]["energies"]
+                          ]
+                          self.postprocessor.fit(working_outputs)
+                          working_outputs = dict(self.postprocessor.results)
+                          for k in ("pressure", "energy"):
+                              working_outputs["relax"].pop(k, None)
+                          if (
+                              working_outputs.get("V0") is None
+                          ):  # breaks whole flow here if EOS is not fitted properly
+                              return Response(output=working_outputs, stop_children=True)
+                          if (
+                              working_outputs.get("V0") <= working_outputs.get("Vmax")
+                              and working_outputs.get("V0") >= working_outputs.get("Vmin")
+                          ) or (
+                              self.max_attempts
+                              and (
+                                  len(working_outputs["relax"]["volume"])
+                                  - self.postprocessor.min_data_points
+                              )
+                              >= self.max_attempts
+                          ):
+                              # If the equilibrium volume is within the range of fitted volumes,
+                              # or if the maximum number of attempts has been performed, stop
+                              # and return structure at estimated equilibrium volume
+                              final_structure = structure.copy()
+                              final_structure.scale_lattice(working_outputs["V0"])
+                              return final_structure

Contributor

orionarcher Oct 1, 2024

I found this a bit hard to parse, consider:

            # Fit EOS to the running volume and energy from previous calculations
            self.postprocessor.fit(working_outputs)
            working_outputs = dict(self.postprocessor.results)

            # breaks whole flow here if EOS is not fitted properly
            if working_outputs.get("V0") is None:
                return Response(output=working_outputs, stop_children=True)

            # Check if the number of calculations has reached the maximum
            n_calcs = len(working_outputs["relax"]["volume"])
            max_calcs = self.postprocessor.min_data_points + (self.max_attempts or 1000)
            max_calcs_reached = n_calcs >= max_calcs

            v_0 = working_outputs.get("V0")
            v_min = working_outputs.get("Vmin")
            v_max = working_outputs.get("Vmax")

            # return final structure if calculated volume is in interpolation regime
            if max_calcs_reached or (v_min <= v_0 <= v_max):
                final_structure = structure.copy()
                final_structure.scale_lattice(v_0)
                return Response(output=final_structure)

            # Else, if the extrapolated equilibrium volume is outside the range of
            # fitted volumes, scale appropriately
            v_ref = v_min if v_0 < v_min else v_max
            eps_0 = (v_0 / v_ref) ** (1 / 3) - 1.0

            linear_strain = [np.sign(eps_0) * (abs(eps_0) + self.min_strain)]

Contributor

esoteric-ephemera Oct 2, 2024

tweaked this a little bit - let me know if that looks better?

Contributor

orionarcher Oct 3, 2024

LGTM, I might save intermediate variables for the if statements but ultimately a matter of taste.

esoteric-ephemera and others added 7 commits

October 2, 2024 09:39


          cleanup common flow a bit

47cc837


          pcmt

d555560


          tweak

3b09254


          tweak

b69fdd2


          remove energy averaging logic for now

429b05b


          precommit

f7e4b8b


          Merge branch 'main' into feature/mpmorph

f935478

janosh requested changes


src/atomate2/common/flows/mpmorph.py Outdated

+                      Create a flow with MPMorph molecular dynamics (and relax+static).
+                      By default, production run is broken up into multiple smaller steps.
+                      Converegence and quench are optional and may be used to equilibrate

Member

janosh Oct 6, 2024

typo

Contributor

esoteric-ephemera Oct 7, 2024

Changed the docstr generally since it was confusing

src/atomate2/common/flows/mpmorph.py Outdated

Comment on lines 321 to 324

+                  relax_maker: Maker = Maker
+                  relax_maker2: Maker | None = None
+                  static_maker: Maker = Maker

Member

janosh Oct 6, 2024

is there any point in having generic Makers as defaults? nothing would happen unless you pass in explicit model/DFT Makers, no? in which case, maybe better not to set defaults at all?

Contributor

esoteric-ephemera Oct 7, 2024

The idea was probably to enforce that these are required, but the workflow itself will fail if relax_maker and static_maker are unspecified. Changed their defaults to None

src/atomate2/common/flows/mpmorph.py

Comment on lines +66 to +89

+                  def make(
+                      self,
+                      structure: Structure,
+                      prev_dir: str | Path | None = None,
+                      working_outputs: dict[str, Any] | None = None,
+                  ) -> Flow:
+                      """
+                      Run an NVT+EOS equilibration flow.
+                      Parameters
+                      ----------
+                      structure : Structure
+                          structure to equilibrate
+                      prev_dir : str | Path | None (default)
+                          path to copy files from
+                      working_outputs : dict or None
+                          contains the outputs of the flow as it recursively updates
+                      Returns
+                      -------
+                      .Flow, an MPMorph flow
+                      """
+                      if working_outputs is None:
+                          if isinstance(self.initial_strain, float | int):

Member

janosh Oct 6, 2024

this make looks like it could benefit from more checking of assumptions and raising helpful error messages if they fail. e.g.

if isinstance(self.initial_strain, float | int):
    self.initial_strain = (
        -abs(self.initial_strain),
        abs(self.initial_strain),
    )
elif not isinstance(self.initial_strain, tuple) or len(self.initial_strain) != 2:
    raise ValueError(
        f"initial_strain must be float | tuple[float, float], got {self.initial_strain}"
    )

src/atomate2/common/flows/mpmorph.py

+                  name : str
+                      Name of the flows produced by this maker.
+                  convergence_md_maker : EquilibrateVolumeMaker
+                      MDMaker to generate the equilibrium volumer searcher

Member

janosh Oct 6, 2024

typo

src/atomate2/common/flows/mpmorph.py Outdated

Comment on lines 201 to 204

+                  convergence_md_maker: Maker | None = None  # check logic on this line
+                  # May need to fix next two into ForceFieldMDMakers later..)
+                  production_md_maker: Maker | None = None
+                  quench_maker: FastQuenchMaker | SlowQuenchMaker | None = None

Member

janosh Oct 6, 2024

maybe add a __post_init__ to raise early if convergence_md_maker, production_md_maker, quench_maker are left as None. based on the doc string sounds like maybe production_md_maker shouldn't be allowed to be None anyway?

Contributor

esoteric-ephemera Oct 7, 2024

Added a __post_init__ to this and the base quench maker classes in the same file

src/atomate2/common/flows/mpmorph.py Outdated

Comment on lines 262 to 273

+                  @classmethod
+                  def from_temperature_and_steps(
+                      cls,
+                      temperature: float,
+                      n_steps_convergence: int = 5000,
+                      n_steps_production: int = 10000,
+                      end_temp: float | None = None,
+                      md_maker: Maker = None,
+                      quench_maker: FastQuenchMaker | SlowQuenchMaker | None = None,
+                  ) -> Self:
+                      """
+                      Create an MPMorph flow from a temperature and number of steps.

Member

janosh Oct 6, 2024

looks like this is missing an @abstractmethod decorator?

Contributor

esoteric-ephemera Oct 7, 2024

Added this decorator to a few other places where I can see it's needed (eos common flows, ASE base classes)

src/atomate2/common/flows/mpmorph.py

Comment on lines +470 to +476

+                  def call_md_maker(
+                      self,
+                      structure: Structure,
+                      temp: float,
+                      prev_dir: str | Path | None = None,
+                  ) -> Flow | Job:
+                      """Call MD maker for slow quench.

Member

janosh Oct 6, 2024

same here, should be marked @abstractmethod

src/atomate2/common/flows/mpmorph.py Outdated

+                  ----------
+                  name : str
+                      Name of the flows produced by this maker.
+                  convergence_md_maker : EquilibrateVolumeMaker

Member

janosh Oct 6, 2024

the name convergence_md_maker seems a bit inconsistent. maybe call this equilibrium_volume_maker?

src/atomate2/common/flows/mpmorph.py

+                      MDMaker to generate the equilibrium volumer searcher
+                  production_md_maker : Maker
+                      MDMaker to generate the production run(s)
+                  quench_maker :  SlowQuenchMaker or FastQuenchMaker or None

Member

janosh Oct 6, 2024

double space after :

src/atomate2/common/flows/mpmorph.py Outdated

+                  """
+                  name: str = "Equilibrium Volume Maker"
+                  md_maker: Maker | None = None

Member

janosh Oct 6, 2024

md_maker is allowed to be None but looks like make will fail if it is? why not make this a required arg?

Contributor

esoteric-ephemera Oct 7, 2024

See above, this has a __post_init__ for checking this is not None now

Member

janosh Oct 7, 2024

i think adding a __post_init__ can be considered boilerplate. if you make md_maker a required arg and the user doesn't pass it, your dataclass should raise a similar error message

def __post_init__(self) -> None:
    if self.md_maker is None:
        raise ValueError("You must specify `md_maker` to use this flow.")

Contributor

esoteric-ephemera Oct 7, 2024

Right - that's there already, unless I'm misunderstanding your suggestion?

Member

janosh Oct 7, 2024

i know, i'm suggesting removing it and changing

- md_maker: Maker | None = None
+ md_maker: Maker

to make md_maker a required arg

Contributor

esoteric-ephemera Oct 7, 2024

Ahhh yes sorry! Changed as requested, leaving in the __post_init__'s as well tho

esoteric-ephemera added 7 commits

October 7, 2024 09:53


          review changes

72e2d40


          precommit

bf2028a


          Merge remote-tracking branch 'upstream/main' into feature/mpmorph

a8e5881


          undo abstractmethod in openmm base job

343a291


          unset None default for required makers

5efadac


          remove pyace dependencies

00d1434


          precommit

d4abe91

BryantLi-BLI requested a review from janosh

October 8, 2024 18:13

janosh reviewed


src/atomate2/ase/jobs.py Outdated

                   def calculator(self) -> Calculator:
                       """ASE calculator, method to be implemented in subclasses."""
-                      raise NotImplementedError
+                      return NotImplementedError

Member

janosh Oct 8, 2024

was this change accidental? can't return NotImplementedError here, doesn't match the return type Calculator. should keep the raise and the @abstractmethod decorator

Author

BryantLi-BLI Oct 8, 2024

I swapped out return with raise.

I'm a bit confused on the @abstractmethod decorator. It seems that every forcefield that inherits the ASEMaker adapts their calculator as @property. Does this imply that all the calculators should be swapped from @property -> @abstractmethod ? or just the original base class where the calculator should be absent?

Contributor

esoteric-ephemera Oct 8, 2024

whoops typo, think @BryantLi-BLI fixed that

Member

janosh Oct 8, 2024

Does this imply that all the calculators should be swapped from @property -> @abstractmethod ? or just the original base class where the calculator should be absent?

just the base class should have the @abstractmethod decorator. but all the calculator methods should have the @property decorator

Contributor

esoteric-ephemera Oct 8, 2024

@BryantLi-BLI : that's my bad, the AseMaker.calculator should be both a property and abstractmethod. I accidentally removed this in 00d1434 (was trying to see if there was a nice way to serialize ASE calculators without having to subclass AseMaker, and just forgot to restore a few things)
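A minimal sketch of the pattern described here, with toy class names that are not the real `AseMaker` hierarchy: the base class stacks `@property` on top of `@abstractmethod`, and subclasses override with a plain `@property`. Note that `@abstractmethod` is only enforced when the class uses `ABCMeta` (for example by inheriting from `ABC`); whether the real base class does so is an assumption here.

```python
from abc import ABC, abstractmethod


class ToyAseMaker(ABC):
    """Toy base class: calculator is an abstract property."""

    @property
    @abstractmethod
    def calculator(self) -> str:
        """ASE calculator, to be implemented in subclasses."""
        raise NotImplementedError


class ToyForceFieldMaker(ToyAseMaker):
    """Toy subclass: overrides calculator with a concrete property."""

    @property
    def calculator(self) -> str:
        # A real subclass would return an ASE Calculator instance
        return "toy calculator"


# The base class cannot be instantiated while calculator is abstract
try:
    ToyAseMaker()
    base_is_abstract = False
except TypeError:
    base_is_abstract = True

concrete_calculator = ToyForceFieldMaker().calculator
```

The decorator order matters: `@property` must be the outermost decorator so that the abstract marker is applied to the underlying function first.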

BryantLi-BLI and others added 2 commits

October 8, 2024 13:51


          swapped return -> raise

2829c4c


          restore abstractmethod to ase calculator

41a5b17

janosh reviewed


pyproject.toml Outdated

               [tool.setuptools.package-data]
               atomate2 = ["py.typed"]
+              "atomate2.common.jobs" = ["*.json.gz"]

Member

janosh Oct 8, 2024

presumably this line is here to include src/atomate2/common/jobs/mp_avg_vol.json.gz with the PyPI package? probably best to pass the full file path so future json.gz files in this dir aren't auto-included by accident. but more importantly, that file is 650KB and needs to be downloaded by everyone who installs atomate2 if we merge the current config, whether they use MPMorph flows or not. up to @utf to make a decision here but i would vote against adding that file to the package. better to auto-download it when the user first needs it, with a link to where to get it if the user doesn't have an internet connection.

finally, i looked into that file. looks like it's partially redundant. at least the oxi_state and count keys can likely be removed?

Contributor

esoteric-ephemera Oct 8, 2024

@janosh the oxidation states are I think ignored by default, but the counts are needed. If a given chemical environment doesn't exist in a database, the code looks through subspaces of that chemical environment and does a weighted average of structure volumes. The weighted average depends on the counts

I started rewriting this in parquet to jointly store them - might be better to put the parquet file on OpenData and pull down/cache as needed

Contributor

esoteric-ephemera Oct 9, 2024

OK so for clarity: for the ICSD data, the user can either query from oxidation-state dependent or agnostic data, since ICSD includes manually-assigned charges. I pulled the MP data from summary rather than bonds, so the oxidation state info there is ignored.

That's a slight incongruity that I can change if requested

The counts are needed for the same reason given above (the code will sample subspaces of the total chemical space to perform a weighted averaging of reference volumes/atom if the total chemical space is not present in the dataset)

The data is now pulled / cached from figshare

Stuck with JSON since the parquet speed bump is probably not super relevant here (reading basically once rather than reading many times, and the user has the option to manually specify a DataFrame as input, if generating many random structures)
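A rough sketch of why the counts matter for the fallback described above. The data layout, numbers, and function name below are all hypothetical and do not reflect the actual schema of the MP/ICSD file; the sketch only shows a count-weighted average over the largest available chemical subspaces.

```python
from itertools import combinations

# Hypothetical reference data:
# chemical system -> (average volume per atom in Å^3, number of structures)
ToyReference = dict[tuple[str, ...], tuple[float, int]]


def estimate_volume_per_atom(elements: list[str], reference: ToyReference) -> float:
    """Estimate volume/atom, falling back to count-weighted subspace averages."""
    chemsys = tuple(sorted(elements))
    if chemsys in reference:
        return reference[chemsys][0]

    # Try progressively smaller subspaces of the chemical system
    for size in range(len(chemsys) - 1, 0, -1):
        hits = [
            reference[sub] for sub in combinations(chemsys, size) if sub in reference
        ]
        if hits:
            total_count = sum(count for _, count in hits)
            # Subspaces backed by more structures contribute more weight
            return sum(vol * count for vol, count in hits) / total_count

    raise ValueError(f"No reference data for any subspace of {chemsys}")


toy_reference: ToyReference = {("Li",): (20.0, 10), ("O",): (10.0, 30)}
li_o_estimate = estimate_volume_per_atom(["Li", "O"], toy_reference)
```

With these toy numbers the Li-O estimate is (20.0 × 10 + 10.0 × 30) / 40 = 12.5, which would differ from an unweighted mean of 15.0.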

esoteric-ephemera added 5 commits

October 8, 2024 16:57


          move MP/ICSD avg vol data to parquet and remote figshare download/ ca…

43dd650

…che only as needed


          precommit

20f6382


          add checking for parquet readers

23afbac


          parquet --> json

6120b10


          precommit

44f625a

BryantLi-BLI requested a review from janosh

October 14, 2024 20:38

esoteric-ephemera added 2 commits

October 15, 2024 08:55


          Merge branch 'main' into feature/mpmorph

aa03ae8


          Merge branch 'main' into feature/mpmorph

23ceb99


Labels

enhancement forcefields md