Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fractional bond orders to SMIRFF #53

Open
davidlmobley opened this issue Jul 7, 2016 · 18 comments
Open

Add fractional bond orders to SMIRFF #53

davidlmobley opened this issue Jul 7, 2016 · 18 comments
Assignees

Comments

@davidlmobley
Copy link
Contributor

Eventually, we need to add capacity to SMIRFF for fractional bond orders, which we will obtain from a semi-empirical QM calculation. This will be used downstream eventually (after this summer) for parameter interpolation.

@jchodera
Copy link
Member

jchodera commented Jul 7, 2016

Suggested FFXML format:

<HarmonicBondForce length_unit="angstroms" k_unit="kilocalories_per_mole/angstrom**2" fractional_bond_order="interpolate-linear">
   <Bond smirks="[#6:1]-[#6:2]" length1="1.526" k1="620.0" length2="1.426" k2="820.0" length3="1.326" k3="1020.0"/> 
   ...
</HarmonicBondForce>

@davidlmobley
Copy link
Contributor Author

So, an interesting question is whether we would allow fractional bond orders only for SMIRKS matching ~ bond types or whether we would allow them for any. Presumably it would be for any, I suppose, as you have here. Otherwise the concern would be that the semiempirical calculation would give us a fractional bond order but the previous SMIRKS would prevent us from using it... Is that what you're thinking?

@cbayly13 , does this look suitable?

@cbayly13
Copy link
Collaborator

cbayly13 commented Jul 8, 2016

We want to exclude triple bonds so the form I would suggest is:
[#6:1]!#[#6:2]
which is asking for a carbon-carbon bond of any order except triple, thus allowing double, single, and aromatic. Most useful conjugation is across CC, CN, and CO bonds so I would want to start with
[#6:1]!#[#6:2]
[#6:1]!#[#7:2]
[#6:1][#8:2]
and see how far we can get with that. In the last case only a "
" is needed because singlet state C#O does not occur in pharmaceutically relevant molecules.

@cbayly13
Copy link
Collaborator

cbayly13 commented Jul 8, 2016

In principle triple bonds would not be ruled out, because they can conjugate too, but I am expecting a lot of finessing needed to map:
single<-->conjugatedSingle<-->aromatic<-->conjugatedDouble<-->double
correctly, which is where all atom-typed ligand force fields fall down.

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Aug 15, 2016

@jchodera - for reasons I can explain in more detail when I see you in person later this week, we may now want to do this sooner rather than later (in brief, it has to do with trying to put parm@frosst in SMIRFF format, which will be MUCH MUCH easier for specific types of systems if we have this).

The format above looks reasonable to me for bonds but may require notation adjustments. Specifically, for torsions, we use k1, k2, k3, (periodicity1, periodicity2,...) etc. to pick up multiple torsion terms which apply to the same SMIRKS. This probably means the choice of k1, k2, k3 for bonds is less than ideal since the numbering would mean something different there than it does for torsions.

Would you be happy with k_bondorder1, k_bondorder2, etc. instead?

This would also mean that for torsions, we could accommodate multiple torsion parameters that apply to different bond orders (whereas, as currently proposed, you'd have a name clash between k1 for the first torsional term and k1 for the term applying to bond order 1). For example, right now we have

<PeriodicTorsionForce phase_unit="degrees" k_unit="kilocalories_per_mole">
<Proper smirks="[#6X4:1]-[#6X4:2]-[#8X2:3]-[#1:4]" idivf1='1' periodicity1="3" phase1="0.0" k1="0.16" idivf2="1" periodicity2="1" phase2="0.0" k2="0.25" id="t0006" parent_id="t0002"/> <!-- HO-OH-CT-CT from frcmod.Frosst_AlkEthOH -->
...

and this could extend to

<PeriodicTorsionForce phase_unit="degrees" k_unit="kilocalories_per_mole" fractional_bondorder="interpolate_linear">
<Proper smirks="[#6X4:1]-[#6X4:2]-[#8X2:3]-[#1:4]" idivf1='1' periodicity1="3" phase1="0.0" k1_bondorder1="0.16" k1_bondorder2="0.25" idivf2="1" periodicity2="1" phase2="0.0" k2_bondorder1="0.25" "k2_bondorder2"="0.0" id="t0006" parent_id="t0002"/> <!-- HO-OH-CT-CT from frcmod.Frosst_AlkEthOH -->

(Note that I made up the k1_bondorder2 and k2_bondorder2 values out of thin air for illustration purposes.)

If this looks reasonable, I expect I can actually implement this myself without significant difficulty so we can try it out.

@davidlmobley davidlmobley changed the title Low priority: Add fractional bond orders to SMIRFF Add fractional bond orders to SMIRFF Aug 15, 2016
@jchodera
Copy link
Member

jchodera commented Aug 15, 2016 via email

@jchodera
Copy link
Member

jchodera commented Aug 15, 2016 via email

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Aug 15, 2016

How are we going to evaluate the bond orders, though? Are we planning to
take them from the OEMol for now, or should we add bond order to the
Topology at this point?

Well, we'll get the bond orders from an OEMol. But, right now createForce only works with the Topology so I think this means that either:

  1. we need to update the Topology to include bond order, or
  2. createForce will need major updates/re-architecting to be able to pull the bond orders from the stored reference molecules in the Topology class

It seems like the first would be much better and doesn't look like it should be much work -- we just need self._bondorders added to Topology to track the order of each bond in self._bonds, it seems. I'd update generateTopologyFromOEMol to store this info. Then this would be used in createForce to pick which parameters to use when actually assigning from the FFXML.

Sound reasonable?

Unrelatedly, I updated the above API proposal in my post to use your terminology of "bondorder" rather than "order" so I'd put this in in the right way.

@jchodera
Copy link
Member

I would just add to the _Topology class we use inside of forcefield.py for now. We already use this "augmented" topology instead of the OpenMM Topology as the argument to createForce, so this is the natural place to extend things for now until we can get them into OpenMM.

We do have an open OpenMM issue on bond orders that you should probably comment in too.

@jchodera
Copy link
Member

Also, let's standardize fractional_bond_order to fractional_bondorder too.

@davidlmobley
Copy link
Contributor Author

Agreed on both of those, @jchodera .

@davidlmobley
Copy link
Contributor Author

One other issue I forgot to bring up earlier, @jchodera - we think we need to extend the SMIRFF format slightly to add the option to kekulize molecules before assigning parameters (default False, as at present).

We're thinking of putting parm@frosst into SMIRFF format sooner rather than later, and two things will make it take a lot less human time: first, kekulizing molecules before assigning parameters, and second, having bond orders (this has to do with some issues relating to how to handle aromatic molecules that still have strong single/double character and ALSO handle aromatic molecules which do not). For Smarty/Smirky/etc., we're of course not kekulizing, but this will help Christopher in putting parm@frosst into this format.

What about adding this as a tag in the parameter file, i.e. something like

<SMIRFF kekulize="True">
...
</SMIRFF>

where if kekulize is False or unspecified, the current behavior would be retained?

@jchodera
Copy link
Member

Setting global parameters like kekulize in the <SMIRFF> tag sounds fine. This could instruct ForceField to copy and kekulize all of the molecules it is asked to parameterize.

@davidlmobley
Copy link
Contributor Author

I should probably also deal with #42 at the same time and put in a version attribute.

@davidlmobley
Copy link
Contributor Author

For the record, OpenEye support reports we can retrieve partial (Wiberg) bond orders from i.e. AM1-BCC calculations done with their tools, such as via this code snippet:

mol = OEGraphMol()
    ifs = oemolistream(argv[1])
    while OEReadMolecule(ifs,mol):
        if OEAssignPartialCharges(mol, OECharges_AM1BCCSym):
            for bond in mol.GetBonds():
                print (bond.GetData("WibergBondOrder"))

This may or may not be supported in future releases (they're putting in a feature request to retain it and I'll update with the outcome of that) but it's there for now and we can use it for this.

@jchodera
Copy link
Member

Are you going to add the support for this to forcefield.py, @davidlmobley?

@davidlmobley
Copy link
Contributor Author

Yes, @jchodera . Could be I'll get to work on this while traveling back on Friday, though more likely it'll be early next week. Since I can see how to do it easily enough, your time will be better spent on the property calculation things.

@davidlmobley
Copy link
Contributor Author

@jchodera - any thoughts on whether we should REQUIRE floating point bond orders, or fall back to something else (i.e. bond orders as integers if the user provides an OEMol and doesn't want to do a charging calculation) if the user insists on not doing an AM1 calculation?

I'm thinking the simplest and probably best behaved might be to populate the bond orders using bond.GetOrder() for regular OEMols if there are no wiberg bond orders (no AM1 calculation has been done), otherwise populate with the Wiberg bond orders. Probably I should have an optional argument to the charging calculation which would override this so we can have it retain integer orders when we want it to.

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants