Step 4a
The creation of a new or updated parameter set for the OPLS force field is not an automatic procedure and requires much manual intervention. In what follows it is assumed that the OPLS definitions and parameters for the system being studied exist and that no parameter development needs to be undertaken.
The OPLS force field originates from the work of William Jorgensen and his group (see here). Parameters for certain classes of molecule can be obtained from one of the many published papers that concern OPLS but, for the most comprehensive and up-to-date collection, it is probably best to ask Professor Jorgensen directly.
To get an initial idea of how parameter sets are constructed with pDynamo, it is sufficient to look in the pDynamo-x.y/parameters/opls subdirectory of the distribution. For each parameter set there is an XPK file and a subdirectory of the same name that contains the set's definitions and parameters. There is also an additional directory, rawdata, with files that list atom types and parameters that have been gleaned from the literature and from parameter files obtained from Professor Jorgensen (please note that there is no guarantee that these are correct!).
The following scheme illustrates how to generate a parameter set for PKA but a similar procedure is appropriate more generally.
- Choose a name to call the modified parameter set and also a directory in which to put it. We shall use pkaProtein as the name and put the new set in the data subdirectory of the directory that contains the files for this tutorial. As was the case when creating a modified PDB component library, it is a good idea to choose a location separate from that of the pDynamo distribution.
- Copy the pDynamo-x.y/parameters/opls/protein directory and its files to the directory data/pkaProtein. The protein parameter set already contains most of the force field information that we shall need and so will provide a convenient base upon which to build.
- Inspection of the output from running the program of Step 4 shows that the program fails because there are "untyped" atoms from the residues ATP, MG, PO3 and the THR covalently bound to PO3.
This is corrected by specifying appropriate atom types and MM patterns in the files atomtypes.txt and patterns.txt of the pkaProtein directory, respectively. As a simple example consider the magnesium dication whose atom type definition is:
    # . Magnesium dication.
    MG atomicNumber 12 charge 2.0 epsilon 0.87504 sigma 2.91

In this extract, the line beginning with the hash character (#) is a comment and is ignored. The atom type definition itself is self-explanatory and includes the name of the type (MG), the type's atomic number (atomicNumber), its charge (charge) and Lennard-Jones parameters (epsilon and sigma). Although not needed here, it is also possible to specify a default atom type name for hydrogens bound to the type using the keyword hydrogenLabel.
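To make the role of epsilon and sigma concrete, the following minimal sketch evaluates the standard 12-6 Lennard-Jones energy for two identical magnesium atom types. It is not pDynamo code: the function name lennard_jones is hypothetical, and the sketch assumes that sigma is in ångströms and epsilon in kcal mol⁻¹. For unlike pairs OPLS normally combines parameters with geometric means, which is not shown here.

```python
# Illustrative sketch only; lennard_jones is a hypothetical helper, not a pDynamo function.

def lennard_jones(r, epsilon, sigma):
    """Standard 12-6 Lennard-Jones energy, 4*epsilon*[(sigma/r)^12 - (sigma/r)^6]."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)

# Values taken from the MG atom type definition (units assumed: kcal/mol and angstroms).
EPSILON = 0.87504
SIGMA = 2.91

# The energy is zero at r = sigma and reaches its minimum, -epsilon, at r = 2^(1/6) * sigma.
R_MIN = 2.0 ** (1.0 / 6.0) * SIGMA

energy_at_sigma = lennard_jones(SIGMA, EPSILON, SIGMA)
energy_at_minimum = lennard_jones(R_MIN, EPSILON, SIGMA)
```

In this form sigma is the separation at which the interaction energy crosses zero and epsilon is the depth of the well.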
MM patterns are more complicated. Essentially what they do is to define a mapping between specific chemical motifs or substructures and force field atom types. In the OPLS force field a single atom type can occur in different chemical environments and so it is possible to have multiple patterns that involve the same type. This is not the case for the MG atom type because it occurs in only a single environment. The appropriate MM pattern definition is:
    # . Magnesium dication.
    Pattern
    AtomTypeLabels MG
    Charges 2.0
    Label MagnesiumDication
    AtomPatterns 1 0 atomicNumber 12 connections 0 formalCharge 2 hydrogens 0

The keyword Pattern initiates the pattern and is followed by keywords (on separate lines) which give the atom type names of the atoms concerned by the pattern (AtomTypeLabels), the charges of these atoms in the pattern (Charges) and a descriptive name for the pattern (Label). The heart of the definition is given by the atom pattern defined after the AtomPatterns keyword which identifies a magnesium with a formal charge of two and with no bonds to other atoms.
The specification of appropriate MM patterns can require some thought and so readers are encouraged to look at the patterns.txt files in the tutorial and the pDynamo OPLS parameters directory. In particular, check out the differences between the files in the protein and pkaProtein parameter sets. There are several important points to remember:
(i) MM patterns are applied to a system in the order in which they are specified in the file, which means that more specific patterns should occur before more general ones.
(ii) Hydrogens are normally omitted from MM patterns and are typed implicitly using the value of the hydrogenLabel keyword of the typed atom to which they are attached. Of course, certain non-standard hydrogen types may require explicit treatment in a pattern, in which case they will be handled equivalently to the atoms of other elements.
(iii) The charge of an atom type given in a pattern overrides that specified in the atom type definition and refers to the total charge of the atom and any implicit hydrogens it may have. The charge that will actually be assigned to the atom is the difference between the MM pattern charge and the sum of the charges on its implicit hydrogens, the latter being defined in the hydrogens' atom type definitions.
(iv) Defining bond patterns that match aromatic systems can be troublesome because the latter can occur in a number of different, albeit equivalent, canonical forms. It is best to use Undefined as the aromatic bond type in these cases.
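Points (i) and (iii) can be illustrated with a small, self-contained sketch. It is not how pDynamo implements typing: real MM patterns match multi-atom substructures, whereas this sketch matches single atoms only. The classes Atom and Pattern, the functions first_match and assigned_charge, and the general pattern AnyMagnesium with its type MGX are all hypothetical. The methyl charges (0.06 per hydrogen, zero total) are illustrative values.

```python
# Illustrative sketch only; none of these names belong to pDynamo.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Atom:
    atomic_number: int
    connections: int
    formal_charge: int
    hydrogens: int


@dataclass
class Pattern:
    label: str
    type_label: str
    charge: float  # total charge of the atom plus its implicit hydrogens
    matches: Callable[[Atom], bool]


def first_match(atom: Atom, patterns: List[Pattern]) -> Optional[Pattern]:
    """Return the first pattern, in file order, that matches the atom."""
    for pattern in patterns:
        if pattern.matches(atom):
            return pattern
    return None


def assigned_charge(pattern_charge: float, hydrogen_charges: List[float]) -> float:
    """Charge left on the heavy atom once its implicit hydrogens are accounted for."""
    return pattern_charge - sum(hydrogen_charges)


# Specific pattern: an isolated magnesium with a formal charge of two.
specific = Pattern(
    "MagnesiumDication", "MG", 2.0,
    lambda a: (a.atomic_number, a.connections, a.formal_charge, a.hydrogens) == (12, 0, 2, 0),
)
# Hypothetical general pattern: any magnesium at all.
general = Pattern("AnyMagnesium", "MGX", 0.0, lambda a: a.atomic_number == 12)

patterns = [specific, general]  # specific before general, as point (i) requires
mg = Atom(atomic_number=12, connections=0, formal_charge=2, hydrogens=0)
chosen = first_match(mg, patterns)

# Point (iii): a neutral methyl pattern with three implicit hydrogens of charge 0.06 each.
carbon_charge = assigned_charge(0.0, [0.06, 0.06, 0.06])
```

If the two patterns were listed in the opposite order, the general pattern would capture the magnesium dication first and the specific one would never be reached.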
- Once the atom type and MM pattern files have been modified, an XPK file corresponding to the new parameter set can be generated with the MakeOPLSXPKFile function from the module OPLSScripts. The command is simply:
    # . Make the OPLS XPK file.
    MakeOPLSXPKFile ( "pkaProtein", inPath = "../data", outPath = "../data" )

The program from Step 4 can now be rerun with the new parameter set by updating the definition of the MM model as follows:
    # . Set up a MM model.
    mmmodel = MMModelOPLS ( "pkaProtein", path = "../data" )

There should be no more untyped atoms if the atom type and MM pattern files have been updated correctly. If not, repeat the preceding and current steps as often as necessary.
- There will likely be missing bond, angle, dihedral and improper force field parameters once all atoms have been typed, but a complete list of all those that are required will be output by the program during the model-building process.
Parameters are much easier to specify than MM patterns. It suffices to append the appropriate definitions, in no special order, to the files bonds.txt, angles.txt, dihedrals.txt and impropers.txt, and then regenerate the parameter set's XPK file using the function MakeOPLSXPKFile. As a final point, it should be noted that all quantities in the files in the pkaProtein directory (and its equivalents) are in OPLS's default units. In particular, this means that energies are in kcal mol⁻¹. Conversion to standard pDynamo units is performed automatically by the function MakeOPLSXPKFile.
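As a rough check on values entered by hand, the energy conversion can be sketched as follows. This assumes that the standard pDynamo energy unit is kJ mol⁻¹, which is not stated above, and it does not reproduce whatever other transformations MakeOPLSXPKFile may apply. The helper kcal_to_kj is hypothetical.

```python
# Illustrative sketch only; kcal_to_kj is a hypothetical helper, not part of pDynamo.

# One thermochemical calorie is exactly 4.184 joules.
KJ_PER_KCAL = 4.184


def kcal_to_kj(energy_kcal_per_mol):
    """Convert an energy from kcal/mol to kJ/mol."""
    return energy_kcal_per_mol * KJ_PER_KCAL


# Example: the Lennard-Jones well depth of the MG atom type.
epsilon_kj = kcal_to_kj(0.87504)
```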