Using a PDB File to Set Up a System for Simulation
Protein Data Bank (PDB) files are almost universally employed in the biomolecular simulation field. Unfortunately, using PDB files can be a painful experience. Reasons for this include:
- Many PDB files do not adhere to the PDB standard.
- Much information that is necessary when setting up a system for simulation is absent from a PDB file.
- The experimentally determined structures of proteins and other biomolecules often have gaps or uncertainties. Thus, atoms may be absent (e.g. hydrogens) or may be present several times (e.g. when flexible parts of the molecule occur in multiple conformations).
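The last point can be checked before any setup work begins. The following is a minimal sketch in plain Python, with no pDynamo calls, that scans the ATOM and HETATM records of a PDB file and reports how many hydrogens are present and which residues carry alternate-location indicators. It assumes the fixed-column layout of the PDB format; the function name `scan_pdb_records` is hypothetical and is used only for illustration.

```python
from collections import defaultdict


def scan_pdb_records(lines):
    """Summarize hydrogen content and alternate locations in PDB atom records.

    Illustrative helper, not part of pDynamo. It relies on the fixed-column
    layout of ATOM/HETATM records and ignores insertion codes.
    """
    alt_locs = defaultdict(set)  # (chain, resSeq, resName) -> altLoc codes seen
    n_atoms = 0
    n_hydrogens = 0
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        n_atoms += 1
        atom_name = line[12:16].strip()
        alt_loc = line[16:17].strip()
        res_name = line[17:20].strip()
        chain = line[21:22].strip()
        res_seq = int(line[22:26])
        element = line[76:78].strip().upper()
        if not element:
            # Rough fallback when the element column is blank: take the first
            # non-digit character of the atom name (so "1HB" counts as H).
            element = atom_name.lstrip("0123456789")[:1].upper()
        if element in ("H", "D"):
            n_hydrogens += 1
        if alt_loc:
            alt_locs[(chain, res_seq, res_name)].add(alt_loc)
    return {
        "atoms": n_atoms,
        "hydrogens": n_hydrogens,
        "alt_locs": {key: sorted(codes) for key, codes in alt_locs.items()},
    }
```

The function accepts any iterable of lines, so an open file handle can be passed directly. A hydrogen count of zero means that all hydrogens will have to be built later, and every entry in `alt_locs` marks a residue for which one conformation must be chosen by hand.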
Devising a fully automatic scheme for setting up a simulation given a "well-behaved" PDB file and a "simple" system is relatively straightforward. However, most cases are likely to require manual intervention of some sort by the user.
A typical procedure that is followed when setting up a protein system for simulation with pDynamo is listed below. The order of some of the steps (particularly 4, 5 and 6) is variable and will depend on the system being studied.
0. Select a PDB file that is appropriate for the objectives of the simulation study.
1. Read the PDB file and write out the PDB model that it contains to a file.
2. Edit the PDB model file so that it conforms to the system that is to be simulated. It is often more reliable to set up different parts of a system separately, in which case the model file from Step 1 is split up into several pieces.
3. Create a system from the edited PDB model file and the original PDB file. The model file, along with the PDB component library supplied with pDynamo, is used to generate the system's atoms and bonds, whereas the original PDB file provides atomic coordinates.
   If there are errors in generating the system, the most likely cause is that not all of the components, links and variants that appear in the PDB model have been defined. Any that are missing must be added to the library and Step 3 repeated.
4. Generate an MM model for the system. At the moment this is only possible with the OPLS-AA force field.
   Errors in Step 4 are common because force field parameters needed to describe some groups in the system are missing. In that case, the parameters must be added to the force field definition and Step 4 repeated. For examples of how to do this, see the files in the data/pkaProtein subdirectory of the tutorial.
5. Form the complete system by merging together its constituent parts if these were set up separately.
6. Check whether the system has atoms whose coordinates are undefined, that is, atoms that were absent from the original PDB file. If so, these must be constructed.
7. Relax the structure of the vacuum system.
8. Solvate the system using an appropriate solvent, normally water along with some counterions.
9. Refine and equilibrate the structure of the solvated system in preparation for subsequent simulation.
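Deciding how to split a model into pieces, and anticipating where atoms will later lack coordinates, is easier with an inventory of what the PDB file contains. The following is a minimal sketch in plain Python, with no pDynamo calls, that groups the atom records by chain and reports the number of polymer residues, the hetero residues by name, and any breaks in the polymer residue numbering. The function name `inventory_components` is hypothetical, and a numbering break is treated only as a hint that residues may be unresolved, not as proof.

```python
from collections import Counter, defaultdict


def inventory_components(lines):
    """Group the residues found in PDB atom records by chain.

    Illustrative helper, not part of pDynamo. It relies on the fixed-column
    layout of ATOM/HETATM records and ignores insertion codes.
    """
    polymer = defaultdict(set)  # chain -> residue numbers from ATOM records
    hetero = defaultdict(set)   # chain -> (resName, resSeq) from HETATM records
    for line in lines:
        record = line[0:6]
        if record not in ("ATOM  ", "HETATM"):
            continue
        res_name = line[17:20].strip()
        chain = line[21:22].strip()
        res_seq = int(line[22:26])
        if record == "ATOM  ":
            polymer[chain].add(res_seq)
        else:
            hetero[chain].add((res_name, res_seq))

    report = {}
    for chain in sorted(set(polymer) | set(hetero)):
        numbers = sorted(polymer[chain])
        # A jump of more than one in the numbering suggests unresolved residues.
        gaps = [(a, b) for a, b in zip(numbers, numbers[1:]) if b - a > 1]
        report[chain] = {
            "polymer_residues": len(numbers),
            "hetero": dict(Counter(name for name, _ in hetero[chain])),
            "gaps": gaps,
        }
    return report
```

Each chain and each group of hetero residues in the report is a candidate for a separate piece of the model file, and each entry in `gaps` points to a region that may need to be built or capped by hand.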
The files employed in the different steps of this tutorial are available in the tutorials/pdbFiles subdirectory of the pDynamo distribution.