Protein Preparation Workflow Panel

Locate and fix structural defects in imported protein and nucleic acid structures and prepare them for use by various Schrödinger applications.

To open this panel, click the Protein Preparation Workflow item on the Favorites toolbar, or click the Tasks button and browse to Biologics → Protein Preparation Workflow.

For information on running a protein preparation from the command line, see Protein Preparation Workflow Command Help.

Using the Protein Preparation Workflow Panel

The Protein Preparation Workflow allows you to take a protein or nucleic acid from its raw state (which may be missing hydrogen atoms and have incorrect bond order assignments, charge states, or orientations of various groups) to a state in which it is properly prepared for calculations using Schrödinger products such as Glide, Prime, BioLuminate, Desmond, FEP+, QSite, and MacroModel.

The workflow itself is run using the Preparation Workflow tab. There are two main workflows:

Interactive. In this workflow, you perform the preparation step-by-step, with the opportunity to review the results at each step and make decisions about how to address any problems, or which parts of the structure to delete in the final preparation. You can switch back and forth between the other tabs as you perform the preparation.
Automatic. In this workflow, you make settings for all stages of the preparation first, then run the entire preparation as a job. You can also select which stages of the workflow are run in the job. This workflow allows you to process multiple protein or nucleic acid structures in a single job with the same settings, for example if you have multiple structures of the same protein or nucleic acid with small variations. The other tabs can be used on individual structures displayed in the Workspace. This may be useful for diagnosing problems in each structure and addressing them before running the job, or for removing parts of the structure that would not otherwise be removed.

Because automatic procedures cannot cover all possible cases, it is very important to check the correctness of the structure before using it for other applications. The tools in the Diagnostics tab and the Substructures tab can be used to check the final structures.

Diagnostics

The purpose of this tab is to list the problems found in a protein structure, so that you can fix them before continuing with the preparation or using the protein structure for other applications. The subtabs have some features for performing fixes; other fixes can be done with other panels. Once the problems are fixed, click the Workflow button to return to the preparation workflow for completion of the preparation.

Substructures

This tab is automatically displayed, with the tables populated, when you click the Review Structure link or the Check Structure button in the Preparation Workflow tab. Otherwise, the tables are empty, and you can use the tab to analyze the substructures in the Workspace structure by clicking the Load Workspace Entry link. After you have made any changes to the structure, you should return to the Preparation Workflow tab to complete the preparation.

The three tables in this tab display a list of het groups, a list of water molecules, and a list of chains, which you can select, examine, and delete. Het groups are everything that is not a water molecule, a protein residue, or a nucleic acid residue, and include ligands, metal ions, and cofactors. Chains are defined by the chain label, and can include waters and het groups.

You can select multiple rows in a table using shift-click and control-click, and you can make a selection from more than one table at a time.
You can move the selection of a single row up and down a table with the UP ARROW and DOWN ARROW keys.

At the top are a menu for selecting items, and a button for setting the display. Some actions you can take with these are:

You can zoom in on the selected objects as they are displayed by selecting Fit view from the display settings button menu, and you can limit the display to the selected objects by selecting Display only selected.
You can select a table row by picking atoms in the Workspace, then choosing Select → Get from Selection. If you pick an atom in a water or a het group, the water or het group row is selected; if you pick any other atom, the chain is selected.
You can select all hets within a given distance of the chains that are selected by choosing Select → Hets Near Selected Chains, and likewise for waters. The distance can be set with Select → Set Proximity. This feature is useful for reducing a multimer to a monomer: select the chain, select the het groups/waters, invert the selection and then delete.
You can select waters that have nothing other than waters within 5 Å by clicking Select → Isolated Waters.
You can invert the selection (select the objects that are not selected, deselect those that are selected) by choosing Select → Invert. This is useful if you want to select the objects to keep and delete the rest: make your selection, click Invert Selection, then click Delete.
You can also deselect all hets, all waters, or all chains, with Select → Deselect.
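
The Isolated Waters criterion in the list above can be illustrated with a minimal sketch. It works on plain coordinate tuples rather than any Schrödinger API; the function name isolated_waters and the input layout are hypothetical, and only the 5 Å cutoff is taken from the description.

```python
import math

def isolated_waters(waters, others, cutoff=5.0):
    """Return indices of waters with no non-water atom within `cutoff` angstroms.

    waters: list of (x, y, z) water oxygen coordinates (hypothetical input)
    others: list of (x, y, z) coordinates of all non-water atoms
    """
    isolated = []
    for i, w in enumerate(waters):
        # A water is isolated when every non-water atom is beyond the cutoff.
        if all(math.dist(w, a) > cutoff for a in others):
            isolated.append(i)
    return isolated

waters = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
protein = [(3.0, 0.0, 0.0), (4.0, 1.0, 0.0)]
print(isolated_waters(waters, protein))  # [1]
```

The first water has a protein atom 3 Å away and is kept; the second has only distant neighbors and is reported as isolated.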

Before inspecting the structure, it is advisable to delete unwanted parts of the structure. If the protein is a multimer and you want to simplify it, you should delete the chains for the duplicate structural units. When you delete the chains, the waters and het groups that are labeled with that chain label are also deleted. Otherwise you must delete them separately. See Simplifying a Protein Complex for information on determining what to do with multimeric complexes.

To delete chains, waters, or het groups, select the table rows, then click Delete from Entry. When you have finished deleting unwanted parts of the structure, it is advisable to inspect the remaining parts and correct any structural problems, for example by going back to the Preparation Workflow tab.

It may be useful to display formal charges on the atoms when examining the het states. The charged atoms are automatically labeled with the charge, but if they are not, choose Formal Charge from the Apply Labels menu in the Style Toolbox.

You might also want to use other style features and options to set up the structure for viewing, such as rendering the ligand in ball-and-stick, or displaying hydrogen bonds to the ligand.

Before proceeding to optimization of the structure, you should select the desired ionization states for the het groups. The project entry is changed to use the structures that you select.

Protein Preparation Workflow Panel Features

The features of this panel are contained almost entirely in three tabs, which are described in the topics linked below.

Tabs

Preparation Workflow
Diagnostics
Substructures

Preparation Workflow Tab Features

Set up and run the protein preparation. You can run a batch job on one or more proteins or run the preparation interactively.

Interactive button
Specify Protein step
Preprocess step
Diagnose and Analyze step
- Check Structure button
Optimize H-bond Assignments step
Minimize and Delete Waters step
- Settings link
  - Minimize section
  - Delete waters section
    - Distant from ligands (hets) option and text box
    - With fewer than N bonds to non-waters option and text box
- Clean Up button
Job toolbar
Status bar
Workflow group text box

Interactive button

Run the workflow interactively rather than as a batch job. Clicking the button toggles between interactive and automatic processing mode. The steps available in the tab and the content of the steps depend on the processing mode.

Specify Protein step

Specify the protein or proteins to process.

Use structures from option menu

Choose the structure source for the protein to be prepared. Only present in automatic mode. In interactive mode, the structure is taken from the Workspace.

Project Table (n selected entries)—Use the entries that are currently selected in the Project Table or Entry List. The number of entries selected is shown on the menu item. An icon is displayed to the right which you can click to open the Project Table and select entries. When this option is selected, a Load button is displayed to the right.
Workspace (included entry)—Use the entry that is currently included in the Workspace. Exactly one entry must be included in the Workspace. When this option is selected, a Load button is displayed to the right.
File—Use the specified file. When this option is selected, the File name text box and Browse button are displayed.

File name text box and Browse button

Enter the file name in this text box, or click Browse and navigate to the file. The name of the file you selected is displayed in the text box.

Get PDB button

Retrieve a structure from the Protein Data Bank. Opens the Get PDB File Dialog Box.

Only present when the structure source is the Workspace or the Project Table.

Review Structure link

Analyze the Workspace structure into its components (ligands, waters, metals, chains, etc.) and switch to the Substructures tab to review the results. In that tab you can make choices about keeping or removing particular components.

This link is only present if the source of the protein structures is the Workspace, either in interactive or automatic mode.

Global Settings link

Make overall structural settings for the preparation of the protein and its environment.

Opens a small pane with the settings, which are described below.

pH Settings

Specify the pH value to use for the protein's aqueous environment in the preparation process (the center of the pH range). This value is used to determine the protonation state of residues in the protein as well as that of the small molecules in the structure. It is also used in the optimization of the H-bond assignments. The value is stored as an entry property, preprocess pH.

Simulation pH text box

Enter the pH value to use for the protein's aqueous environment. The default is 7.4, the physiological pH. If the PDB structure has a pH value, and it is different from this value by more than 1 pH unit, a warning is posted, as the processed protein structure could deviate from that in the input structure. In this case you can select Use PDB value instead to use the reported pH.

Use PDB value instead: value option

Use PDB value instead (if available) option

Use the value for the pH obtained from the PDB protein structure data.

If the structure source is the Workspace, the PDB value is reported to the right as value, and selecting this option disables the Simulation pH text box.

If the structure source is the Project Table or a file, the second form of the text is shown. The pH value is taken from the PDB structure if it has one, otherwise the pH value is taken from the Simulation pH text box, which remains active. This option allows you to run preparations with a different pH value for each structure that has a pH value.
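
The choice between the Simulation pH and the PDB value described above can be summarized in a short sketch. The function choose_ph and its arguments are hypothetical names used for illustration; the 7.4 default and the 1 pH unit warning threshold come from the text, and the rest is an assumption about how the pieces fit together.

```python
def choose_ph(simulation_ph=7.4, pdb_ph=None, use_pdb_value=False):
    """Return (pH to use, warning message or None).

    pdb_ph is the pH reported in the PDB structure, or None if absent.
    """
    # With "Use PDB value instead" selected, the PDB pH wins when it exists.
    if use_pdb_value and pdb_ph is not None:
        return pdb_ph, None
    warning = None
    # Otherwise the simulation pH is used; warn on a large discrepancy.
    if pdb_ph is not None and abs(pdb_ph - simulation_ph) > 1.0:
        warning = (f"PDB pH {pdb_ph} differs from simulation pH "
                   f"{simulation_ph} by more than 1 pH unit")
    return simulation_ph, warning

print(choose_ph(7.4, 5.5))        # simulation pH plus a warning
print(choose_ph(7.4, 5.5, True))  # (5.5, None)
print(choose_ph(7.4, None, True)) # (7.4, None)
```

The last call mirrors the batch case: a structure without a PDB pH falls back to the Simulation pH value.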

Small molecules ("hets") to process options

Select options for the small molecules to be included in the generation of ionization states by Epik Classic and in the set considered for deletion of waters. Epik Classic determines appropriate protonation states for molecules in solution based on their pK_a, and for molecules that are bound to metal ions. If the option to delete waters beyond the vicinity of small molecules is selected (Delete waters beyond hets, see below), only the items on this list are considered for determining which waters are deleted.

Detected Ligands—Include ligands. The link opens the Preferences Panel at the Ligand Detection Preferences, so you can make settings for what is detected as a ligand.
Metals and ions—Include metals and ions.
Non-water solvents—Include solvent molecules other than water.
Other—Include any other small molecules that are present but not included in the options above.

Prompt for FASTA file if sequence is missing option

If this option is enabled, and no sequences are found in the structure, you will be prompted to provide a FASTA file containing the sequences.

Each chain must be a separate record in the file. Sequence information will be used for calculating residue connectivity and filling missing loops (if that option is enabled).

Note: This option only applies when a single structure is being processed.

Preprocess step

Preprocess the protein by fixing structural defects and adding information necessary for use in applications. The main options are exposed in the step; other options are available under More options.

Cap termini option

Add ACE (N-acetyl) and NMA (N-methyl amide) groups to uncapped N and C termini. These termini include any chain breaks where there are missing residues. If the chain breaks are far from the region of interest, it might be sufficient to cap them. If you want to fill in the chain breaks rather than cap them, select Fill in missing loops (using Prime) under More options.

Include peptides when capping termini option: Additionally cap peptide termini, where a peptide is defined as a polypeptide consisting of fewer than 200 atoms. For this option to have an effect, the Cap termini option must also be selected.

Fill in missing side chains option

Run a Prime refinement job to place and optimize the missing side chains. Side chains can be added manually later. See the Predict Side Chains Panel topic for more information.

More options link

This link opens a pane in which a variety of other options can be set, as listed below.

Align to options

Align the structure to another structure.

Selected entry—Align the protein to the protein in the selected entry. Exactly one entry must be selected in the Project Table or Entry List.
PDB—Align the protein to a protein from the PDB. Enter the ID of the template protein in the text box.

Assign bond orders option and menu

Assign bond orders to all bonds in the structure or only where bond orders are missing based on one of the following selections:

Reassign all (CCD)—Re-assign all bond orders and charges based on the Chemical Component Dictionary (CCD) provided the substructure can be matched either by atom names or SMILES string; otherwise, bond orders and charges are set based on geometry. This option is recommended if the structure has been downloaded from an official database such as the Protein Data Bank.

Missing only (CCD)—Assign bond orders and charges only to those substructures that currently have only single bonds, that is, whose bond orders are unassigned. As with Reassign all (CCD), CCD matching is attempted first, and geometry-based assignment is used if the match fails. This option takes the same action as Edit → Assign → Bond Orders.

Missing only (geometry-based)—Assign bond orders and charges to all unassigned substructures based on geometry.

Replace hydrogens option

Remove the original hydrogens, then add hydrogens to the structure. This option allows any problems with hydrogen atoms in the original structure to be fixed. It includes fixing nonstandard PDB atom names, which prevent proper H-bond assignment, and is important for the H-bond optimization tool.

Create options

Create bonds for specific structural features.

Zero-order bonds to metals—Break bonds to metals and correct the formal charge on the metal and the neighboring atoms to treat the bonds as ionic. This is required for use with force fields, which treat metal compounds as ionic rather than covalent. Then add zero-order bonds between the metal and its ligands, so that it is still considered part of the same molecule. Sulfurs that interact with metals have their hydrogens removed, if necessary, and are assigned a negative charge.
Disulfide bonds—Find sulfur atoms that are within 3.2 Å of each other, and add bonds between them. CYS residues are renamed to CYX when the bond is added.
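
The distance test behind the Disulfide bonds option can be sketched as follows. This is an illustration on plain coordinates, not the panel's implementation: the function name, the residue labels, and the input dictionary are hypothetical, and only the 3.2 Å cutoff comes from the description.

```python
import math
from itertools import combinations

def disulfide_candidates(sulfurs, cutoff=3.2):
    """Return pairs of residue labels whose sulfur atoms lie within `cutoff`.

    sulfurs: dict mapping a residue label to the (x, y, z) of its sulfur
    atom (hypothetical input layout).
    """
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sulfurs.items(), 2):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            pairs.append((res_a, res_b))
    return pairs

sg = {
    "A:CYS22": (0.0, 0.0, 0.0),
    "A:CYS65": (2.05, 0.0, 0.0),
    "A:CYS101": (10.0, 0.0, 0.0),
}
print(disulfide_candidates(sg))  # [('A:CYS22', 'A:CYS65')]
```

Each pair returned would correspond to a bond being added and both residues being renamed from CYS to CYX.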

Identify protein features option menu

Antibody—Select the annotation scheme to use for antibody structures, from the common antibody schemes. The default scheme is Kabat. Information on the antibody structural regions is added to the structure as an atom property, antibody region id scheme. This information is used in the Structure Hierarchy (Hierarchy) to identify antibody regions for display and selection.
T-Cell Receptor—Select the numbering scheme to use for T-cell receptor structures, either IMGT or AHo.

Renumber residues to match scheme option: Renumber the residues to match the chosen antibody annotation scheme. By default the residue numbers are retained.

Convert selenomethionines to methionines option

This option converts selenomethionines (MSE) to methionines (MET). A dialog box opens to inform you if any conversions are done. The conversion may be useful for structures in which MSE substitution was performed for X-ray structure determination.

Fill in missing loops (using Prime) option

This option will build and add loops of up to 20 residues, but not tails or terminal regions. After the structure is prepared, newly created loops should be refined to improve their quality. The first attempt at loop building is made using a knowledge-based approach. If this approach fails, a subsequent attempt will use an energy-based approach. The loops that were built can be identified by selecting atoms based on the atom property b_psp_added_missing_loops.

Sequence information is required for this action. If no sequence information is found, the action will be skipped. In the Interactive workflow mode, a prompt for a FASTA file appears if the Fill in missing loops (using Prime) option is selected and the Preprocess button is clicked when the structure in the Workspace has no embedded SEQRES information. In the Automatic workflow mode, the prompt appears if the Fill in missing loops (using Prime) option is selected and the Run button is clicked when there is only one structure included in the Workspace, Project Table, or provided file.

Note: The requirements for the format of the sequence titles in the FASTA sequence file depend on the number of chains. If a structure consists of a single chain, there are no requirements for sequence titles. If the structure contains multiple chains, the title for each sequence should have the form <title>:<chainID> or <title>_<chainID>. Spaces are not allowed in the title. Only the chains for which a sequence is provided in the file are filled; other chains are skipped.
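
As a way to check a multi-chain FASTA file against the title format in the note above, a minimal sketch is given here. The function names are hypothetical, and treating a title that contains spaces or lacks a chain ID as unusable is an assumption of the sketch, not documented behavior. The chain ID is taken as the text after the last ":" or "_" in the title.

```python
import re

def chain_id_from_title(title):
    """Return the chain ID from '<title>:<chainID>' or '<title>_<chainID>'.

    Returns None if the title contains spaces or has no chain ID part.
    """
    match = re.fullmatch(r"(\S+)[:_]([^\s:_]+)", title)
    return match.group(2) if match else None

def sequences_by_chain(fasta_text):
    """Map chain ID to sequence for every record with a usable title."""
    result = {}
    chain = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Start a new record; records with unusable titles are ignored.
            chain = chain_id_from_title(line[1:])
            if chain is not None:
                result[chain] = ""
        elif chain is not None:
            result[chain] += line
    return result

example = ">1abc:A\nMKT\nAYI\n>1abc_B\nGSH\n>bad title:C\nAAA\n"
print(sequences_by_chain(example))  # {'A': 'MKTAYI', 'B': 'GSH'}
```

The third record is dropped because its title contains a space, so chain C would be left unfilled.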

Generate het states (with Epik) option

Run Epik Classic to generate probable ionization and tautomeric states in the pH range specified for all selected het groups, as well as states prepared for binding to metals if the het group is coordinated to a metal. This ensures that the ligands have the proper ionization and tautomeric state, which might not be assigned when using the CCD database for bond orders.

pH value and range text box: For the target pH specified under Global Settings (and reported here), set the pH range for the generation of probable ionization and tautomeric states in the text box.
Max states to process automatically text box: When running the protein preparation automatically as a job, specify the number of het states to return for each protein processed. The states are selected in order of increasing state penalty. If you are running the preparation interactively, you can view and select the het states in the Substructures tab.
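
The selection rule for Max states to process automatically, taking states in order of increasing state penalty, amounts to a sort followed by a cut. The sketch below uses hypothetical state names and penalty values purely for illustration.

```python
def select_het_states(states, max_states):
    """Return up to `max_states` states, lowest state penalty first.

    states: list of (state_name, state_penalty) tuples (hypothetical layout).
    """
    ranked = sorted(states, key=lambda state: state[1])
    return ranked[:max_states]

states = [("neutral", 0.8), ("anion", 0.1), ("tautomer_2", 2.4)]
print(select_het_states(states, 2))  # [('anion', 0.1), ('neutral', 0.8)]
```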

Preprocess button

Perform the preprocessing step. This button is only present in interactive mode.

Diagnose and Analyze step

Diagnose structural issues and analyze the substructures in the protein for examination and selection.

Only present in interactive mode, as decisions made on the basis of the diagnosis and analysis must be done interactively.

Check Structure button: Run diagnostics on the structure, which can be reviewed in the Diagnostics tab. This tab is displayed after the diagnostics have been run. If any issues that require user intervention are detected, a yellow warning icon appears with the message "Issues were found".

Optimize H-bond Assignments step

Optimize the hydrogen bonding network within the structure.

The hydrogen bonding network is optimized by reorienting hydroxyl and thiol groups, water molecules, amide groups of asparagine (Asn) and glutamine (Gln), and the imidazole ring in histidine (His); and predicting protonation states of histidine, aspartic acid (Asp) and glutamic acid (Glu) and tautomeric states of histidine. This can be done interactively or automatically as a job. Both choices use the same underlying technology. A new entry is created for the optimized structure. The amino acid “flips” are labeled for easy identification.

These optimizations are necessary because the orientation of hydroxyl (or thiol) groups, the terminal amide groups in asparagine (Asn) and glutamine (Gln), and the ring of histidine (His) cannot be determined from the X-ray structure. Flipping the terminal amide groups and the histidine ring can improve charge-charge interactions with neighboring groups as well as improving hydrogen bonding. The 180° flips preserve the heavy-atom placement deduced from the X-ray electron density. In addition, the protonation states of histidine, aspartic acid, and glutamic acid are varied to optimize hydrogen bonding and charge interactions.

Each hydrogen bond donor, His ring, and Asn and Gln terminal amide is considered a separate orientable species. Optimizing the orientation of the various groups is an iterative process, which passes over all the groups whose H-bonds need to be optimized multiple times. For information on the algorithm used, see H-Bond Optimization Technical Notes.

Settings link

Make settings for the optimization of hydrogen bonding. Opens a pane in which you can select various options.

Sample water orientations option

Select this option to sample water orientations when running the H-bond optimization. Deselect this option if the water molecules are already optimally placed, and you only want to run the other optimizations. See Correcting Water Orientations for instructions on manually orienting waters.

Note: If you have a lot of water molecules in the structure, sampling can take a long time. You should ensure that you have deleted unwanted waters before you start this process.

Use crystal symmetry option

Use crystal symmetry when optimizing the H-bond network.

Minimize hydrogens of altered species option

Perform a minimization for hydrogens that were adjusted during the sampling.

Optimization uses options and settings

Choose an approach for specifying the protonation states of the protein, and make settings for that approach.

PROPKA—Perform the H-bond optimization with protonation states of residues at a given pH, as determined from a pK_a prediction by PROPKA. When this option is selected, the pH value set earlier is reported and the Label pKas option is displayed below.
Simplified rules—Perform the H-bond optimization with protonation states of residues in a chosen qualitative pH range. When this option is selected, the pH options are displayed below.

Label pKas option

Label the relevant side-chain atoms of the protein with their predicted pK_a values from PROPKA. The values are stored in the atom-level property PropKa pKa. The labels are displayed in the Workspace. See Labeling Atoms and Bonds for more information on labels.

pH options

These options are displayed when you choose Simplified rules for the optimization. The options perform the following actions, relative to the normal biological pH of 7.

very low—protonate ASP, GLU, HIS
low—protonate HIS
neutral—normal biological states
high—deprotonate CYS
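
The four qualitative pH ranges listed above can be written down as a lookup table. The sketch below is only a restatement of that list in code; the dictionary layout and the function name residue_action are hypothetical.

```python
# Residue changes for each qualitative pH range, as listed in the pH options.
SIMPLIFIED_RULES = {
    "very low": {"protonate": {"ASP", "GLU", "HIS"}, "deprotonate": set()},
    "low": {"protonate": {"HIS"}, "deprotonate": set()},
    "neutral": {"protonate": set(), "deprotonate": set()},
    "high": {"protonate": set(), "deprotonate": {"CYS"}},
}

def residue_action(ph_range, residue_name):
    """Return the action applied to a residue type in the given pH range."""
    rule = SIMPLIFIED_RULES[ph_range]
    if residue_name in rule["protonate"]:
        return "protonate"
    if residue_name in rule["deprotonate"]:
        return "deprotonate"
    return "normal state"

print(residue_action("very low", "GLU"))  # protonate
print(residue_action("high", "CYS"))      # deprotonate
print(residue_action("low", "ASP"))       # normal state
```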

Optimize button

Run the optimization with the current settings. Only present in interactive mode.

Assign with Constraints button

Interactively optimize the H-bonds. Opens the Interactive H-Bond Optimizer Panel, in which you can examine each assignment of terminal group or hydroxyl orientation and select the desired orientation. This button is only present in interactive mode.

Minimize and Delete Waters step

Perform a restrained minimization on the structure and delete classes of water molecules. By default, all atoms are minimized, and only the waters near het groups are kept.

Settings link

Make settings for energy minimization of the structure and removal of waters. Opens a pane in which you can select various options.

Minimize section

This section provides controls for optimizing the corrected structure, to relieve any strain and fine-tune the placement of various groups. Hydrogen atoms are always optimized fully, which allows relaxation of the H-bond network. Heavy atoms can be restrained, so that a small amount of relaxation is allowed, or they can be kept fixed. These parts of the process are more time-consuming than the earlier parts of the procedure. A new entry is created for each minimized structure.

All atoms - to max RMSD option and text box

Minimize the energy of all atoms, iterating until the RMSD of the heavy atoms relative to the unminimized structure exceeds the specified threshold (maximum of four iterations).
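
The heavy-atom RMSD used as the stopping criterion can be illustrated with a minimal sketch. It assumes the two coordinate lists are in the same atom order and computes the RMSD without any superposition, which may differ from what the minimizer does internally. The function name and the 0.3 Å threshold used in the example are hypothetical.

```python
import math

def heavy_atom_rmsd(reference, current):
    """Root-mean-square deviation between two matched coordinate lists."""
    if len(reference) != len(current) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(reference, current))
    return math.sqrt(squared / len(reference))

unminimized = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
minimized = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
rmsd = heavy_atom_rmsd(unminimized, minimized)
threshold = 0.3  # illustrative value only
print(round(rmsd, 3), rmsd > threshold)
```

In this example the two atoms moved by 0.3 Å and 0.4 Å, so the RMSD is the square root of (0.09 + 0.16) / 2.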

Hydrogens only option

Select this option if you want to leave heavy atoms in place in the minimization, and only optimize the hydrogen atom positions. Allowing some movement of the heavy atoms can relieve structural strain, but will result in some deviation from the input structure.

Force field option menu

Choose the force field for the minimization. The choices are OPLS_2005 and OPLS4. The default is OPLS4 if it is available, otherwise it is OPLS_2005, unless you set the default in the Force Field settings section of the Preferences Panel.

Use customized version option

Use your customized version of the OPLS4 force field, rather than the standard version in the distribution. Only available when you choose OPLS4 from the Force field option menu and you have the appropriate license. This option is set by default to the value of the Use custom parameters by default option in the Preferences panel, under Jobs - Force field, when the current panel is opened. The default directory for the customized version can also be specified as a preference, in the same location.

If the customized version is missing or invalid, the text of this option turns orange and an orange warning icon is displayed to the right, with a tooltip about the problem.

Parameter set button: Select the set of custom parameters for the OPLS4 force field. Opens the Set Custom Parameters Location Dialog Box. Only available when you choose OPLS4 from the Force field option menu and you have the appropriate license.

Delete waters section

At this point, you can choose to retain or remove waters that have not been previously removed: in particular, bulk waters, identified by the number of hydrogen bonds to non-water molecules; and remote waters, identified by their distance from het groups.

For applications such as docking, where retaining the volume of the binding site can be important, you might want to remove waters after performing the restrained minimization, rather than before, as the minimization without waters can cause the binding site to shrink, and therefore not allow larger ligands to dock.

Distant from ligands (hets) option and text box: Remove waters that are farther than the specified distance from every het group.
With fewer than N bonds to non-waters option and text box: Remove waters that have fewer than the specified number of hydrogen bonds to non-water molecules. This allows you to keep waters that have significant binding to the receptor—for example, forming bridges—and remove the rest of the waters. This is an alternative to keeping waters based on the proximity to the ligand, or manually removing waters.
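
The two water-deletion criteria above can be combined in a short sketch. It is an illustration under stated assumptions, not the panel's code: the input layout, the function name, and the example values of 5.0 Å and 2 hydrogen bonds are hypothetical, and the hydrogen-bond counts are supplied as data rather than computed.

```python
import math

def waters_to_delete(waters, het_atoms, max_distance=None, min_hbonds=None):
    """Return IDs of waters matching either enabled deletion criterion.

    waters: list of dicts with keys 'id', 'xyz', and 'hbonds_to_nonwater'.
    het_atoms: list of (x, y, z) coordinates of het group atoms.
    A criterion left as None is treated as switched off.
    Note: with an empty het_atoms list, every water counts as distant.
    """
    delete = []
    for water in waters:
        distant = max_distance is not None and all(
            math.dist(water["xyz"], atom) > max_distance for atom in het_atoms
        )
        weakly_bound = (
            min_hbonds is not None and water["hbonds_to_nonwater"] < min_hbonds
        )
        if distant or weakly_bound:
            delete.append(water["id"])
    return delete

waters = [
    {"id": "W1", "xyz": (1.0, 0.0, 0.0), "hbonds_to_nonwater": 3},
    {"id": "W2", "xyz": (9.0, 0.0, 0.0), "hbonds_to_nonwater": 2},
    {"id": "W3", "xyz": (2.0, 0.0, 0.0), "hbonds_to_nonwater": 0},
]
ligand = [(0.0, 0.0, 0.0)]
print(waters_to_delete(waters, ligand, max_distance=5.0))  # ['W2']
print(waters_to_delete(waters, ligand, min_hbonds=2))      # ['W3']
```

W1 survives both tests because it is close to the ligand and has three hydrogen bonds to non-water molecules, as a bridging water would.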

Clean Up button

Perform the minimization and water removal. This button is only present in interactive mode.

Job toolbar

Manage job submission and settings. See Job Toolbar for a description of this toolbar. Only present in batch mode.

The final prepared protein structures are incorporated into the project (according to the job incorporation setting made in the Job Settings Dialog Box). The structures are in an entry group named with the job name.

Status bar

Use the Reset button to reset the panel to its default settings and clear any data from the panel. If the panel has a Job toolbar, you can also reset the panel from the Settings button menu.

If you can submit a job from the panel, the status bar displays information about the current job settings and status for the panel. The settings include the job name, task name and task settings (if any), number of subjobs (if any) and the host name and job incorporation setting. The job status can include messages about job start, job completion and incorporation.

The status bar also contains the Help button, which opens an option menu with choices to open the help topic for the panel (Documentation), launch Maestro Assistant, or if available, choose from an option menu of Tutorials. If the panel is used by one or more tutorials, hover over the Tutorials option to display a list of tutorials. Choosing a tutorial opens the tutorial topic.

Workflow group text box

Specify the entry group for the structures generated by the workflow. This is the interactive mode alternative to the Job name text box for automatic mode, and is only present in interactive mode.

The entry group is placed below the input entry, and a copy of the input entry is added as the first entry in the group. Structures from each task run as part of the preparation are added to this entry group as the task is completed, with appropriate annotation in the title to indicate the task performed. If necessary, the Project Table and Entry List scroll to the most recent entry.

Diagnostics Tab Features

Lists errors or issues found in the Workspace structure.

Check Workspace Entry link
Tabs
Workflow button
Substructures tab

Check Workspace Entry link

Perform diagnostic tests on the structure in the Workspace and display the results in the tabs below.

Tabs

These tabs present the results of the diagnostic tests.

Valences
Missing
Overlapping
Alternates
Reports

Valences tab

This tab lists atoms that have valence violations, such as missing hydrogens or the incorrect number of bonds. Click on a table row to select and zoom in on the problem atom. You can fix the atom type by using the 3D Builder (see Building and Editing Structures) or from the shortcut menu for the atom or bond in the Workspace. See also Making and Breaking Bonds and Changing Bond Orders, Formal Charges, and Elements.

Missing tab

This tab has two options to review: Missing atoms and Missing loops. Click either option to select which category to review.

For Missing atoms: Indicates atoms that are missing from some residues in the accompanying table. Select a table row or rows to review the corresponding items in the Workspace. Click the boxes in the Select column to add a row to review for missing atoms. Click the Add Missing Side Chains button to attempt to predict the missing atoms for the checked rows. Note that the job may take several minutes to complete for a large number of residues.
For Missing loops: Indicates residues are missing from some loops in the accompanying table. Select a table row or rows to review the corresponding items in the Workspace. Click the boxes in the Select column to add a row to review for missing residues. Use the Fill Missing Loops button to attempt to generate the missing residues for the checked rows. Note that this job may take several minutes to complete.

Overlapping tab

This tab lists pairs of atoms that are too close (bad contacts). If the structure is not tangled and the atoms are not too close, the restrained minimization should move these atoms away from each other. If you want to fix the overlapping atoms immediately, select the residue that contains the atoms, or simply select the table row to select just the overlapping atoms, and press Ctrl+M (⌘M) to do a quick minimization.

If the atoms are superimposed (zero distance), however, this can indicate a duplicate het group, which you can delete in the Substructures tab. Otherwise, you can use the Atom Selection Dialog Box (Select → Define) to select one of the duplicates for deletion.

When examining overlapping atoms, it is a good idea to ensure that all atoms are displayed. By default, only polar hydrogens are displayed, so you should select Show All Hydrogens from the hydrogens button menu in the Style Toolbox (see Showing and Hiding Atoms), then open the Atom Selection Dialog Box.

Alternates tab

This tab lists residues that have alternate positions in the input structure. Although these positions are legitimate, the protein preparation can only be done on a structure with a single set of coordinates, so you must choose one of the alternate positions for each residue.

To choose a position for a residue, first select the residue in the table. The view zooms in to the residue, the default position is marked, and the alternate position is drawn in with dotted lines. Click Switch to switch between positions (or use the check boxes in the Default and Alternate columns). When the position that you want to keep is displayed, click Commit. The atoms in this position are kept, and the alternate atoms are deleted.

If the alternate positions are needed for subsequent applications, you should set up and refine a copy of that protein for each set of alternate positions needed.

Reports tab

This tab contains tools that are also available in the Protein Reports Panel.

View option menu
Filter by threshold option
Report table
Export button
All reports option
Structure average text box

View option menu

This menu lists the various protein properties for which reports can be generated. When you choose an item, the table is updated with the relevant report. The items are described below.

Steric Clashes: Ratio of the interatomic distances to the sum of the atomic (van der Waals) radii from the force field. The threshold is 0.85.
Bond Length Deviations: Deviation from the ideal value derived from Engh and Huber. The RMSD is reported at the bottom.
Bond Angle Deviations: Deviation from the ideal value derived from Engh and Huber. The RMSD is reported at the bottom.
Backbone Dihedrals: G-factors for backbone dihedrals (see below for definition), along with dihedral angles. Less negative (closer to zero) means that the combination of angles is more probable.
Sidechain Dihedrals: G-factors for side-chain dihedrals (see below for definition), along with dihedral angles. Less negative (closer to zero) means that the combination of angles is more probable.
G-Factor Summary: G-factors for backbone dihedrals, side-chain dihedrals, and the sum of the two.
Average B-Factors: Average B-factors (temperature factors) for the backbone and the side chain for each residue, and their standard deviations.
Gamma-Atom B-Factor: B-factor for the gamma atom in each residue.
Peptide Planarity: RMSD of the atoms in the peptide linkage from the plane that minimizes the RMSD.
Sidechain Planarity: For side chains that have nominally planar groups, RMSD of the atoms in the planar groups from the plane that minimizes the RMSD.
Improper Torsions: For side chain atoms that are nominally planar, RMSD of the improper torsion for these atoms.
C-alpha Stereochemistry: Stereochemistry of the C-alpha atoms.
Missing Atoms: Residues for which atoms are missing from the structure.
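
The Steric Clashes ratio described above can be illustrated with a minimal Python sketch. It is not the panel's implementation: the radii table is a hypothetical stand-in for the force-field radii, bonded pairs are not excluded, and treating a ratio below 0.85 as a clash is an assumption about how the threshold is applied.

```python
import math
from itertools import combinations

# Hypothetical van der Waals radii in angstroms, for illustration only;
# the panel takes its radii from the force field.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "H": 1.20}

CLASH_THRESHOLD = 0.85  # ratio threshold quoted for the Steric Clashes report


def clash_ratio(atom_a, atom_b):
    """Return distance / (r_a + r_b) for two (element, (x, y, z)) atoms."""
    (elem_a, xyz_a), (elem_b, xyz_b) = atom_a, atom_b
    distance = math.dist(xyz_a, xyz_b)
    return distance / (VDW_RADII[elem_a] + VDW_RADII[elem_b])


def find_clashes(atoms, threshold=CLASH_THRESHOLD):
    """List (i, j, ratio) for atom pairs whose ratio is below the threshold.

    Assumption: a smaller ratio means a worse contact. Bonded pairs are
    not filtered out in this sketch.
    """
    clashes = []
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        ratio = clash_ratio(a, b)
        if ratio < threshold:
            clashes.append((i, j, ratio))
    return clashes
```

For example, a carbon and an oxygen placed 2.0 Å apart give a ratio of 2.0 / (1.70 + 1.52), which is below 0.85 and would be reported by this sketch.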

G-factors are calculated by binning the dihedral angles from a collection of high-resolution structures into 10-degree ranges, and calculating the probability of a pair of angles lying in a given range. The G-factor is the logarithm of this probability. If there are no values in a given range, the G-factor is reported as "disallowed".
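
As a rough illustration of this definition, the following Python sketch bins pairs of dihedral angles into 10-degree ranges and takes the logarithm of the resulting probability. The function names, the bin origin at -180 degrees, and the use of the natural logarithm are assumptions made for the example; the reference set of high-resolution structures used by the panel is not reproduced here.

```python
import math
from collections import Counter

BIN_WIDTH = 10  # degrees, as in the definition above


def angle_bin(angle):
    """Map an angle in degrees to a 10-degree bin index (0..35).

    Assumption: bins start at -180 degrees and angles wrap around.
    """
    return int(((angle + 180.0) % 360.0) // BIN_WIDTH)


def build_pair_probabilities(reference_pairs):
    """Estimate the probability of each (bin, bin) cell from reference angle pairs."""
    counts = Counter((angle_bin(a), angle_bin(b)) for a, b in reference_pairs)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def g_factor(pair, probabilities):
    """Return log(probability) for the pair's cell, or "disallowed" if the cell is empty.

    Assumption: natural logarithm; the base is not stated in the definition.
    """
    cell = (angle_bin(pair[0]), angle_bin(pair[1]))
    p = probabilities.get(cell)
    return "disallowed" if p is None else math.log(p)
```

Because a probability is at most 1, the value is never positive, which is consistent with "less negative" meaning "more probable" in the report descriptions.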

In addition, the menu has items that open the Protein Reliability Report Panel and the Ramachandran Plot Panel.

Filter by threshold option

When checked, only the rows that are shown are exported, unless the All reports option is selected.

Report table

The table displays the residue label in the first column, and the report data in the remaining columns. Clicking a row marks the residue or the relevant part of the residue in the Workspace, and zooms in on the residue. Clicking a column heading sorts the table by the data in the column. Repeated clicking cycles through ascending order, descending order, and original order.

Export button

Export the current report or all reports to a plain text file. Opens a file chooser so that you can specify the file name for the report.

All reports option

Export all reports; if not selected, the current report is exported.

Structure average text box

Reports the average of the property chosen from the View option menu over the entire structure, if applicable.

Workflow button

Go to the Preparation Workflow tab. Click this button after making fixes to the structure, to run through the preprocessing again with the fixed structure.

Substructures button

Go to the Substructures tab.

Substructures Tab Features

Lists the substructures in the Workspace protein structure: ligands, metals, waters, chains. You can choose items to view them in the Workspace, copy them, or delete them from the protein structure.

Load Workspace Entry link
Select option menu
Workspace effects settings button
- Fit view option
- Highlight atoms option
Ligands, Metals, Other table
Waters table
Chains table
Expand to PDB chain link
Selection report and Clear link
Copy to New Entry button
Delete from Entry button
Prepare Selected Only button
Diagnostics button
Workflow button

Load Workspace Entry link

Reload Workspace Entry link

Analyze the structure in the Workspace into its substructures and display them in the tables below. If a structure has already been analyzed, the text reads Reload Workspace Entry and you can reload the structure from the Workspace and reanalyze it.

Select option menu

Select or deselect substructure items in the tables. The corresponding atoms are selected in the Workspace.

All—Select all items.
None—Clear the selection.
Previous—Select the items that were in the previous selection.
Invert—Invert the selection: the selected items are deselected, the unselected items are selected.
Deselect—Deselect the items (remove them from the selection)
Bulk Waters—Select waters that have nothing other than waters within 5 Å. This option can be used to select bulk waters for removal, for example.
Hets Near Selected Chains—Select het groups within a specified distance of the chains that are selected. The distance is specified with Set Proximity.
Waters Near Selected Chains—Select waters within a specified distance of the chains that are selected. The distance is specified with Set Proximity.
Set Proximity—Set the distance threshold for selecting hets or waters near selected chains. Choose a threshold value from this menu, or choose Custom and set the desired threshold in the dialog box that opens.
Get Workspace Selection—Select the substructure items in the tables that have at least one atom selected in the Workspace.
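
The Bulk Waters criterion above can be sketched in Python as follows. This is only an illustration under stated assumptions: each water is represented by a single position (for example, its oxygen), the function name and input format are hypothetical, and the panel's own selection logic may differ in detail.

```python
import math

BULK_CUTOFF = 5.0  # angstroms, the distance quoted for the Bulk Waters item


def select_bulk_waters(waters, non_water_atoms, cutoff=BULK_CUTOFF):
    """Return indices of waters that have no non-water atom within the cutoff.

    waters: list of (x, y, z) positions, one per water (a simplification).
    non_water_atoms: list of (x, y, z) positions of all other atoms.
    """
    bulk = []
    for index, water in enumerate(waters):
        near_other = any(
            math.dist(water, atom) <= cutoff for atom in non_water_atoms
        )
        if not near_other:
            bulk.append(index)
    return bulk
```

The same distance test, with the cutoff taken from Set Proximity and the comparison inverted, would describe the Hets Near Selected Chains and Waters Near Selected Chains items.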

Workspace effects settings button

Choose options for applying effects in the Workspace when the selection changes.

Fit view option: Fit the Workspace view to the selected substructures. The view changes each time new substructures are selected.
Highlight atoms option: Highlight the selected substructures. By default, only the Maestro selection markers are shown when the substructures are selected. You can change the Workspace selection independently of the substructure selection, so this option allows you to see which substructures are selected while selecting other atoms in the Workspace.

Ligands, Metals, Other table

Lists the ligands, metals, and other molecules in the structure. Selecting rows in the table selects the corresponding het groups in the Workspace. The ligand is marked in the Lig column, based on the Ligand Detection Preferences.

After preprocessing is done with state generation by Epik, extra columns are added to the table for the most likely states, labeled Sn. There are checkboxes in the columns to choose the state. The states are sorted by increasing state penalty, which includes a reward for the number of hydrogen bonds formed. The state that is selected by default is the state with the lowest penalty (and if there is more than one such state, it is the state with the most H-bonds).
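
The ordering and default choice described above can be sketched in Python. The dictionary keys and function names are hypothetical, and the penalties are made-up numbers; only the rules (sort by increasing penalty, default to the lowest penalty, break ties by the most hydrogen bonds) come from the description.

```python
def rank_states(states):
    """Return the states sorted by increasing state penalty."""
    return sorted(states, key=lambda state: state["penalty"])


def default_state(states):
    """Pick the lowest-penalty state; on a tie, prefer the one with the most H-bonds."""
    return min(states, key=lambda state: (state["penalty"], -state["hbonds"]))


# Hypothetical example data, not output from Epik.
example_states = [
    {"name": "a", "penalty": 0.5, "hbonds": 1},
    {"name": "b", "penalty": 0.0, "hbonds": 1},
    {"name": "c", "penalty": 0.0, "hbonds": 3},
]
```

With this example data, states "b" and "c" share the lowest penalty, so the sketch selects "c" because it forms more hydrogen bonds.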

When you select a state, the status area at the foot of the panel displays the state penalty, the charge, and the number of hydrogen bonds made to the het group. The selected state is displayed in the Workspace, with markers to indicate the atoms that differ between states. Additionally, the hover-over text for a state in a selected row indicates whether it is the original state.

Waters table

Lists the waters in the structure. Selecting rows in the table selects the corresponding waters in the Workspace.

Chains table

Lists the chains in the structure. Selecting rows in the table selects the corresponding chains in the Workspace. Selecting a chain selects only the protein, not the associated waters or hets.

Expand to PDB chain link

Expand the chain selection to include all substructures that are labeled with the chain name. The waters and hets are highlighted in their tables and selected in the Workspace.

Selection report and Clear link

Shows the number of items selected in the tables. To clear the selection in all tables, click the Clear link.

Copy to New Entry button

Copy the items selected in all of the tables to a new project entry, with a default name. A banner is displayed, where you can rename the entry (or you can rename it in the Entry List (Entries)).

Delete from Entry button

Delete the items selected in all of the tables from the Workspace entry.

Prepare Selected Only button

Prepare a structure that contains only the selected items. A new entry with these items is created, which can be named in the dialog box that opens.

Only available in batch mode.

Diagnostics button

Go to the Diagnostics tab.

Workflow button

Go to the Preparation Workflow tab. This is the usual next step after making any changes in the structure.

Reset button

Reset the panel to its default settings.

Tutorials

Quick Reference Sheets

Protein Preparation Workflow

Videos

Preparing Protein Structures in Maestro