Protein Linker Design Panel


Cross-link two protein chains with a peptide linker connecting the termini. The linker can be composed of several copies of a specified monomer unit. A loop prediction is performed on the linker to obtain a reasonable conformation.

To open this panel, click the Tasks button and browse to Biologics → Protein Linker Design.
To open this panel from the entry group for the results of an antibody structure prediction job, use the Workflow Action Menu.

Using the Protein Linker Design Panel

As part of protein design, it can be useful to cross-link two proteins. For example, suppose you have two oligopeptide fragments or protein domains that bind to a third protein. Both fragments or domains need to bind to the third protein for function. To increase efficacy, you might want to try to tether those two fragments or domains together. Another use for cross-linking is for circular permutation of a protein, in which you connect the termini and break the chain at some other point. Provided the break is outside the binding region, you could create a protein that still binds a ligand, but may interact differently in the cellular environment.

This panel allows you to cross-link two pre-positioned proteins by connecting chain termini with peptide linkers. The link is formed with standard peptide bonds, so the links must be formed from the N-terminus of one chain to the C-terminus of another.

The linkers are composed of monomer units. You can specify the entire linker as a monomer, you can create or read in shorter monomers and choose how many units are in each linker, or you can choose the standard amino acids as the monomers. If you choose to construct linkers from multiple monomer units, the monomer units in each linker are selected randomly. The number of linkers to construct can be specified explicitly or as a percentage of the possible variations. You can create linkers of different lengths, by specifying the range of numbers of monomer units in a linker.

The chains are connected with each linker that is generated, and the strain energy for the linker is evaluated as the difference between the linker energy in the linked conformation and the minimum energy of the linker in the unbound conformation. The minimization can change the conformation of the free linker; however, the resulting minimum is not guaranteed to be the global minimum.
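As a minimal sketch of the definition above (not the panel's implementation), the strain energy is a simple difference of two energies. The function name and the numeric values are hypothetical and only illustrate the arithmetic and sign convention.

```python
def strain_energy(linked_energy, free_minimized_energy):
    """Return the linker strain energy.

    linked_energy: energy of the linker in the conformation it adopts
        when it connects the two chains.
    free_minimized_energy: energy of the free (unbound) linker after
        minimization, which may be only a local minimum.
    Both values must be in the same units.
    """
    return linked_energy - free_minimized_energy


# Hypothetical energies, chosen only to show the sign convention.
example = strain_energy(-120.5, -135.0)
print(example)  # 14.5
```

A positive value means the linked conformation is higher in energy than the minimized free linker. Because the free-linker minimum may be only a local one, the computed value can underestimate the strain relative to the global minimum.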

No adjustment of the relative position or orientation of the proteins is done in the process, so you must ensure that they are properly positioned before linking them—for example, if two proteins bind to a third and you want to link those two proteins, you may want to take their positioning from the complex with the third protein.

Before you can add the linkers, you must ensure that the proteins are in a single project entry. If they are in different entries, you can create a project entry by choosing Workspace → Create Project Entry or typing Ctrl+Shift+N (⇧⌘N), and naming the entry in the dialog box that opens.

You should ensure that the protein is prepared by using the Protein Preparation Workflow Panel. To cross-link the proteins, you must delete all het groups and waters when you prepare the protein, and ensure that the protein contains only the standard amino acids. The het groups can be restored later, for example by creating another project entry that contains only the het groups and waters, then merging this entry with the results of the cross-linking.

When you pick the termini, it can be useful to select them in the Workspace first. To do this, you can display the sequence viewer (Window → Sequence Viewer), select the terminal residues in the sequence viewer, then click the Fit to selected atoms button on the toolbar.

The residues are marked with yellow selection markers, which will make them easy to pick.

To run protein cross-linking from the command line, you can use the following command. Run the command with -h for more information.

$SCHRODINGER/run cross_link_proteins_backend.py

To write out input files, click the arrow next to the Settings button and choose Write.

For information on command options, see cross_link_proteins_backend.py Command Help.

Protein Linker Design Panel Features

Setup tab

Define linker attachment points section

In this section you define the attachment points on each of the proteins.

Connection residue one tools

Select the Pick residue in Workspace button, and then pick a terminal residue in the Workspace. A warning message appears if you pick a non-terminal residue. The alpha carbon of the residue is marked with a green sphere in the Workspace. After the residue is picked the check box is automatically cleared, and the selection appears in the dropdown menu. The list is populated with the residue IDs in the form chain:resnum(resname), e.g. A:1(PRO). In the dropdown menu, you can also manually select from a list of all endpoints from all chains. Clicking Reset endpoint in the dropdown menu clears the selection.

Connection residue two tools

Select the Pick residue in Workspace button, and then pick a terminal residue in the Workspace. If residue one was an N-terminal residue, residue two must be a C-terminal residue, and vice versa. A warning message appears if you pick a non-terminal residue or a terminus of the wrong type. The alpha carbon of the residue is marked with a green sphere in the Workspace. The list is populated with the residue ID in the form chain:resnum(resname), e.g. B:99(PHE). After the residue is picked the check box is automatically cleared, and the selection appears in the dropdown menu. In the dropdown menu, you can also manually select from a list of all endpoints from all chains, except for the one already selected in the Connection residue one field. If a structure contains more than one chain, and you select two residues of the same chain, a warning message appears to indicate that this will link residues within the same chain. Clicking Reset endpoint in the dropdown menu clears the selection.
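If you handle endpoint IDs in your own scripts, the chain:resnum(resname) form is easy to parse. The following is a hedged sketch: the helper names are hypothetical, the pattern assumes a plain integer residue number with no insertion code, and the terminus type ("N" or "C") is assumed to be supplied by the caller because it is not part of the ID string.

```python
import re

# Pattern for IDs such as "A:1(PRO)"; assumes an integer residue number
# and no insertion code.
RESIDUE_ID = re.compile(
    r"^(?P<chain>[^:]*):(?P<resnum>-?\d+)\((?P<resname>\w+)\)$"
)


def parse_residue_id(text):
    """Split 'chain:resnum(resname)' into (chain, resnum, resname)."""
    match = RESIDUE_ID.match(text.strip())
    if match is None:
        raise ValueError(f"Unrecognized residue ID: {text!r}")
    return match["chain"], int(match["resnum"]), match["resname"]


def check_pair(first, second, first_terminus, second_terminus):
    """Return warnings for a proposed pair of endpoints.

    The terminus arguments are "N" or "C". A peptide link needs one of each.
    Linking within one chain is allowed but is reported as a warning.
    """
    warnings = []
    if {first_terminus, second_terminus} != {"N", "C"}:
        warnings.append("endpoints must be one N-terminus and one C-terminus")
    if parse_residue_id(first)[0] == parse_residue_id(second)[0]:
        warnings.append("both endpoints are in the same chain")
    return warnings


print(parse_residue_id("B:99(PHE)"))
print(check_pair("A:1(PRO)", "B:99(PHE)", "N", "C"))
```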

Inter-residue distance text box

This noneditable text box is automatically filled in with the distance between the alpha carbons of the two picked residues after picking is complete. The distance is used to display the approximate number of monomers needed to link the proteins.

Define monomer set for linker section

In this section you define the monomers that make up the linkers. The monomers can be single standard amino acids, or short sequences of standard amino acids.

Define multi-residue monomer text box

Enter the sequence for a monomer in this text box. The sequence input is case-insensitive. When you have typed in the sequence, click Add to List to add it to the list of monomers.

Add to List button

Add the sequence defined in the text box to the Choose monomers list. When the sequence is added, it is automatically selected.

Add from File button

Add the sequences defined in a file to the Choose monomers list. When you click this button, a file selector opens so you can choose the file. The file must be a plain text file with one sequence per line.
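Before loading a monomer file, it can help to check it yourself. The sketch below is a hypothetical pre-check, not the panel's own reader: it assumes sequences are written as one-letter codes for the 20 standard amino acids, converts them to upper case (the text box input is described as case-insensitive), and skips blank lines.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_monomers(text):
    """Return the monomer sequences found in text, one per line.

    Raises ValueError if a line contains a character that is not a
    one-letter code for a standard amino acid.
    """
    monomers = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        sequence = line.strip().upper()
        if not sequence:
            continue  # ignore blank lines in this sketch
        invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
        if invalid:
            raise ValueError(
                f"line {line_number}: non-standard residue codes {invalid}"
            )
        monomers.append(sequence)
    return monomers


# Example file content with three commonly used linker motifs.
print(read_monomers("ggs\nGGGGS\nEAAAK\n"))
```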

Choose monomers list

Choose the monomers from this list that you want to use to construct the linkers. The list is pre-populated with the standard amino acids, to which you can add sequences using the Define multi-residue monomer text box and Add to List button or the Add from File button. The monomers that you choose do not have to be sequences of the same length.

Inter-residue distance / Average chosen monomer length text box

This noneditable text box reports the ratio of the inter-residue distance (above) to the average length of the monomers chosen for the linker. This ratio gives an approximate number of monomers that must be included in the linker to form a proper link.
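The ratio can be reproduced approximately by hand. The sketch below is an illustration under stated assumptions: the panel does not say how it measures monomer length, so a span of 3.8 Å per residue (a typical alpha-carbon spacing) is assumed, and the function name and example values are hypothetical.

```python
import math

# Assumed span per residue in angstroms; the panel does not state how it
# measures monomer length, so this value is an assumption.
ANGSTROM_PER_RESIDUE = 3.8


def estimate_monomer_count(distance, monomers, per_residue=ANGSTROM_PER_RESIDUE):
    """Return (ratio, rounded-up count) of monomers needed to span distance."""
    mean_residues = sum(len(m) for m in monomers) / len(monomers)
    mean_length = mean_residues * per_residue
    ratio = distance / mean_length
    return ratio, math.ceil(ratio)


# Hypothetical case: termini 38 Å apart, monomers GGS and GGGGS.
ratio, count = estimate_monomer_count(38.0, ["GGS", "GGGGS"])
print(round(ratio, 2), count)  # 2.5 3
```

An estimate of this kind treats the linker as nearly extended, so it is best used as a lower bound when you set the minimum and maximum linker lengths.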

Lengths of linker chains to build boxes

Set the minimum and maximum number of monomers to be used in building linkers.

Define sampling extent and composition for linkers comprised of > 1 monomer section

In this section, you define the number of linkers to be tried. The monomers are selected at random, with possible restrictions placed on the proportion of each monomer.

Random composition. Evaluate up to N variations for each length box

Select this option to specify the maximum number of linkers to generate for each linker length. Enter the maximum number of linkers in the text box.
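The idea of drawing a capped number of random linkers for each length can be sketched as follows. This is not the panel's algorithm: the function name is hypothetical, and the sketch assumes that duplicate linkers are discarded and that each monomer is drawn with equal probability.

```python
import random


def sample_linkers(monomers, min_units, max_units, max_per_length, seed=None):
    """Draw up to max_per_length distinct random linkers for each length.

    Returns a dict mapping the number of monomer units to a sorted list of
    linkers, each linker being a tuple of monomers.
    """
    rng = random.Random(seed)
    unique = list(dict.fromkeys(monomers))  # drop repeated monomers, keep order
    sampled = {}
    for n_units in range(min_units, max_units + 1):
        possible = len(unique) ** n_units
        target = min(max_per_length, possible)
        chosen = set()
        # Simple rejection sampling: redraw until enough distinct linkers exist.
        while len(chosen) < target:
            chosen.add(tuple(rng.choice(unique) for _ in range(n_units)))
        sampled[n_units] = sorted(chosen)
    return sampled


linkers = sample_linkers(["G", "S", "GGS"], 1, 3, max_per_length=5, seed=1)
for n_units, group in linkers.items():
    print(n_units, ["".join(linker) for linker in group])
```

Rejection sampling is adequate when the cap is small relative to the number of possible linkers; when the cap approaches the full set, enumerating all combinations is simpler.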

Random composition. Evaluate up to N % of possible variations for each length box

Select this option to specify the percentage of the total possible linkers to generate for each linker length. Enter the percentage in the text box. The total number of possible linkers scales as N^M, where N is the number of monomers and M is the linker length.
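Because the number of possible linkers grows as N to the power M, a small percentage can still mean many structures. The sketch below shows the arithmetic. The function names are hypothetical, and rounding the result up to a whole number is an assumption, since the panel's rounding rule is not documented here.

```python
import math


def total_linkers(n_monomers, length):
    """Number of ordered sequences of `length` units drawn from n_monomers."""
    return n_monomers ** length


def linkers_to_sample(n_monomers, length, percent):
    """Number of linkers covered by `percent` of the total, rounded up."""
    return math.ceil(total_linkers(n_monomers, length) * percent / 100)


# With the 20 standard amino acids as monomers:
for length in (2, 3, 4):
    print(length, total_linkers(20, length), linkers_to_sample(20, length, 5))
```

For 20 monomers and a linker length of 3, there are 8000 possible linkers, so 5% corresponds to 400 linkers for that length alone.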

Fraction of monomer option, menu and boxes

Select this option if you want to set limits on the fraction of any monomer that is used in a linker. Linkers for which the fraction of each monomer does not fall within the specified range are discarded. Choose a monomer from the menu, and set the limits in the text boxes. No checking is done that the fractions add up to 1, so you must ensure that your choices do not result in impossible requirements (such as setting the minimum for two monomers greater than 0.5). A warning is presented if no linkers are generated, so if you see this warning, you should check the fractions if you have set them.
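The filtering rule and the kind of impossible requirement mentioned above can be illustrated with a short sketch. It is an assumption that fractions are counted per monomer unit rather than per residue, and both function names are hypothetical.

```python
from collections import Counter


def within_fraction_limits(linker, limits):
    """Check a linker against per-monomer fraction limits.

    linker: sequence of monomer units, e.g. ["G", "G", "S", "A"].
    limits: dict mapping a monomer to (min_fraction, max_fraction).
    """
    total = len(linker)
    if total == 0:
        return False
    counts = Counter(linker)
    return all(
        low <= counts[monomer] / total <= high
        for monomer, (low, high) in limits.items()
    )


def limits_are_satisfiable(limits):
    """Minimum fractions that sum to more than 1 can never be met."""
    return sum(low for low, _ in limits.values()) <= 1.0


print(within_fraction_limits(["G", "G", "S", "A"], {"G": (0.25, 0.5)}))  # True
print(limits_are_satisfiable({"G": (0.6, 1.0), "S": (0.6, 1.0)}))        # False
```

The second check covers only the simplest failure mode. Limits can also be unreachable for a particular linker length, for example a minimum of 0.5 for one monomer in a linker of three units requires at least two of the three units to be that monomer.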

Linker conformation prediction option menu

Choose the method for prediction of the linker conformations from this option menu. The choices are:

  • Interdomain link library—Predict the linker conformation from a database generated by inferring interdomain linkers from the PDB using a combination of PFAM and SCOP domain annotations. The locations of interdomain boundaries are refined before the linkers are added to the library. For design, linkers are picked based on length, sequence similarity, and stem (joining) geometry scores.
  • Intradomain loop library—Use a table of known loops taken from the PDB to determine the loop conformation. The loops are filtered by length, then by joining geometry to remove loops that cannot be successfully joined, and then the loop to use is chosen based on a combination of sequence similarity score and stem (joining) geometry score. This is the fastest method.
  • Simple de novo loop creation—Build the loop residue by residue to produce a single loop conformation that has no clashes with the existing structure.

It is a good idea to refine the structure after it is built, as the simpler methods used here might not give the optimal conformation.

Energy calculation option menu

Choose a model for calculating the strain energy of the new loop. The strain energy is calculated as the difference between the energy of the linker in the minimized conformation it adopts in the protein and the energy of the minimized free linker.

Results tab

When the calculation finishes, the results are automatically loaded in the Results tab. A dialog box is displayed if any of the linkers could not be used to cross-link the proteins, listing these linkers.

Import button

Import a set of results that has previously been exported.

Results table

This table lists the structures for each linker, showing details of the linker and its composition, the total strain energy, and the strain energy per amino acid. You can sort the table by the values in any column by clicking on the column heading.

The strain energy is useful to score multiple possible crosslinking chains on a relative basis. Generally, chains with lower strain energies are better. Performing a search of multiple lengths and conformations and focusing on those crosslinks with the lower strain energies helps to select those candidates that are more likely to accommodate the connected domains in the starting conformation. However, note that the strain energy is only one component of the energy of the linked protein structure: it does not include the interaction energy between the linker and the protein, which can compensate for the strain.
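If you copy values from the table into a script, ranking by strain energy per amino acid puts linkers of different lengths on a comparable footing. The sketch below uses a hypothetical record type and made-up energies. It does not read the panel's exported zip file, whose format is not described here.

```python
from dataclasses import dataclass


@dataclass
class LinkerResult:
    sequence: str         # linker sequence in one-letter codes
    strain_energy: float  # total strain energy, as reported in the table

    @property
    def strain_per_residue(self):
        return self.strain_energy / len(self.sequence)


def rank_by_strain(results):
    """Return results sorted from lowest to highest strain per residue."""
    return sorted(results, key=lambda result: result.strain_per_residue)


# Made-up values, for illustration only.
results = [
    LinkerResult("GGSGGS", 12.0),
    LinkerResult("GGGGS", 15.0),
    LinkerResult("EAAAK", 5.0),
]
for result in rank_by_strain(results):
    print(result.sequence, result.strain_per_residue)
```

As noted above, a low strain energy does not account for linker-protein interactions, so a ranking like this is a first filter rather than a final score.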

Show original structure in gray for comparison option

Show the original, unlinked structure in gray in the Workspace. This structure remains in the Workspace as you step through the results.

Step through results buttons

Click these buttons to display the resulting linked proteins in the Workspace in turn. The table row for the displayed protein is selected when you click the buttons.

Export button

Export the results of the calculation to a zip file. This file can be imported to display the results at some other time.

Job toolbar

Manage job submission and settings. See Job Toolbar for a description of this toolbar.

The Job Settings button opens the Protein Linker Design - Job Settings Dialog Box, where you can make settings for running the job.

Status bar

The status bar displays information about the current job settings and status for the panel. The settings include the job name, task name and task settings (if any), number of subjobs (if any), host name, and job incorporation setting. The job status can include messages about job start, job completion, and incorporation.

Use the Reset button to reset the panel to its default settings and clear any data from the panel. You can also reset the panel from the Job toolbar.

The status bar also contains the Help button, which opens the help topic for the panel in your browser. If the panel is used by one or more tutorials, hovering over the Help button displays a button that you can click to display a list of tutorials (or you can right-click the Help button instead). Choosing a tutorial opens the tutorial topic.