Multiple Sequence Viewer/Editor — Align Pane
Align sequences and structures, and configure the alignment settings.
- Features
- Additional Resources
Align Pane Features
- Align options
- Sequence alignment options
- Using option menu
- Find globally conserved residues (Pfam) option
- Set Constraints option and Clear link
- Lock Gaps option
- Superimpose structures following alignment option
- Structure alignment options
- Selected only option
Align options
Select the objects to align, either Sequences or Structures. The tools displayed below depend on this choice.
Sequence alignment options
If you select sequence alignment, the following options are shown:
- Using option menu
-
Choose the method for performing the alignment. The options available below this menu vary according to the choice you make.
- Multiple sequence alignment
-
Align the selected sequences simultaneously using ClustalW or MUSCLE. If there are columns (residues) selected, the alignment is performed only on the selected residues. You can run an alignment on several discontinuous selected regions at the same time.
There are two options: the Find globally conserved residues (Pfam) option and the Superimpose structures following alignment option. In addition, a Settings button opens a pane with further options.
- Pairwise sequence alignment
-
Align the selected sequences pairwise using a Smith-Waterman algorithm.
When multiple simultaneous pairwise alignments are requested, gaps are locked automatically after the first pair is aligned. Existing gaps in the Reference sequence are preserved, though more gaps may still be added.
There are three options: the Set Constraints option and Clear link, the Lock Gaps option, and the Superimpose structures following alignment option. In addition, a Settings button opens a pane with the following options:
- Similarity matrix type option menu
-
Choose the type of similarity matrix to use for the alignment.
- Penalties settings
-
Specify the penalty for opening a gap or extending a gap.
- Prevent gaps in secondary structure option
-
Prevent opening a gap in a secondary structure element (helix or sheet). The alignment is slower when you use this option.
- Apply penalties to initial and final gaps option
-
Apply penalties to gaps at the termini of the sequence. The default is to not penalize these gaps. Setting this option can result in isolated residues at the termini with large gaps.
- Pairwise with secondary structure prediction
-
Align the selected sequences pairwise taking into account secondary structure matching as well as profile-sequence matching. See sta—Single Template Alignment for more information.
There are two options: the Set Constraints option and Clear link, and the Superimpose structures following alignment option. In addition, a Settings button opens a pane with one further option.
- Current structure superposition
-
Align sequences according to the geometric proximity of their residues. The structures for the sequences must be already superimposed. The alignment is done by calculating the matrix of Cα–Cα distances for each sequence pair, then using these matrices for scoring to determine the alignment.
There is one option, the Superimpose structures following alignment option.
- Residue numbers
-
Align sequences so that residues with identical residue numbers (and insertion codes) are aligned. This is useful for families of proteins that share common numbering schemes, such as antibodies.
- Profile alignment
-
Align the selected sequences when the selection involves an alignment set that contains the reference sequence and another sequence or another alignment set. Alignment sets are sets of sequences in which the alignment between the sequences is preserved. This means that aligning one member of the set aligns them all.
Profile alignment cannot be performed with combined chains. Select Split chains in the View Options Pane to enable the alignment.
Most of these alignment methods are also available from the Align menu.
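As a rough illustration of what a Smith-Waterman pairwise alignment with gap-opening and gap-extension penalties computes, here is a minimal sketch. It is not the program's implementation: the function name, the simple match/mismatch scores (standing in for a similarity matrix), and the penalty values are assumptions chosen for the example. A gap of length k is assumed to cost gap_open + (k - 1) * gap_extend.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Local alignment with affine gap penalties.

    Returns (score, aligned_a, aligned_b). All scoring values are
    illustrative assumptions, not the defaults of any real program.
    """
    n, m = len(a), len(b)
    neg = float("-inf")
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best local score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # same, but ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # same, but ending with a gap in b
    best, best_pos = 0, (0, 0)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(E[i][j - 1] - gap_extend, H[i][j - 1] - gap_open)
            F[i][j] = max(F[i - 1][j] - gap_extend, H[i - 1][j] - gap_open)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best-scoring cell until the score drops to zero.
    i, j = best_pos
    state = "H"
    out_a, out_b = [], []
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            # Gap in a: consume one residue of b.
            out_a.append("-")
            out_b.append(b[j - 1])
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"  # this was the position where the gap opened
            j -= 1
        else:
            # Gap in b: consume one residue of a.
            out_a.append(a[i - 1])
            out_b.append("-")
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    score, top, bottom = smith_waterman("AAAAACCCCC", "AAAAATTCCCCC")
    print(score)
    print(top)
    print(bottom)
```

Raising gap_open relative to gap_extend favors a few long gaps over many short ones, which is the trade-off the Penalties settings control.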
- Find globally conserved residues (Pfam) option
-
Find globally conserved residues by running HMMER on the Pfam database. A Hidden Markov Model (HMM) is generated from a multiple sequence alignment and used to identify the family of the reference and provide information about which residues are conserved in the consensus sequence. Capital letters indicate highly conserved residues, lowercase letters indicate a match to the HMM, + means the match is conservative, and a blank indicates that the residue does not match the HMM.
This option is only available for Multiple sequence alignment.
- Set Constraints option and Clear link
-
Apply constraints on pairwise alignments, so that the constrained residues are in the same position (same column) after the alignment.
When you select this option, a constraint row is displayed between the reference sequence and the other sequences, and the constraints banner is displayed in the toolbar, with instructions. To add a constraint, click on a residue in the reference sequence and then on a position in one of the other sequences. The constraints are displayed as blue lines connecting the constrained residue pair. To remove a constraint, click on the constrained residue pair again.
This option is currently only available for a single pairwise alignment. If the tab contains multiple non-reference sequences, you must select the desired sequence and select the Selected only option in the Align pane. The sequence is moved to the top and the Constraints annotation row is added below the reference sequence.
Click the Clear link to clear the constraints.
This option is only available with the pairwise alignment methods.
- Lock Gaps option
-
Lock gaps in the alignment so that they are not filled when performing an alignment. If you insert a gap after locking the gaps, the new gap is not automatically locked. If you have a residue selection, the gaps are only locked in the selected region.
When multiple simultaneous pairwise alignments are requested, gaps are locked automatically after the first pair is aligned.
- Superimpose structures following alignment option
-
Superimpose the structures that are linked to the sequence using the sequence alignment. If the reference sequence is missing a structure, the first sequence with a structure is taken as the reference for the superposition.
This option is available for all alignment methods except Current structure superposition and Profile alignment.
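The Superimpose structures following alignment option relies on two pieces of bookkeeping: choosing the reference for the superposition, and working out which residues occupy the same alignment column. The sketch below illustrates both under stated assumptions. The function names, the dictionary of structure flags, and the use of "-" as the gap character are hypothetical, and the superposition itself (fitting coordinates of the paired residues) is not shown.

```python
def superposition_reference(names, has_structure, reference):
    """Pick the sequence whose structure serves as the superposition reference.

    Follows the rule described above: use the reference sequence if it has a
    structure, otherwise the first sequence (in list order) that has one.
    Returns None if no sequence has a structure.
    """
    if has_structure.get(reference, False):
        return reference
    for name in names:
        if has_structure.get(name, False):
            return name
    return None


def aligned_pairs(ref_row, other_row, gap="-"):
    """Return 0-based residue index pairs that share an alignment column.

    Both rows must come from the same alignment, so they have equal length.
    Columns where either row has a gap contribute no pair.
    """
    if len(ref_row) != len(other_row):
        raise ValueError("rows must have the same number of columns")
    pairs = []
    i = j = 0  # running residue indices in each ungapped sequence
    for r, o in zip(ref_row, other_row):
        if r != gap and o != gap:
            pairs.append((i, j))
        if r != gap:
            i += 1
        if o != gap:
            j += 1
    return pairs


if __name__ == "__main__":
    names = ["ref", "seq1", "seq2"]
    flags = {"ref": False, "seq1": True, "seq2": True}
    print(superposition_reference(names, flags, "ref"))
    print(aligned_pairs("AC-GT", "A-TGT"))
```

The index pairs would then identify which atoms (for example, the Cα atoms of the paired residues, an assumption here) to fit when superimposing the two structures.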
Structure alignment options
If you select structure alignment, the following options are shown. Some are only shown for a particular method.
If the sequence selection contains a mix of sequences that have an associated structure ("structured sequences") and sequences that do not ("structureless sequences"), the alignment is performed only on the structured sequences, after confirmation.
- Using option menu
-
Choose the method for performing the alignment.
- Current sequence alignment
-
Superimpose structures based on their sequence alignment. This method uses the Superposition Panel, with sequence identities selected as the atoms for superposition. If the reference sequence does not have a structure, the first sequence in the set for alignment that has a structure is used as the reference for the superposition.
- Protein structure alignment
-
Run the Prime Protein Structure Alignment program on the selected (or all) protein structures and return the alignments. The sequences you select must have structures associated with them. See Protein Structure Alignment Panel and Multiple Template Alignment: structalign for more information.
There are two settings: the Force Alignment (when structures are dissimilar) option and the Align sequences following superposition option. In addition, a Settings button opens a pane with the following settings:
- Each sequence represents option
-
Choose an option for the kind of structure the sequence represents, either a single chain from an entry, or the entire entry. The choice determines the settings that are displayed in the rest of the pane. If the Split Chain View option is selected, the Entire entry option is not available.
- Single chain
- Align transforms options
-
Specify the part of the structure that the alignment transforms and how the result is stored.
- Existing entries—the alignment of the single chains is applied to the entire existing entries, whose coordinates are updated to align the specified chains.
- Individual chains (new entries)—the alignment of the single chains is applied to the individual chains of the source entries, and new entries are created for each of these chains. The source entries remain unchanged.
- Map sequences to specific reference chains option and table
-
Map the sequences that are selected in the viewer (or the entire sequence alignment) to specified reference entry chains. The reference entry must have multiple chains selected for alignment and have structure. For each other entry, only one sequence must be selected for alignment. The table allows you to choose the reference chain that maps to each of the sequences, by clicking in the Reference Chain table cell and selecting the chain from the list that is shown. All of the reference chains must be mapped to a sequence before the alignment can be performed. Reference chains can be mapped to more than one sequence.
Only available if Existing entries is selected.
- Entire entry
- Use reference structure residues options
-
Select an option for the residues from the reference structure to use for the alignment.
- All—use all residues
- Selected—use the selected residues
- Matching ASL—use the residues specified with the standard picking tools. The Load Selection button loads the residue selection from the sequence list.
- Perform alignment on other structures' residues options
-
Select an option for the residues from the other structures to use for the alignment.
- As defined above—use the same residues as for the reference sequence
- Matching different ASL—use the residues specified with the standard picking tools. The Load Selection button loads the residue selection from the sequence list.
- Binding site alignment
-
Run the Prime Align Binding Sites program to align the binding sites of the selected (or all) proteins and return the alignments. The sequences you select must have structures with ligands associated with them. See Align Binding Sites Panel and Structure Alignment: align_binding_sites for more information.
There are two settings: the Defined as residues within N Å of ligand text box and the Align sequences following superposition option. In addition, a Settings button opens a pane with the following settings:
- Align selected residues only (no detection) option
-
Align the binding sites defined by the selected residues in each sequence; do not automatically detect the binding sites.
- Ignore atom pairs further apart than N Å text box
-
If the Cα atoms in a residue pair are farther apart than the specified distance, the pair is not used in the alignment process. A residue pair consists of corresponding residues from two structures that are being aligned.
- Entire structures were previously aligned option
-
Assume that a global alignment has already been done to place the structures in a common frame of reference. By default, this option is not selected and a global alignment is performed.
- Selected residues only option
-
Align only the selected residues, rather than the whole structure.
- Force Alignment (when structures are dissimilar) option
-
Align proteins even if the structures are not sufficiently similar.
- Defined as residues within N Å of ligand text box
-
Set the cutoff distance from the ligand for determining the binding site. A residue is considered to be in the binding site if any nonhydrogen ("heavy") atom in the residue is within the specified distance of any nonhydrogen atom in the ligand.
- Align sequences following superposition option
-
Align the sequences according to the structure alignment after performing the structure superposition (see Current structure superposition).
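The binding-site definition given under the Defined as residues within N Å of ligand text box can be sketched as a simple distance test. This is an illustrative example only, not the Align Binding Sites program: the function name and the tuple-based atom representation are assumptions, and hydrogens are identified here simply by the element symbol "H".

```python
import math


def binding_site_residues(protein_atoms, ligand_atoms, cutoff):
    """Return the set of residue IDs that lie in the binding site.

    protein_atoms: iterable of (residue_id, element, (x, y, z))
    ligand_atoms:  iterable of (element, (x, y, z))
    cutoff:        distance in angstroms

    A residue is included if any of its non-hydrogen atoms is within
    `cutoff` of any non-hydrogen ligand atom.
    """
    ligand_heavy = [xyz for element, xyz in ligand_atoms if element != "H"]
    site = set()
    for residue_id, element, xyz in protein_atoms:
        if element == "H" or residue_id in site:
            continue  # skip hydrogens and residues already accepted
        if any(math.dist(xyz, lig_xyz) <= cutoff for lig_xyz in ligand_heavy):
            site.add(residue_id)
    return site


if __name__ == "__main__":
    protein = [
        ("A:10", "C", (1.0, 0.0, 0.0)),   # heavy atom 4 Å from the ligand carbon
        ("A:11", "H", (4.0, 0.0, 0.0)),   # close, but a hydrogen, so ignored
        ("A:11", "N", (11.0, 0.0, 0.0)),  # heavy atom 6 Å away
        ("A:12", "O", (20.0, 0.0, 0.0)),  # near only a ligand hydrogen
    ]
    ligand = [("C", (5.0, 0.0, 0.0)), ("H", (19.0, 0.0, 0.0))]
    print(binding_site_residues(protein, ligand, cutoff=5.0))
```

Increasing the cutoff enlarges the set of residues treated as the binding site, which in turn changes which residues take part in the binding site alignment.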
Selected only option
Align only the selected non-reference sequences, or their structures. The reference sequence is always included, and sequences or structures are aligned to it. The state and availability of this option may depend on restrictions that apply to the alignment method chosen. Check the information for the method given under Using option menu, above.
If this option is deselected, all sequences are aligned.