Conformer Cluster Panel

Cluster a set of conformers using hierarchical agglomerative clustering based on the atomic or torsional RMSD of the conformers, and apply the clustering to produce a file for each cluster or a single file containing representatives of each cluster.

To open this panel: click the Tasks button and browse to Discovery Informatics and QSAR → Clustering of Conformers.

Conformer Cluster Panel Features

 

Use structures from option menu

Choose the structure source for clustering.

  • Project Table (n selected entries)—Use the entries that are currently selected in the Project Table or Entry List. The number of entries selected is shown on the menu item. An icon is displayed to the right which you can click to open the Project Table and select entries.
  • File—Use the specified file. When this option is selected, the File name text box and Browse button are displayed.
Open Project Table button

Open the Project Table panel, so you can select the entries for the structure source.

File name text box and Browse button

Enter the file name in this text box, or click Browse and navigate to the file. The name of the file you selected is displayed in the text box.

Conformer Clustering section

Set options for calculating the RMSD matrix to be used in the clustering, and run the calculation.

Cluster by options

Choose the quantity used to evaluate the RMSD matrix.

  • Atomic RMSD—Use the RMSD between corresponding atoms in the conformers. The controls for selecting comparison regions are displayed when you choose this option.

  • Torsional RMSD—Use the RMSD between corresponding torsions in the conformers. Controls for defining the torsions (dihedrals) are displayed when you choose this option.

Define regions section

The controls displayed in this section depend on whether Atomic RMSD or Torsional RMSD is selected under Cluster by.

    Select comparison regions controls

    Choose the atoms to use for calculating the atomic RMSD. These controls are displayed when you select Atomic RMSD for Cluster by.

      Workspace Selection option

      Select to use the Workspace selection as the atoms for the atomic RMSD.

      Picked Atoms option

      Pick atoms or groups of atoms in the Workspace for the atomic RMSD in conjunction with the Pick option and menu, detailed below.

      Pick option and menu

      Select this option to pick atoms or groups of atoms in the Workspace for the atomic RMSD, and choose a type of atom group from the menu, e.g. Residues.

      Custom ASL option

      Open the Atom Selection Dialog Box so that you can select the atoms for the atomic RMSD.

      Heavy Atoms + OH, SH option

      Use all heavy (non-hydrogen) atoms, plus hydrogens attached to oxygen and sulfur, for the atomic RMSD.

      Heavy Atoms option

      Use all heavy (non-hydrogen) atoms for the atomic RMSD.

      All Atoms option

      Use all atoms for the atomic RMSD.
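The three predefined atom sets above can be illustrated with a small filter. This is only a sketch: the `Atom` record (an element symbol plus the elements of its bonded neighbors) and the mode names are hypothetical and are not part of the panel or of any Schrödinger API.

```python
from typing import List, Tuple

# Hypothetical atom record: (element symbol, elements of bonded neighbors).
Atom = Tuple[str, Tuple[str, ...]]

MODES = ("all", "heavy", "heavy_oh_sh")


def select_atoms(atoms: List[Atom], mode: str) -> List[int]:
    """Return the indices of the atoms included in the RMSD for a given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    selected = []
    for index, (element, neighbors) in enumerate(atoms):
        is_heavy = element != "H"
        # Hydrogens attached to oxygen or sulfur are kept in "heavy_oh_sh".
        is_oh_sh_hydrogen = element == "H" and any(n in ("O", "S") for n in neighbors)
        if mode == "all" or is_heavy or (mode == "heavy_oh_sh" and is_oh_sh_hydrogen):
            selected.append(index)
    return selected


# Toy fragment: C, S, a hydrogen on S, and a hydrogen on C.
atoms = [("C", ("S", "H")), ("S", ("C", "H")), ("H", ("S",)), ("H", ("C",))]
print(select_atoms(atoms, "heavy"))        # [0, 1]
print(select_atoms(atoms, "heavy_oh_sh"))  # [0, 1, 2]
```

Including the OH and SH hydrogens lets the RMSD distinguish conformers that differ only in the orientation of a hydroxyl or thiol group.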

      Selected Atoms for RMSD list

      Displays the atoms that are used for comparing conformers. The list is populated by using the tools above the list. To delete rows, select them in the list and click Delete (to the right of the list). To delete all rows, click Delete All.

      More options for Atomic RMSD
        Retain mirror-image conformers option

        Treat mirror-image conformers (enantiomers) as separate structures in the clustering. If this option is not selected, enantiomers are treated as the same conformer (stereochemistry is ignored).

        Check structure equivalence for large molecules (>300 atoms) option

        Check that the structures are actually conformers. This is not done by default for structures with more than 300 atoms, to save time. If you want to perform the check on structures with more than 300 atoms, select this option.

        Output single structure per cluster with most options and menu

        Select this option to choose the representative structure from a cluster on the basis of an energy property. Choose the property from the option menu, and select the option for whether the desirable property value is the more negative or the more positive of the values in the cluster.
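Choosing a representative by an energy property amounts to taking the minimum or the maximum of that property within each cluster. The sketch below assumes hypothetical inputs: a mapping from cluster number to conformer names, and a mapping from conformer name to property value.

```python
from typing import Dict, List


def pick_representatives(clusters: Dict[int, List[str]],
                         energies: Dict[str, float],
                         prefer_most_negative: bool = True) -> Dict[int, str]:
    """Pick one conformer per cluster by its property value."""
    chooser = min if prefer_most_negative else max
    return {
        cluster_id: chooser(members, key=lambda name: energies[name])
        for cluster_id, members in clusters.items()
    }


clusters = {1: ["conf_a", "conf_b"], 2: ["conf_c"]}
energies = {"conf_a": -12.5, "conf_b": -14.0, "conf_c": -9.1}
print(pick_representatives(clusters, energies))
# {1: 'conf_b', 2: 'conf_c'}
```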

        Calculate RMSD only (without superimposing structures) option

        If using atomic RMSD to calculate the matrix, calculate it without superimposing the structures. If this option is not selected, a superposition is performed on the structures first. You might want to use this option if the conformers are already in the desired relative positions and do not need to be superimposed (e.g. for aligned proteins, docked poses).
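As an illustration of the RMSD-only calculation, the following sketch computes the atomic RMSD between conformers in their current positions and assembles the pairwise matrix. It assumes each conformer is a list of (x, y, z) tuples with atoms in the same order; the function names are illustrative only.

```python
import math
from typing import List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rmsd_in_place(a: Coords, b: Coords) -> float:
    """RMSD over corresponding atoms, with no superposition applied."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("conformers must have the same, nonzero number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(a, b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(a))


def rmsd_matrix(conformers: Sequence[Coords]) -> List[List[float]]:
    """Symmetric matrix of pairwise in-place RMSD values."""
    n = len(conformers)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd_in_place(conformers[i], conformers[j])
    return matrix


reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
print(rmsd_matrix([reference, shifted])[0][1])
```

The second structure is a rigid translation of the first, yet the in-place RMSD is nonzero. This is why the option suits only structures that are already in a common frame, such as docked poses.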

    Define torsions controls

    Choose the torsions (dihedrals) to use for calculating the torsional RMSD. These controls are displayed when you select Torsional RMSD for Cluster by. The following options are available for choosing the torsions:

      Picked Atoms option

      Pick atoms or groups of atoms in the Workspace for the torsional RMSD in conjunction with the Pick option and menu, detailed below.

      Pick option and menu

      Select this option to pick atoms or groups of atoms in the Workspace for the torsional RMSD, and choose a type of atom group from the menu:

      • Atoms—Pick four atoms in the Workspace to define a torsion. When you have picked four atoms, the torsion is added to the list and you can continue picking to define more torsions.
      • Bonds—Pick three bonds in the Workspace to define a torsion. When you have picked three bonds, the torsion is added to the list and you can continue picking to define more torsions.
      • Residues—Pick residues in the Workspace to add all the torsions (backbone and side chain) in the residues to the list.
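A torsion defined by four picked atoms is the angle between the plane of atoms 1, 2, 3 and the plane of atoms 2, 3, 4. The sketch below uses the standard atan2 formula to compute it from coordinates; the helper names are illustrative and do not correspond to any panel function.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed torsion angle for atoms p0-p1-p2-p3, in the range -180 to 180."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Eclipsed (cis) and anti (trans) arrangements around the p1-p2 bond.
cis = dihedral_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0))
trans = dihedral_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 1.0, 0.0))
print(cis, abs(trans))
```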
      Dihedrals option and menu

      Add the torsion quartets that meet the criteria of the main-chain dihedral angle option chosen from the menu.

      Custom ASL option

      Select this option to choose protein dihedrals or use the Atom Selection Dialog Box to select atoms that define torsions.

      All (include terminal dihedrals) option

      Include all dihedrals in the list, including those to terminal atoms (such as methyl hydrogens).

      All (exclude terminal dihedrals) option

      Include all dihedrals in the list except those to terminal atoms (such as methyl hydrogens).

      All torsions along rotatable bonds option

      Include dihedrals in the list for rotatable bonds only. This option excludes ring dihedrals, for example.

      Selected Torsions for RMSD list

      Displays the torsions (each given as a quadruple of atom numbers) that are used for comparing conformers. The list is populated by using the tools above the list. To delete rows, select them in the list and click Delete (to the right of the list). To delete all rows, click Delete All.
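Because torsion angles are periodic, a torsional RMSD has to compare angles on the circle: 179 degrees and -179 degrees differ by 2 degrees, not 358. The following sketch shows one common way to do this, assuming each conformer is described by a list of torsion values in degrees, in the same order as the list above. Whether the panel uses exactly this definition is an assumption.

```python
import math
from typing import Sequence


def angle_difference(a: float, b: float) -> float:
    """Smallest signed difference a - b in degrees, wrapped to (-180, 180]."""
    d = (a - b) % 360.0
    return d - 360.0 if d > 180.0 else d


def torsional_rmsd(torsions_a: Sequence[float], torsions_b: Sequence[float]) -> float:
    """RMSD over corresponding torsions, using wrapped angle differences."""
    if len(torsions_a) != len(torsions_b) or len(torsions_a) == 0:
        raise ValueError("torsion lists must have the same, nonzero length")
    total = sum(angle_difference(a, b) ** 2 for a, b in zip(torsions_a, torsions_b))
    return math.sqrt(total / len(torsions_a))


print(angle_difference(179.0, -179.0))                 # -2.0
print(torsional_rmsd([179.0, 60.0], [-179.0, 60.0]))   # sqrt(2), about 1.414
```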

      More options for Torsional RMSD
        Retain mirror-image conformers option

        Treat mirror-image conformers (enantiomers) as separate structures in the clustering. If this option is not selected, enantiomers are treated as the same conformer (stereochemistry is ignored).

        Check structure equivalence for large molecules (>300 atoms) option

        Check that the structures are actually conformers. This is not done by default for structures with more than 300 atoms, to save time. If you want to perform the check on structures with more than 300 atoms, select this option.

        Output single structure per cluster with most options and menu

        Select this option to choose the representative structure from a cluster on the basis of an energy property. Choose the property from the option menu, and select the option for whether the desirable property value is the more negative or the more positive of the values in the cluster.

    Linkage method option menu

    Choose a linkage method for clustering from the following:

    Single             Shortest distance between inter-cluster pairs. Produces diffuse, elongated clusters.
    Complete           Longest distance between inter-cluster pairs. Produces compact, spherical clusters.
    Average            Average distance between all inter-cluster pairs.
    Centroid           Euclidean distance between cluster centroids.
    McQuitty           Average distance to the two clusters merged in forming a given cluster.
    Ward               Sum of squared distances to merged cluster centroid (minimum variance).
    Weighted Centroid  Weighted center of mass distance, also known as median.
    Flexible beta      Weighted average intra-cluster and inter-cluster distances (Lance-Williams) with beta=0.25.
    Schrödinger        Closest distance between terminal (right-to-left) points in 1D cluster orderings.

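The first three linkage methods differ only in how the pairwise distances between the members of two clusters are reduced to a single number. The sketch below illustrates this for Single, Complete, and Average linkage on a small distance matrix; the function and its `method` argument are illustrative, and the other methods in the list are not covered.

```python
from typing import List, Sequence


def linkage_distance(dist: Sequence[Sequence[float]],
                     cluster_a: List[int],
                     cluster_b: List[int],
                     method: str) -> float:
    """Distance between two clusters, given member indices into dist."""
    pairs = [dist[i][j] for i in cluster_a for j in cluster_b]
    if method == "single":
        return min(pairs)               # shortest inter-cluster pair
    if method == "complete":
        return max(pairs)               # longest inter-cluster pair
    if method == "average":
        return sum(pairs) / len(pairs)  # mean over all inter-cluster pairs
    raise ValueError(f"unsupported method: {method}")


# Symmetric RMSD-like matrix for four conformers.
dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 2.0, 6.0],
    [4.0, 2.0, 0.0, 3.0],
    [5.0, 6.0, 3.0, 0.0],
]
for method in ("single", "complete", "average"):
    print(method, linkage_distance(dist, [0, 1], [2, 3], method))
# single 2.0, complete 6.0, average 4.25
```
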
    Number of clusters to generate text box

    Specify the number of clusters to generate.
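In agglomerative clustering, every conformer starts as its own cluster and the two closest clusters are merged repeatedly until the requested number remains. The following is a deliberately naive sketch of that loop, not the panel's implementation. The `reduce_pairs` argument is a hypothetical hook: `max` gives complete linkage and `min` gives single linkage.

```python
from typing import Callable, Iterable, List, Sequence


def agglomerate(dist: Sequence[Sequence[float]],
                n_clusters: int,
                reduce_pairs: Callable[[Iterable[float]], float] = max) -> List[List[int]]:
    """Merge the closest pair of clusters until n_clusters remain."""
    if not 1 <= n_clusters <= len(dist):
        raise ValueError("n_clusters must be between 1 and the number of conformers")
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index of first cluster, index of second cluster)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = reduce_pairs(dist[i][j] for i in clusters[x] for j in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Distances between five points on a line at 0, 1, 10, 11, and 50.
points = [0.0, 1.0, 10.0, 11.0, 50.0]
dist = [[abs(p - q) for q in points] for p in points]
print(agglomerate(dist, 3))  # [[0, 1], [2, 3], [4]]
print(agglomerate(dist, 2))  # [[0, 1, 2, 3], [4]]
```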

    Output Structures to options

    Choose output destination and settings.

    • Project Table
      • All structures into clusters—incorporate all the structures; each cluster of structures is imported as an entry group.
      • One structure per cluster (closer to the centroid)—incorporate the structure that is nearest to the centroid of each cluster.
    • Working Directory
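When only a distance matrix is available, one common stand-in for "the structure nearest to the centroid" is the medoid: the member with the smallest total distance to the other members of its cluster. The sketch below uses that definition as an assumption; the panel may determine the centroid differently.

```python
from typing import List, Sequence


def medoid(dist: Sequence[Sequence[float]], members: List[int]) -> int:
    """Index of the member with the smallest summed distance to the others."""
    if not members:
        raise ValueError("cluster has no members")
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


# Distances between four points on a line at 0, 1, 2, and 10.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(p - q) for q in points] for p in points]
print(medoid(dist, [0, 1, 2]))  # 1
```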

Result Analysis section

View the results of the clustering in various plots.

Clustering Statistics button

Display a plot of various statistics of the clustering as a function of the number of clusters, in the Clustering Statistics panel. The statistics are: Kelley penalty, R-squared, Semipartial R-squared, Merge distance, Separation ratio. You can click in the plot to set the number of clusters in the Number of clusters text box.

Dendrogram button

Display a dendrogram of the hierarchy of clusters, in the Dendrogram panel. You can click in the plot to set the number of clusters in the Number of clusters text box.
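Choosing a position in a dendrogram corresponds to cutting the hierarchy at a merge distance: every merge at or below the cut has happened, and each one reduces the cluster count by one. The sketch below assumes a hypothetical list of merge distances, one per merge, and a linkage method whose merge distances never decrease (Centroid linkage, for example, can violate this).

```python
from typing import Sequence


def clusters_at_height(merge_distances: Sequence[float], height: float) -> int:
    """Number of clusters left after applying all merges at or below height."""
    n_structures = len(merge_distances) + 1
    merges_applied = sum(1 for d in merge_distances if d <= height)
    return n_structures - merges_applied


# Four merges join five conformers into a single cluster.
merge_distances = [1.0, 1.0, 11.0, 40.0]
print(clusters_at_height(merge_distances, 5.0))   # 3
print(clusters_at_height(merge_distances, 40.0))  # 1
```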

Distance matrix button

Display the distance matrix used for clustering graphically, with values represented by a color map, in the Distance Matrix panel. You can display the matrix in cluster order (as shown in the Dendrogram panel) or in the original (input) order. You can click in the plot to display the 2D structures in the panel, and optionally in the Workspace.
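Displaying the matrix in cluster order is a simultaneous permutation of its rows and columns, which brings members of the same cluster next to each other so that clusters appear as low-distance blocks on the diagonal. A minimal sketch, assuming `order` is the hypothetical list of original indices in dendrogram order:

```python
from typing import List, Sequence


def reorder_matrix(dist: Sequence[Sequence[float]], order: Sequence[int]) -> List[List[float]]:
    """Apply the same permutation to the rows and columns of dist."""
    if sorted(order) != list(range(len(dist))):
        raise ValueError("order must be a permutation of the matrix indices")
    return [[dist[i][j] for j in order] for i in order]


# Conformers 0 and 2 are close, as are 1 and 3.
dist = [
    [0.0, 9.0, 1.0, 8.0],
    [9.0, 0.0, 7.0, 2.0],
    [1.0, 7.0, 0.0, 6.0],
    [8.0, 2.0, 6.0, 0.0],
]
for row in reorder_matrix(dist, [0, 2, 1, 3]):
    print(row)
```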