Batch Homology Modeling Using the Multiple Sequence Viewer/Editor
Tutorial Created with Software Release: 2024-2
Topics: Antibody Design, Biologics Drug Discovery, Enzyme Engineering, Structure Prediction & Target Enablement
Products Used: BioLuminate
This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms are shown like this: Workspace
Abstract:
In this tutorial, you will learn how to build multiple homology models from a single template using the Multiple Sequence Viewer/Editor.
Tutorial Content
1. Creating Projects and Importing Structures
At the start of the session, change the file path to your chosen Working Directory in Maestro to make file navigation easier. Each session in Maestro begins with a default Scratch Project, which is not saved. A Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is created, the project is automatically saved each time a change is made.
Structures can be imported from the PDB directly, or from your Working Directory using File > Import Structures, and are added to the Entry List and Project Table. The Entry List is located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.
- Double-click the BioLuminate icon
- (No icon? See Starting Maestro)
- Go to File > Change Working Directory
- Find your directory, and click Choose
- Pre-generated input and results files are included for running jobs or examining output. Download the zip file here: https://www.schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/batch_homology.zip
- After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial
- Go to File > Save Project As
- Change the File name to batch_homology, then click Save
- The project is now named batch_homology.prj
2. Loading and Analyzing Sequences in the Multiple Sequence Viewer/Editor
In this section, we will load 8 sequences into the Multiple Sequence Viewer/Editor, perform a multiple sequence alignment, generate a logo plot, and calculate and display several sequence-based descriptors that can be used to triage large batches of sequences.
- Go to Tasks > Biologics > Multiple Sequence Viewer/Editor
- The Multiple Sequence Viewer/Editor opens
- The sequences of the entries included in the Workspace are shown
- In the Multiple Sequence Viewer/Editor, go to File > Import Sequences from File
- Select the file pneumolysin_seqset.fasta
- Click Open
- The sequences are loaded into the Multiple Sequence Viewer/Editor
- Click Align
- For Using, choose Multiple Sequence Alignment
Note: Make sure Selected only is unselected
- Click Align
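Before or after importing, it can help to confirm what a FASTA file contains. The following is a minimal sketch of a FASTA reader using only the Python standard library; it is independent of the Multiple Sequence Viewer/Editor, and the function name `read_fasta` and the toy records are illustrative assumptions, not part of the tutorial files.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Toy stand-in for a FASTA file. These records are placeholders,
# not the sequences shipped with the tutorial.
example = StringIO(">seq_a\nMKTAYIAK\nQRQISF\n>seq_b\nMKTAYLAKQRQ\n")
records = list(read_fasta(example))
for name, seq in records:
    print(name, len(seq))
```

To inspect the tutorial input instead, the same function could be called on `open("pneumolysin_seqset.fasta")` from your Working Directory; the tutorial states that this file holds 8 sequences.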
Now that the sequences are aligned, we can look at the degree of conservation across the alignment using a Sequence Logo chart.
- Hover over the chart icon and click the …
- Click Sequence Logo
- The Sequence Logo plot is now visible above the sequence
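For readers curious about what a sequence logo encodes, the sketch below computes a per-column information content in bits, the quantity that commonly sets letter-stack height in a logo. It assumes a 20-letter protein alphabet, ignores gap characters, and omits the small-sample correction some logo tools apply; it is not a description of how the Multiple Sequence Viewer/Editor draws its chart. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(aligned, alphabet_size=20):
    """Return per-column information content (bits) for aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    bits = []
    for i in range(length):
        # Collect the residues in this column, skipping gap characters.
        residues = [s[i] for s in aligned if s[i] != "-"]
        if not residues:
            bits.append(0.0)
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # Fully conserved columns reach the maximum, log2(alphabet_size).
        bits.append(math.log2(alphabet_size) - entropy)
    return bits


# Toy alignment (placeholder sequences, not the tutorial data).
toy = ["MKT-A", "MKS-A", "MRTLA", "MKTLG"]
info = column_information(toy)
for position, value in enumerate(info, start=1):
    print(position, round(value, 2))
```

Columns where every sequence carries the same residue score highest, which corresponds to the tallest stacks in a logo.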
We are now going to compute sequence descriptors for all of our sequences. For a complete list of the available protein sequence descriptors, along with explanations and references, see the Protein Sequence Descriptors documentation page
- Go to Other Tasks > Compute Sequence Descriptors
Note: All of the descriptors in the table will be calculated by default. Click Add to calculate additional descriptors
- Click OK
- The Calculate Sequence Descriptor job is launched
Note: The descriptors will not be added to the Multiple Sequence Viewer/Editor by default. A pop-up will appear letting you know the job is completed
- Click Add
- Select Descriptors
- Type and select Bulkiness, Relative Mutability, and Transmembrane Tendency
- The three descriptors have been added to the Multiple Sequence Viewer/Editor
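As a rough illustration of what a scale-based sequence descriptor can look like, the sketch below averages a per-residue scale over a whole sequence and over a sliding window. It uses the Kyte-Doolittle hydropathy scale as a stand-in, because the exact scales and windowing used by the Compute Sequence Descriptors task are not given in this tutorial; treat the function names, the window size, and the choice of scale as assumptions.

```python
# Kyte-Doolittle hydropathy values, used here only as an example scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_scale(seq, scale):
    """Average a per-residue scale over a sequence, skipping unknown letters."""
    values = [scale[r] for r in seq if r in scale]
    return sum(values) / len(values) if values else 0.0


def window_profile(seq, scale, window=5):
    """Return the mean scale value for each full window along the sequence."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        mean_scale(seq[i:i + window], scale)
        for i in range(len(seq) - window + 1)
    ]


# Placeholder peptide, not one of the tutorial sequences.
peptide = "MKTAYIAKQRQISF"
print(round(mean_scale(peptide, KYTE_DOOLITTLE), 3))
print([round(v, 2) for v in window_profile(peptide, KYTE_DOOLITTLE)])
```

A single averaged value per sequence is convenient for triaging a batch, while the windowed profile shows where along the sequence the property is concentrated.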
As we aren’t going to do anything with the calculated descriptors at the moment, we can hide them from the Multiple Sequence Viewer/Editor.
- Click the + icon
- Click Show properties
- Click Hide all
- The properties are now removed from the Multiple Sequence Viewer/Editor
3. Building Batch Homology Models
In this section, we will use a single template structure that we identify using a BLAST search in order to build homology models for the 8 loaded sequences. Batch homology modeling is appropriate only for a set of sequences with high identity.
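One way to check the high-identity requirement is to compute pairwise percent identity from the aligned sequences. The sketch below does this in plain Python, counting only columns where both sequences have a residue. The function names, the toy alignment, and the 90% cutoff are illustrative assumptions; the tutorial does not state a numeric threshold.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def low_identity_pairs(aligned, threshold=90.0):
    """Return (name_a, name_b, identity) for pairs below the threshold."""
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(aligned.items(), 2):
        identity = percent_identity(seq_a, seq_b)
        if identity < threshold:
            flagged.append((name_a, name_b, identity))
    return flagged


# Toy aligned sequences (placeholders, not the tutorial data).
alignment = {
    "seq_a": "MKTAYIAKQR",
    "seq_b": "MKTAYLAKQR",
    "seq_c": "MRSAY-GKHR",
}
for name_a, name_b, identity in low_identity_pairs(alignment):
    print(name_a, name_b, round(identity, 1))
```

Pairs reported by `low_identity_pairs` would be candidates for separate modeling runs rather than a single batch.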
- In the Multiple Sequence Viewer/Editor, go to Other Tasks > Build Homology Model
- The Build Homology Model panel opens
- Click Find
- A dialog appears
- Click the cog icon
- Uncheck Use local server only
Note: This requires internet access. The ‘Use local server only’ option is checked by default to prevent your BLAST searches and related tasks from going out to remote servers. To allow remote access (after a confirmation), clear this option. It is also available from the top-level Edit → Settings and Defaults menu.
- Click Run Search
- A message appears requesting remote access. Click OK
- A BLAST search is launched
- The search may take 1-2 minutes to complete
Many templates have 100% sequence identity, and one of those would be the best choice in a real project. For pedagogical reasons we will select 5AOD, which itself has very high sequence identity.
- Select 5AOD_A
- Click Import
- 5AOD_A has been added to the Multiple Sequence Viewer/Editor
- Click Set as Reference
- 5AOD is set as the reference and is now at the top of the list
- For Job name, write homology_modeling_batch
- Click Generate Model
- This job will take ~ minutes to complete
- 8 models will be created
- Shift-click in the Entry List to include all eight homology modeling outputs
- The structures are included in the Workspace
- Cyan ribbons mark positions where the backbone conformation is copied from the template and the side chain is mutated
After building homology models in batch, the following are common next steps:
- Calculate Protein Descriptors - If you have an observable (experimental endpoint) associated with the sequences (and now structures), you can load the descriptors along with the observable into an ML engine to build a model that associates some combination of the descriptors with the observable (or to look at feature importance/selection). See this paper for an example.
- Run protein_patch_calculation.py to run Protein Surface Analysis in bulk from the command line
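As a very simple first pass at the descriptor-versus-observable idea in the first bullet, the sketch below ranks descriptors by the absolute Pearson correlation with an observable. It is a stand-in for a real ML workflow, not a recommended method, and every name and number in it is a made-up placeholder rather than output from this tutorial.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_descriptors(table, observable):
    """Sort descriptor names by |correlation| with the observable, strongest first."""
    scores = {name: pearson(values, observable) for name, values in table.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Placeholder values for illustration only; not computed from any real sequence.
descriptors = {
    "descriptor_1": [1.0, 2.0, 3.0, 4.0],
    "descriptor_2": [2.0, 1.0, 4.0, 3.0],
    "descriptor_3": [5.0, 5.0, 5.0, 5.0],
}
observable = [1.1, 1.9, 3.2, 3.8]
for name, score in rank_descriptors(descriptors, observable):
    print(name, round(score, 3))
```

With only a handful of sequences, any such ranking is exploratory at best; a larger data set and proper validation would be needed before drawing conclusions.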
4. Conclusion and References
In this tutorial, we analyzed a series of sequences by first aligning them to a reference, then displaying a sequence logo plot and several sequence-based descriptors. We then successfully built homology models for all of the sequences using a very close homolog.
For further learning:
- Chimeric Homology Modeling Using the Multiple Sequence Viewer/Editor
- Introduction to Structure Preparation and Visualization
- Introduction to Computational Antibody Engineering online course (Course Page | Preview)
- Introduction to Molecular Modeling in Drug Discovery online course (Course Page | Preview)
For further reading:
5. Glossary of Terms
Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion
included - the entry is represented in the Workspace, the circle in the In column is blue
incorporated - once a job is finished, output files from the Working Directory are added to the project and shown in the Entry List and Project Table
Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data
Scratch Project - a temporary project in which work is not saved; closing a scratch project removes all current work and begins a new scratch project
selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries
Working Directory - the location where files are saved
Workspace - the 3D display area in the center of the main window, where molecular structures are displayed