Preparing Nucleic Acid Structures

Tutorial Created with Software Release: 2025-4

Topics: Small Molecule Drug Discovery

Products Used: Maestro

Tutorial files

291 KB

This tutorial is written for use with a 3-button mouse with a scroll wheel.

Words found in the Glossary of Terms are defined at the end of this tutorial, for example: Workspace, the 3D display area in the center of the main window, where molecular structures are displayed.

Abstract:

This tutorial will guide you through essential steps of preparing nucleic acid structures to resolve issues before carrying out any modeling-related tasks.

Tutorial Content

Introduction to Structure Preparation

Creating Projects and Importing Structures

Preparing the Structure

Conclusion and References

Glossary of Terms

1. Introduction to Structure Preparation

Structure files obtained from the PDB, vendors, and other sources often lack information needed for modeling-related tasks, mostly because of experimental limitations of structural biology techniques and insufficient or ambiguous data. Typically, these files are missing hydrogens, partial charges, side chains, and/or whole loop regions. To make these structures suitable for modeling tasks, you will use the Protein Preparation Workflow to find and resolve common structural issues.

The Protein Preparation Workflow in Maestro involves a series of structural and functional checks to prepare structures for accurate molecular modeling and simulations. First, the structure is assessed for missing atoms or residues which are then added or corrected. Hydrogen atoms are added to ensure proper protonation states, and the structure is optimized for correct bond lengths, angles, and torsions. The overall geometry is checked for any steric clashes or unusual bond geometries. Functional checks include evaluating the protonation states of ionizable residues at the relevant pH and ensuring proper orientation of active site residues. The final structure is minimized to relieve any unfavorable interactions and prepare it for further computational analysis, such as docking or molecular dynamics simulations.

This tutorial uses the 6DN3 RNA structure, which is a flavin mononucleotide (FMN) riboswitch bound to a flavin analog ligand. The FMN riboswitch is a structured RNA element that regulates riboflavin concentrations: upon binding the FMN ligand, it controls genes related to riboflavin biosynthesis and transport. In this tutorial, you will learn how to prepare nucleic acids for modeling applications.

2. Creating Projects and Importing Structures

At the start of the session, change the file path to your chosen Working Directory in Maestro to make file navigation easier. Each session in Maestro begins with a default Scratch Project, which is not saved. A Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is saved, it is automatically saved each time a change is made.

Structures can be built in Maestro or imported using File > Import Structures (or by drag and drop), and are added to the Entries and the Project Table. The Entries list is located to the left of the Workspace. The Project Table can be opened with Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.

Double-click the Maestro icon.
- (No icon? See Starting Maestro)

Figure 2-1. Change Working Directory option.

Go to File > Change Working Directory.
Find your directory, and click Choose.
Pre-generated files are included for running jobs or examining output. Download the zip file here: link
After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial.

Figure 2-2. Saving the project in the Save Project panel.

Go to File > Save Project As.
Change the File name to RNA_preparation.
Click Save.
- The project is now named RNA_preparation.prj.

Figure 2-3. The Get PDB File dialog box.

Go to File > Get PDB.
- The Get PDB File dialog box opens.
For PDB IDs, type 6DN3.
Click Download.
- A new entry titled 6DN3 is added to the Entries and a banner appears confirming successful import of the structure.
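
If you prefer to fetch the same structure file outside Maestro, the following minimal sketch builds the public RCSB download URL for a PDB ID and optionally saves the file. The function names `pdb_download_url` and `fetch_pdb` are hypothetical, and this is only an illustration; it is not a description of how the Get PDB File dialog box retrieves structures.

```python
from pathlib import Path
from urllib.request import urlopen

RCSB_DOWNLOAD_BASE = "https://files.rcsb.org/download"  # public RCSB file service


def pdb_download_url(pdb_id: str, extension: str = "pdb") -> str:
    """Build the RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a valid 4-character PDB ID: {pdb_id!r}")
    return f"{RCSB_DOWNLOAD_BASE}/{pdb_id}.{extension}"


def fetch_pdb(pdb_id: str, directory: str = ".") -> Path:
    """Download the structure file into `directory` (requires network access)."""
    url = pdb_download_url(pdb_id)
    target = Path(directory) / url.rsplit("/", 1)[-1]
    with urlopen(url) as response:
        target.write_bytes(response.read())
    return target


if __name__ == "__main__":
    # Only builds the URL; call fetch_pdb("6DN3") to download the file.
    print(pdb_download_url("6dn3"))
```

A file saved this way can then be brought into the project with File > Import Structures.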

Note: Imported structures in Maestro are included in the Workspace and selected in the Entries by default. Please refer to the Glossary of Terms for the difference between included and selected.

3. Preparing the Structure

In this section, you will prepare the 6DN3 RNA structure using the Protein Preparation Workflow panel to make it suitable for modeling tasks performed later in this tutorial. Despite the name Protein Preparation Workflow, this panel in Maestro involves a series of structural and functional checks that can be used to prepare both proteins and nucleic acids (DNA/RNA) for accurate modeling and simulations. For more information, see the Protein Preparation Workflow panel documentation.

Figure 3-1. The Protein Preparation Workflow in Tasks.

Find and select Protein Preparation Workflow in Tasks.
- The Protein Preparation Workflow panel opens in the Preparation Workflow tab.

Note: You can also click Protein Preparation in the Favorites Toolbar.

Figure 3-2. The Valences tab in Diagnostics showing valence errors in the structure.

Before preparing the structures, you should check for potential issues.

Go to the Diagnostics tab.
Click Check Workspace Entry.

The Valences tab shows valence errors present in the structure. These are caused by missing hydrogen atoms or incorrect bond assignments and will be resolved during structure preparation.
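
As a rough illustration of why valence errors appear, the sketch below counts the elements in the ATOM and HETATM records of a PDB-format file. It assumes the fixed-column PDB layout, with the element symbol in columns 77-78. The function names `count_elements` and `lacks_hydrogens` are hypothetical, and this is not how the Diagnostics tab works internally.

```python
from collections import Counter
from typing import Iterable


def count_elements(pdb_lines: Iterable[str]) -> Counter:
    """Count element symbols in the ATOM/HETATM records of a PDB-format file.

    Assumes the fixed-column PDB layout, where columns 77-78 hold the
    right-justified element symbol.
    """
    counts: Counter = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            element = line[76:78].strip().upper()
            if element:
                counts[element] += 1
    return counts


def lacks_hydrogens(pdb_lines: Iterable[str]) -> bool:
    """Return True when no hydrogen (or deuterium) atoms are present."""
    counts = count_elements(pdb_lines)
    return counts["H"] == 0 and counts["D"] == 0
```

Passing an open file handle for a downloaded structure to `lacks_hydrogens` returns True when the file contains no hydrogens, which is one common source of the valence errors reported here.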

Figure 3-3. The Missing tab in Diagnostics showing missing atoms.

The Missing tab shows a few missing heavy atoms, including the terminal phosphate groups and a Uracil residue.

Note: If following along with your own structure, see the Protein Preparation Workflow panel help for how to use the other diagnostics tabs.

Also note that the structure and quality metrics in the Reports tab are specific for protein structures, and not applicable to nucleic acid structures.

It is important to note that the 6DN3 structure itself is a split RNA, representing a segment of a larger functional molecule. In many cases, crystallographic structures are derived from constructs that are truncated or represent only a portion of the complete biological sequence to facilitate crystallization. This can result in incomplete electron density maps for flexible terminal regions, such as 5’ or 3’ ends, leading to unresolved atoms or residues in the deposited PDB file. As these missing atoms are located at the extremities of the crystallized fragment, their absence is unlikely to significantly impact the structural integrity or functional analysis of the core riboswitch fold or its ligand-binding interactions. Therefore, manual rebuilding of these terminal missing atoms or residues is generally not necessary.
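
Deposited PDB-format files usually list unresolved residues in REMARK 465 records. The sketch below collects them so you can judge whether the gaps sit at the ends of the chain. It assumes the standard whitespace-separated layout of those records (residue name, chain, residue number); the names `MissingResidue` and `missing_residues` are hypothetical, and the sample text is made up for illustration rather than copied from 6DN3.

```python
from typing import List, NamedTuple


class MissingResidue(NamedTuple):
    resname: str
    chain: str
    resnum: int


def missing_residues(pdb_text: str) -> List[MissingResidue]:
    """Collect residues listed as unresolved in REMARK 465 records.

    Header and explanatory lines inside the REMARK 465 block are skipped
    because they do not end in an integer residue number. Residue numbers
    with an attached insertion code are also skipped by this simple parser.
    """
    found = []
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK 465"):
            continue
        fields = line[10:].split()
        # Data lines carry resname, chain, resnum, optionally after a model number.
        if len(fields) in (3, 4):
            resname, chain, resnum = fields[-3:]
            try:
                found.append(MissingResidue(resname, chain, int(resnum)))
            except ValueError:
                continue  # for example, the "M RES C SSSEQI" column header
    return found


# Made-up sample records, for illustration only.
SAMPLE = """\
REMARK 465 MISSING RESIDUES
REMARK 465   M RES C SSSEQI
REMARK 465       G A     1
REMARK 465       U A   112
"""

if __name__ == "__main__":
    for residue in missing_residues(SAMPLE):
        print(residue)
```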

Figure 3-4. Reviewing the structure before preparation.

You can review the contents of the structure and remove crystallographic artifacts, solvents, etc. as needed.

Go to the Substructures tab.
Click Load Workspace Entry.
- The tables are populated.
Ctrl+Click (Cmd+Click) to select all the ions (CL XXX, MG XXX and K XXX) in the hets (Ligands, Metals, Other) table.
Click Delete from entry.
- A new entry 6DN3 - with-deletions is added to the Entries and is included in the Workspace.

Ions such as K⁺, Mg²⁺, and Cl⁻ are often added during crystallization. While ions are crucial for nucleic acid structure and function, our current best practice is to remove experimentally resolved ions from PDB structures and add appropriate ions in silico during MD and FEP+ simulations. This recommendation is based on several observations:

- This approach maintains nucleic acid structural stability and yields binding affinities consistent with experiment (see this publication).
- In silico–added cations localize near the nucleic acid backbone in simulations (see this publication).
- Many crystallographic ions are artifacts used to stabilize structures and do not reflect dynamic RNA–ion interactions (see this publication).
- Monovalent ions interact diffusely and dynamically with RNA, while divalent ions form localized hydration shells (see this, this, and this publications). Importantly, precise ion positions are not essential for structural integrity.

We recommend adding counterions and physiological salt concentrations during system setup for MD and FEP+ calculations. Please note that while these are our current best practices, we continue to investigate the role of ions in RNA modeling and will update our recommendations as new information becomes available.
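
For readers who script their structure cleanup, the ion removal described above can be sketched on a PDB-format file as follows. This assumes the fixed-column PDB layout (residue name in columns 18-20) and the residue names CL, MG, and K shown in the hets table; the function name `strip_ions` is hypothetical, and the sketch is not what the Delete from entry button does internally.

```python
from typing import Iterable, List

ION_RESNAMES = frozenset({"CL", "MG", "K"})  # residue names listed in the hets table


def strip_ions(pdb_lines: Iterable[str], ion_resnames=ION_RESNAMES) -> List[str]:
    """Return the lines with HETATM records for the given ion residue names removed.

    Assumes the fixed-column PDB layout: residue name in columns 18-20.
    """
    kept = []
    for line in pdb_lines:
        is_ion = line.startswith("HETATM") and line[17:20].strip() in ion_resnames
        if not is_ion:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Stub records holding only the columns this sketch reads.
    stub = [
        "ATOM".ljust(17) + "  G",
        "HETATM".ljust(17) + " MG",
        "HETATM".ljust(17) + "  K",
        "HETATM".ljust(17) + "HOH",
    ]
    for line in strip_ions(stub):
        print(line)
```

Like the panel step, this leaves waters and the ligand in place. Other record types that may refer to the deleted ions, such as CONECT, are not updated by this sketch.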

Figure 3-5. Running the preparation job.

Return to the Preparation Workflow tab.

We recommend leaving the crystallographic waters in place during preparation.

Under Minimize and Delete Waters, click Settings.
For Delete waters, uncheck Distant from ligands (hets).
Change the Job name to RNAprep_6DN3.
Click Run.
- This job takes ~2 minutes.
- A new group RNAprep_6DN3-out is added to the Entries.
Close the Protein Preparation Workflow panel.

The default settings of the Protein Preparation Workflow can generally be applied to nucleic acids. However, a few options in the panel are specific to proteins. The Capping termini option adds N-acetyl (ACE) and N-methyl amide (NMA) groups to the uncapped, charged amino group at the N-terminus and the carboxyl group at the C-terminus, respectively. Nucleic acids have chemically different termini, which are not recognized or capped. In addition, the Fill in missing side chains option does not apply to nucleic acids.

Figure 3-6. Duplicating the prepared structure.

For many methods using static receptor structures, you will need a ‘dry’ receptor with all water molecules removed.

Include 6DN3 – prepared in the Workspace.
Right-click 6DN3 – prepared and choose Duplicate > Entries Only (In Place).

Figure 3-7. Renaming the duplicated structure.

Double-click the duplicated 6DN3 – prepared entry and rename it to 6DN3 – prepared_dry.

Figure 3-8. Deleting water molecules from the duplicated structure.

By deleting waters in the duplicated entry, the original structure is preserved in case you need it for other applications.

In the Hierarchy, expand Solvents.
Right-click Waters and choose Delete Atoms.
- The water molecules are deleted.
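
The same duplicate-then-delete idea can be sketched for a file on disk. The example below assumes the prepared entry has been exported to PDB format and that waters use the residue names HOH, WAT, or DOD; the function name `write_dry_copy` and the `_dry` suffix are hypothetical choices that mirror the entry name used above.

```python
import tempfile
from pathlib import Path

WATER_RESNAMES = frozenset({"HOH", "WAT", "DOD"})  # assumed water residue names


def write_dry_copy(prepared: Path, suffix: str = "_dry") -> Path:
    """Write a water-free copy next to `prepared` and leave the original untouched."""
    dry_path = prepared.with_name(prepared.stem + suffix + prepared.suffix)
    with prepared.open() as src, dry_path.open("w") as dst:
        for line in src:
            is_water = (
                line.startswith(("ATOM", "HETATM"))
                and line[17:20].strip() in WATER_RESNAMES
            )
            if not is_water:
                dst.write(line)
    return dry_path


if __name__ == "__main__":
    # Stub file holding only the columns this sketch reads.
    workdir = Path(tempfile.mkdtemp())
    prepared = workdir / "prepared.pdb"
    prepared.write_text("ATOM".ljust(17) + "  G\n" + "HETATM".ljust(17) + "HOH\n")
    print(write_dry_copy(prepared).name)
```

Writing to a new file rather than editing in place keeps the hydrated structure available for methods that need explicit waters.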

4. Conclusion and References

In this tutorial, you learned how to prepare nucleic acid structures in Maestro using the Protein Preparation Workflow panel, resolving issues so that the structure can be used for modeling-related tasks. You can now use this prepared structure to learn to visualize RNA-ligand interactions, analyze binding sites, dock ligands into the nucleic acid receptor, optimize ligand-RNA interactions, or set up FEP+ calculations. See the further learning section below or the oligonucleotide modeling learning path for additional resources.

For further learning:

For further reading:

Predicting and Modeling RNA Architecture

Small molecule approaches to targeting RNA

5. Glossary of Terms

Entries - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion

Included - the entry is represented in the Workspace; the circle in the In column is blue

Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data

Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)

Scratch Project - a temporary project in which work is not saved; closing a scratch project removes all current work and begins a new scratch project

Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entries (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries

Working Directory - the location where files are saved

Workspace - the 3D display area in the center of the main window, where molecular structures are displayed