Creating a Coarse-Grained Model for Protein Formulations
Tutorial Created with Software Release: 2025-4
Topics: Pharmaceutical Formulations
Methodology: Coarse-Grained Modeling
Products Used: MS CG, MS Maestro
This tutorial is written for use with a 3-button mouse with a scroll wheel.
Terms such as Workspace are defined in the Glossary of Terms at the end of this tutorial.
Abstract:
In this tutorial, we will use the Coarse-Grained Force Field Builder to automatically fit parameters to the Martini coarse-grained force field for a complex protein solution system.
Tutorial Content
1. Introduction
The Martini force field is a widely used coarse-grained (CG) molecular dynamics (MD) model employed in simulations of complex biological and chemical systems. By representing groups of atoms as single particles, CG models such as Martini reduce the computational cost while maintaining a balance between accuracy and efficiency. Originally developed to study lipid membranes, the force field has since been extended to a wide range of biomolecular systems, including proteins, nucleic acids, and carbohydrates, as well as non-biological materials like polymers and small molecules (see References).
CG force field parameters are derived by simplifying molecular models and parameterizing interactions to reproduce key properties of the system. This involves mapping detailed atomistic models to simplified representations while preserving essential physics. Atoms are grouped into CG particles, with each particle representing a functional group, molecule, or other structural unit (e.g., a CH3 group, a water molecule, or an aromatic ring). The degree of coarse-graining—how many atoms are grouped per particle—depends on the system's complexity and the size and length scale of interest. In Martini, the standard mapping scheme is 4 heavy atoms per particle. However, depending on the molecule's topology, some particles may be mapped to 2 or 3 heavy atoms instead. Martini’s parametrization is based on both experimental data and atomistic simulations, making it a versatile tool for exploring large-scale molecular interactions and processes over extended timescales (see References).
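As a minimal sketch of the mapping idea described above, the Python snippet below places each CG particle at the center of mass of its member atoms. The function name, the 8-atom chain, and the 4-to-1 grouping are hypothetical illustrations; this is not the algorithm used by the Coarse-Grained Mapping panel.

```python
import numpy as np

def map_to_beads(positions, masses, bead_groups):
    """Return one center-of-mass position per CG bead.

    positions   : (n_atoms, 3) array of atom coordinates
    masses      : (n_atoms,) array of atomic masses
    bead_groups : list of lists of atom indices, one inner list per bead
    """
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    beads = []
    for indices in bead_groups:
        m = masses[indices]
        # Mass-weighted average of the member-atom coordinates
        beads.append((positions[indices] * m[:, None]).sum(axis=0) / m.sum())
    return np.array(beads)

# Hypothetical example: an 8-heavy-atom linear chain mapped 4-to-1 into two beads
chain = np.array([[float(i), 0.0, 0.0] for i in range(8)])
carbon_masses = np.full(8, 12.011)
groups = [[0, 1, 2, 3], [4, 5, 6, 7]]
bead_positions = map_to_beads(chain, carbon_masses, groups)
print(bead_positions)
```

Groups of 2 or 3 heavy atoms can be expressed the same way by passing shorter index lists.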
In general, CG force field parameters are derived using either top-down fitting, which targets bulk properties, or bottom-up fitting, which focuses on average structural properties. Schrödinger's automated CG fitting technology employs a bottom-up fitting approach. An atomistic simulation serves as the reference: its bond, angle, and dihedral distributions are used to fit the CG bonded parameters, and its radial distribution functions are used to fit the CG non-bonded parameters. This approach ensures that the CG model captures the relevant structural properties of the system.
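To make the structural target concrete, here is a small sketch of how a radial distribution function can be computed for a single frame of identical particles in a cubic periodic box. The function name and the random test coordinates are assumptions for illustration only; the builder's own analysis is not shown in this tutorial and may differ.

```python
import numpy as np

def radial_distribution(coords, box_length, r_max, n_bins=50):
    """g(r) for identical particles in a cubic periodic box (single frame).

    r_max should not exceed half the box length.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # All pair separation vectors, wrapped with the minimum-image convention
    delta = coords[:, None, :] - coords[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.linalg.norm(delta, axis=-1)[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    # Expected pair counts per shell for an ideal gas at the same density
    shell_volumes = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    n_pairs = n * (n - 1) / 2
    ideal = n_pairs * shell_volumes / box_length ** 3
    r = 0.5 * (edges[1:] + edges[:-1])
    return r, counts / ideal

# Hypothetical test data: uniformly random points should give g(r) close to 1
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(500, 3))
r, g = radial_distribution(coords, box_length=10.0, r_max=5.0, n_bins=25)
print(g[5:].mean())
```

In a bottom-up fit, a curve like this from the all-atom reference is the target that the CG non-bonded parameters are adjusted to reproduce.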
In this tutorial, we will take the all-atom equilibrated protein solution system from the Simulating Complex Protein Solutions tutorial and map it to a CG system using the Coarse-Grained Mapping and Coarse-Grained Force Field Builder panels with a Martini 2 force field. This tool allows us to parametrize the CG force field with system-specific details, following an iterative procedure to optimize bonded and non-bonded interactions. The interactions are then visualized in the Coarse-Grained Force Field Builder Viewer panel.
In addition to the panel help documentation, please visit the Coarse-Grained Modeling in the Materials Science Suite page for an overview. For this tutorial, it is recommended to read Potentials and Simulation Types for Coarse-Grained Modeling, Coarse-Grained Modeling with the Martini Force Field, Selecting Martini Parameters and Site Types for Martini.
2. Creating Projects and Importing Structures
At the start of the session, change the file path to your chosen Working Directory in MS Maestro to make file navigation easier. Each session in MS Maestro begins with a default Scratch Project, which is not saved. An MS Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is saved, the project is automatically saved each time a change is made.
Structures can be built in MS Maestro or can be imported using File > Import Structures (or drag-and-dropped), and are added to the Entry List and Project Table. The Entry List is located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.
- Double click the Maestro or Materials Science icon to start Maestro or MS Maestro
- No icon? See Starting Maestro
- This tutorial uses MS Maestro, but this workflow can be performed in Maestro or MS Maestro. Use whichever interface you are comfortable with or typically use for your projects.
- Go to File > Change Working Directory
- Find your directory, and click Choose
- Pre-generated files are included for running jobs or examining output. Download the zip file here: schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/automartini_protein.zip
- After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial
- Go to File > Save Project As
- Change the File name to automartini_protein_tutorial, click Save
- The project is now named automartini_protein_tutorial.prj
- Go to File > Import Structures
- Navigate to where you downloaded the provided tutorial files (presumably in your working directory), choose multistage_simulation_trp_cage_excipients > multistage_simulation_trp_cage_excipients-out.cms
- Click Open
- A new entry is added to the Entry List: disordered_system_trp_cage_excipients_all_components_amorphous system
- The imported structure is the equilibrated MD system from the Simulating Complex Protein Solutions tutorial. It contains the 2JOF protein, Tween 20, sucrose, water, chloride anions, and sodium cations
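If you prefer to unpack the tutorial files from a script rather than a file manager, the sketch below extracts the downloaded archive and lists the -out.cms structure files it contains. The helper name is hypothetical, and the snippet assumes automartini_protein.zip has already been downloaded into the current working directory; it does nothing if the file is absent.

```python
import zipfile
from pathlib import Path

def extract_and_find(zip_path, dest_dir, suffix="-out.cms"):
    """Unzip the archive into dest_dir and return extracted files ending in suffix."""
    dest_dir = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    return sorted(dest_dir / name for name in names if name.endswith(suffix))

# Assumed location: the zip file sits in the current working directory
archive_path = Path("automartini_protein.zip")
if archive_path.exists():
    for structure in extract_and_find(archive_path, Path.cwd()):
        print(structure)
```

The printed paths are the files that can then be opened with File > Import Structures.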
3. Coarse-Grained Model Creation for a Complex Protein Solution
In this section, we will use the Coarse-Grained Mapping panel and Coarse-Grained Force Field Builder to generate Martini CG parameters for our 2JOF protein system.
- In the Entry List, select the only entry
- Go to Tasks > Materials > Classical Mechanics > Coarse Grain Models > Coarse-Grained Mapping
- The Coarse-Grained Mapping panel opens
- Check Use automated CG mapping
- Select Martini
- Change the Job name to coarse_grained_mapping_trp_cage_excipients
- Adjust the job settings as needed. This job requires a CPU host. The job can be completed in about 5 minutes.
- If you receive a warning that says “Nucleic acids are not fully supported for automated Martini mapping, we recommend contacting Schrödinger Support,” click OK
The panel is populated with the mapped particles and the workspace is updated when the job completes.
- Go to the Restraints tab
- Check Unique Molecules
- Checking unique molecules allows one to examine how individual types of molecules are mapped as opposed to seeing the mapping of the whole system
- Select all the rows and view the restraints in the Workspace
- Restraints can be added or removed and then saved for future use
- Go back to the Particle Types tab
- Check Systems
- Click Export mapped structures to CGFF Builder
- When either Systems or Unique Molecules is selected, the atomistic and CG versions can be saved to the Project Table by clicking Export mapped structures to Project table. From there they can be transferred manually to the CGFF Builder, and the molecules can also be used to build systems (e.g., with the disordered system builder).
- When Systems is selected, the mapping information can be transferred directly to the CGFF Builder panel, which then opens automatically.
The Coarse-Grained Force Field Builder panel automatically opens and the disordered_system_trp_cage_excipients_all_components entry is automatically loaded. This entry contains both the CG system and its associated all-atom trajectory.
- Select Martini for the Coarse-graining type
- Check Use force field and select Martini_solution from the dropdown menu
- Go to the Mapped Atoms tab
This tab is autopopulated with the information transferred from the Coarse-Grained Mapping panel. Each particle contains a complex SMARTS string.
- Go to the FF Parameters tab
- Click Populate using structures
- The Particle subtab contains the list of CG bead names and the Martini bead type
- This can take a minute or two
- Go to the Nonbonded subtab
- The non-bonded interaction pairs are listed in the table
- The interaction strength (ε) is a fitting parameter while the particle size will be kept constant by default.
- Click Import from Force Field
- Encrypted parameter values are listed for fixed protein-protein interactions; see the help documentation for more information
- Go to the CG Simulation tab
- For each iteration a 20 ns NPT CG MD simulation will be performed for the fitting procedure. We will keep the default settings.
- Go to the Fitting tab
- Analyze last 40% of simulation
- For each iteration the last 40% of the trajectory will be used for the fitting procedure
- Change the Job name to cgff_builder_trp_cage_excipients
- Adjust the job settings as needed. This job requires a GPU host. The job can be completed in about 10 hours.
- If you would like to run the job yourself, click Run. Otherwise, go to File > Import Structures, navigate to the provided tutorial files and Open Section_03 > cgff_builder_trp_cage_excipients > cgff_builder_trp_cage_excipients-out.cms
- Close the Coarse-Grained Force Field Builder panel
Once the job is completed or after importing, select and include cgff_builder_trp_cage_excipients-out.cms
4. Review of the Automated Martini Coarse-Grained Results
In this section, we will review the output and the fitting quality of the results of the automated fitting procedure from the previous step.
- Select and include the cgff_builder_trp_cage_excipients-out entry in the Entry List
- Use the Workflow Action Menu (WAM) button to open the Coarse-Grained Force Field Results panel
- Alternatively go to Tasks > Materials > Classical Mechanics > Coarse Grain Models > Coarse-Grained Force Field Builder Results
- The Coarse-Grained Force Field Builder Viewer panel opens
The Builder Data tab summarizes the input parameters used in the simulation: the Particle name, the SMARTS string for each bead type, and the Charge and Mass of the CG particles. Note that the WF type particle appears as one of the CG particle types in the system.
- Go to the Convergence tab
This plot shows the force field parameters fitted using the Coarse-Grained Force Field Builder panel, plotted against the number of fitting iterations to illustrate how they evolve during the fitting process. By default, the non-bonded interactions are displayed. The Epsilon values show a clear trend over the first 15-20 iterations and then become more stable (the graph plateaus) as more iterations accumulate. Using a longer CG simulation time (60 ns rather than 20 ns) might help get more stable values.
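One simple way to judge a plateau numerically, sketched here under assumed thresholds, is to check whether the last few iterations stay within a small relative band around their mean. The window size, the 5% tolerance, and the synthetic epsilon history are all hypothetical; the panel itself only provides the plot.

```python
import numpy as np

def is_plateaued(values, window=5, rel_tol=0.05):
    """True if the last `window` values stay within rel_tol of their mean."""
    tail = np.asarray(values[-window:], dtype=float)
    mean = tail.mean()
    return bool(np.all(np.abs(tail - mean) <= rel_tol * abs(mean)))

# Hypothetical epsilon history: decays toward a stable value over 30 iterations
iterations = np.arange(30)
epsilon_history = 2.0 + 1.5 * np.exp(-iterations / 5.0)

print(is_plateaued(epsilon_history[:8]))  # early iterations, still trending
print(is_plateaued(epsilon_history))      # full run
```

A parameter that fails this kind of check after all iterations is a candidate for more iterations or longer CG simulations per iteration.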
The sigma values were not included in this fitting process, so the initial sigma values are recommended.
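To illustrate what fitting Epsilon at fixed sigma means, the sketch below evaluates a standard 12-6 Lennard-Jones potential, which is the non-bonded form commonly associated with Martini. The epsilon values are hypothetical; sigma = 0.47 nm is the regular-bead size in Martini 2. The exact functional form and cutoff treatment used by the builder are not stated in this tutorial.

```python
import numpy as np

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy; epsilon is the well depth, sigma the zero crossing."""
    sr6 = (sigma / np.asarray(r, dtype=float)) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

sigma = 0.47                   # nm, kept fixed during the fit
r_min = 2 ** (1 / 6) * sigma   # distance of the energy minimum

# Hypothetical epsilon values (kJ/mol): lowering epsilon weakens the attraction
for eps in (5.0, 3.5):
    print(eps, lennard_jones(r_min, eps, sigma))
```

Because sigma is unchanged, the position of the minimum stays the same and only the well depth, which equals epsilon, changes between iterations.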
Later in this section, we will look at ASP_S1 (ASP sidechain) with E2 (PEG group from Tween 20). There are two ASP groups and 80 E2 groups in the system, so there may be moderate sampling for this interaction.
- Select ASP_S1,E2 for the Type
- E2 is a moderately small site with a short bond-length to adjacent sites.
- In Martini 2, the standard rules tend to overestimate the strength of non-bonded interactions involving particles with short bonds.
- We see that the Epsilon drops from the initial value during the iterations resulting in weaker interactions.
- Change the Forcefield type to Bonds
- In the final iterations, the parameters should consistently oscillate around a stable value
- Change the Forcefield type to Angles
- Select E2(C2),(C3,1,1)C2(O4),(C1)E2 for the Type
- Here, we are focusing on an E2-C2-E2 angle potential (specifically E2(C2),(C3,1,1)C2(O4),(C1)E2 ) from the Tween 20 molecule
- There are four Tween 20 molecules and each has only one of this particular type of E2-C2-E2 angle
- This provides relatively little averaging during sampling leading to larger noise
- Go to the Fit Quality tab
The R-squared values of the fitted parameters, compared to the all-atom reference, are displayed at specific fitting iterations. You can explore the fitting score for other interaction types from the dropdown options.
- Change the Number of profiles to display to 3
- Go to the Plot subtab
Note: R-squared parameters serve as a guide for users to determine whether further inspection of the plots is warranted. While high R-squared values generally indicate good fits, low R-squared values can also represent acceptable fits in some cases depending on the total fit. Additionally, small molecules grouped together to form a particle are often difficult to fit accurately, making it preferable to use a fixed standard value in such instances.
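For readers who want to reproduce this kind of score on exported curves, the sketch below computes a coefficient of determination between a reference distribution and a CG distribution sampled on the same grid. This is the textbook definition and is an assumption here; the panel's exact formula is not given in this tutorial, and the Gaussian curves are synthetic.

```python
import numpy as np

def r_squared(reference, model):
    """Coefficient of determination of `model` against `reference`."""
    reference = np.asarray(reference, dtype=float)
    model = np.asarray(model, dtype=float)
    ss_res = np.sum((reference - model) ** 2)             # residual sum of squares
    ss_tot = np.sum((reference - reference.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Synthetic curves: one CG profile close to the reference, one clearly shifted
x = np.linspace(0.0, 1.0, 101)
ref = np.exp(-((x - 0.5) / 0.1) ** 2)
good = np.exp(-((x - 0.51) / 0.1) ** 2)
poor = np.exp(-((x - 0.7) / 0.1) ** 2)

print(r_squared(ref, good), r_squared(ref, poor))
```

With this definition a badly shifted peak can give a negative score, which is one reason to inspect the plots rather than rely on the number alone.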
This tab visualizes how the fitting parameters in the CG model compare against the all-atom reference.
- Select ASP_S1,E2 for the Type
- Select g(r)
- This specifies the pair distribution function for the parameter detailed in the Type option menu
- Set the Number of profiles to display to 5
- Initially, there were excessive contacts between ASP_S1 and E2.
- The reference g(r) is small and noisy, suggesting this interaction is uncommon, potentially unimportant, and undersampled in the simulation.
- Longer CG simulations could reduce noise in the fitting process, but this may not be essential due to the small reference g(r), indicating this interaction might not be critical.
- Change the Forcefield type to Angles
- The distribution of the bond angles for the reference and the CG simulations are plotted
- Select E2(C2),(C3,1,1)C2(O4),(C1)E2 for the Type
- Set the Number of profiles to display to 3
- The plot displays a bimodal reference distribution, which is colored red.
- The CG model, which employs a harmonic angle potential, typically exhibits a single peak.
- The fitting process should produce a coarse-grained (CG) distribution whose average and standard deviation closely match those of the reference distribution.
- Each of the four Tween 20 molecules contains only one instance of this specific E2-C2-E2 angle.
- This results in minimal averaging during sampling, which leads to increased noise, as evident in the distributions.
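The point about matching the average and standard deviation of a bimodal reference can be sketched as follows. Assuming a potential that is harmonic in the angle itself, U = 0.5*k*(theta - theta0)^2, the equilibrium distribution is approximately Gaussian with variance kB*T/k, so theta0 and k can be estimated from the sampled mean and variance. The temperature, the synthetic two-peak data, and the neglect of the sin(theta) Jacobian are all simplifying assumptions; this is not presented as the builder's fitting algorithm.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def harmonic_angle_estimate(angles_deg, temperature=300.0):
    """Estimate (theta0 in degrees, k in kJ/mol/rad^2) from sampled angles."""
    angles = np.radians(np.asarray(angles_deg, dtype=float))
    theta0 = angles.mean()
    variance = angles.var()
    # Gaussian distribution of a harmonic potential: variance = kB*T / k
    return np.degrees(theta0), KB * temperature / variance

# Synthetic bimodal reference: two equally populated peaks at 100 and 140 degrees
rng = np.random.default_rng(1)
bimodal = np.concatenate([rng.normal(100.0, 8.0, 5000),
                          rng.normal(140.0, 8.0, 5000)])
theta0, k = harmonic_angle_estimate(bimodal)
print(theta0, k)
```

The resulting single-peak model sits between the two reference peaks and is broad enough to cover both, which is the behavior to expect in the plot when a harmonic term is fitted to a bimodal reference.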
Feel free to explore other options in the panel for further analysis. You can save the CG forcefield data to the Schrödinger directory. This can then be used to run further simulations with different compositions and system sizes of the same components.
The .json file, located in the output files (e.g., cgff_builder_trp_cage_excipients-out_cgff.json), provides access to the Martini particle types and force field parameters. This file contains the parameters and all-atom to CG mapping patterns, which are useful for future training runs. To use this force field with other systems via the Coarse Grained Assign Force Field panel, either copy the file to ~/.schrodinger/matsci_templates/coarse_grain_force_field_parameters/ or utilize the "Save Force Field Data..." option within the Coarse Grained Force Field Builder panel. Once saved with a new name, the force field will appear as an option in the "import force field" dropdown menu of the Coarse Grained Force Field Assignment panel. Additionally, the cgff_builder_trp_cage_excipients_cg.maegz file offers a CG version of the system. This file can be edited in MS Maestro to reproduce individual CG molecules, which can then be used to construct new CG systems.
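The manual copy step can also be scripted. The sketch below copies a fitted force field file into the template directory named above under a new file name, after checking that it parses as JSON. The helper name and the example force field name are hypothetical, and it is an assumption that the file name is what distinguishes entries in the dropdown; the snippet does nothing if the output file is not in the current directory.

```python
import json
import shutil
from pathlib import Path

# Directory named in this tutorial for user-supplied CG force field files
TEMPLATE_DIR = (Path.home() / ".schrodinger" / "matsci_templates"
                / "coarse_grain_force_field_parameters")

def install_force_field(json_path, new_name, template_dir=TEMPLATE_DIR):
    """Copy a fitted force field file into the template directory under a new name."""
    json_path = Path(json_path)
    with json_path.open() as handle:
        json.load(handle)  # fail early if the file is not valid JSON
    template_dir = Path(template_dir)
    template_dir.mkdir(parents=True, exist_ok=True)
    target = template_dir / f"{new_name}.json"
    if target.exists():
        raise FileExistsError(f"{target} already exists; choose another name")
    shutil.copy2(json_path, target)
    return target

source = Path("cgff_builder_trp_cage_excipients-out_cgff.json")
if source.exists():
    # "trp_cage_excipients_martini" is a hypothetical name chosen for this example
    print(install_force_field(source, "trp_cage_excipients_martini"))
```

Refusing to overwrite an existing file keeps an earlier fit from being replaced by accident when re-fitting with seeded parameters.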
If the current force field is unsatisfactory, consider the following potential solutions:
- Review the all-atom reference system trajectory: Ensure the trajectory provides sufficient and relevant sampling data.
- Increase CG fitting iterations: Use a greater number of iterations in the coarse-grained fitting process.
- Extend simulation lengths: Run longer simulations during each iteration.
- Re-fit using previous output: Seed the fitting process with the cgff_builder_trp_cage_excipients-out_cgff.json file from the current process and perform another fit.
5. Conclusion and References
We explored using the Coarse-Grained Mapping and Coarse-Grained Force Field Builder panels to automate force field fitting for Martini CG parameters for a complex protein solution.
For further learning:
For introductory content, focused on navigating the Schrödinger Materials Science interface, an Introduction to Materials Science Maestro tutorial is available. Please visit the materials science training website for access to 100+ tutorials. For scientific inquiries or technical troubleshooting, submit a ticket to our Technical Support Scientists at help@schrodinger.com.
For self-paced, asynchronous, online courses in Materials Science modeling, including access to Schrödinger software, please visit the Schrödinger Online Learning portal on our website.
For some related practice, proceed to explore other relevant tutorials:
- Simulating Complex Protein Solutions
For more practice on coarse-graining and related workflows:
- Ibuprofen Cyclodextrin Inclusion Complexes with the Martini Coarse-Grained Force Field
- Building a Coarse-Grained Surfactant Model with Martini Force Field
- Building a Coarse-Grained Polymer Model using Dissipative Particle Dynamics
- Automated Dissipative Particle Dynamics (DPD) Parametrization
- Evaporation
For further reading:
- The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations, DOI:10.1021/jp071097f
- MARTINI Coarse-Grained Models of Polyethylene and Polypropylene, DOI:10.1021/acs.jpcb.5b03611
- Quantitative Predictions of the Interfacial Tensions of Liquid−Liquid Interfaces through Atomistic and Coarse Grained Models, DOI:10.1021/ct500053c
6. Glossary of Terms
Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion
Included - the entry is represented in the Workspace, the circle in the In column is blue
Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data
Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)
Scratch Project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project
Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries
Working Directory - the location where files are saved
Workspace - the 3D display area in the center of the main window, where molecular structures are displayed