Machine Learning Property Prediction

Tutorial Created with Software Release: 2025-3

Topics: Catalysis & Reactivity, Consumer Packaged Goods, Energy Capture & Storage, Informatics and Team Collaboration, Organic Electronics, Polymeric Materials, Thin Film Processing

Methodology: Machine Learning

Products Used: MS Informatics, MS Maestro

Tutorial files

65 KB

This tutorial is written for use with a 3-button mouse with a scroll wheel.

Words found in the Glossary of Terms are shown like this: Workspace (the 3D display area in the center of the main window, where molecular structures are displayed)

Abstract:

In this tutorial, we will learn to use pre-built machine learning models to predict polymer glass transition temperature, polymer dielectric constant, polymer dissipation loss, density of molecular liquids, and volatility of organic and organometallic molecules.

Tutorial Content

Introduction to Machine Learning for Property Predictions

Creating Projects and Importing Structures

Predicting Polymer Properties

Predicting Volatility of Organic and Organometallic Molecules

Predicting Density of Molecular Liquids

Conclusion and References

Glossary of Terms

1. Introduction to Machine Learning for Property Predictions

Machine learning (ML) methods can accelerate the design of new materials by rapidly and accurately predicting material properties at a low computational cost. However, training ML models to predict a property of interest often requires extensive data curation and model development, which hinders the usage of ML models for designing new materials. A possible workaround is to use accurate pre-built ML models developed by experts and leverage them to rapidly predict properties of new materials.

In this tutorial, we will use pre-built ML models to predict the properties of materials such as polymers, organic compounds, and organometallic complexes. Figure 1 summarizes this tutorial. First, we will show how to use pre-built ML models to generate predictions for polymers, which are generally challenging to model because the number of repeating monomers is arbitrary and affects the chain length, molecular mass, and large-scale properties. As an alternative to modeling the entire polymer, we focus on predicting polymer properties from the monomer structure alone, which can accelerate the design of new polymers. Next, we will show how to use the pre-built ML models to predict the volatility of an organic or inorganic compound (including organometallic complexes), which describes the ease with which a molecule evaporates or sublimes from the liquid or solid phase into the vapor phase. Since evaporation and sublimation cannot be simulated explicitly at the atomic scale (even with today's computing power), the statistical ML approach is very valuable. Then, we will show how to use pre-built ML models to predict the density of small organic molecules in the liquid phase, a low-cost alternative to obtaining density from molecular dynamics simulations or experiments. Hence, pre-built ML models can provide property estimates of bulk polymer, organic, and organometallic systems without having to perform experiments, in some cases predicting properties that cannot currently be simulated with complex ab initio or molecular dynamics calculations (an example is the volatility of a compound).

Figure 1. Overview of using pre-built machine learning models to predict properties for polymeric, organic, and organometallic molecules using the Machine Learning Property Prediction panel.

Schrödinger’s Machine Learning Property Prediction panel uses pre-built ML models to predict polymer properties: glass transition temperature, dielectric constant, and dissipation loss; density of molecular liquids; as well as boiling/sublimation point and vapor pressure for organic and organometallic compounds. This tutorial provides step-by-step instructions to use these pre-built ML models to calculate material properties using the Materials Science Maestro interface. The available models are summarized in the table below:

Machine Learning Model	Property Label	Units	Description	Structure File	Additional User Input
Aqueous Solubility	Aqueous Solubility (log10(mol/L))	log10(mol/L)	Aqueous solubility of small molecules	-	-
Density of molecular liquids	Density (g/cm³)	g/cm³	Density of small organic molecules in the liquid phase	molecular_liquids_density_examples.mae	-
Melting Point	Melting Point (K)	K	Melting point of small organic molecules	-	-
Non-aqueous solubility	Non-aqueous Solubility (log10(mol/L))	log10(mol/L)	Solubility of small molecules in non-aqueous solutions	-	Temperature and Solvent
Optoelectronic properties of molecules: see Table 2 below.
Oxidation potential	Oxidation Potential (V)	V	Oxidation potential of molecules in acetonitrile at 298 K	-	-
Polymer dielectric constant	Dk	-	Dielectric constant of polymers	polymer_Dk_examples.mae	Frequency
Polymer dissipation loss	Df	-	Polymer dissipation factor (tan delta)	polymer_Df_examples.mae	Frequency
Polymer glass transition temperature	Tg (K)	K	Glass transition temperature of polymers	polymer_Tg_examples.mae	-
Reduction potential	Reduction Potential (V)	V	Reduction potential of molecules in acetonitrile at 298 K	-	-
Singlet-triplet energy gap	∆E(S1-T1) (eV)	eV	Singlet-triplet energy gap of organic light emitting diodes	-	-
Viscosity of organic liquids	Viscosity (cP)	cP	Viscosity of organic molecules in the liquid phase	-	Temperature
Volatility of organic molecules (boiling point)	Temperature (K)	K	Boiling/sublimation point of organic molecules	boiling_point_organic_molecules_examples.mae	Pressure
Volatility of organic molecules (vapor pressure)	Vapor Pressure (Torr)	Torr	Vapor pressure of organic molecules	vapor_pressure_organic_molecules_examples.mae	Temperature
Volatility of organometallic molecules (boiling point)	Temperature (K)	K	Boiling/sublimation point of organometallic or inorganic molecules	boiling_point_organometallic_molecules_examples.mae	Pressure

Table 1. Table of models in the Machine Learning Property Prediction panel, which contains the machine learning model name, property label, units, a brief description, example structure file used for the model, and any extra user input necessary for the model to run predictions.

Machine Learning Model	Property Label	Units	Description
Absorption peak position	Absorption Lmax nm (λ_abs,max)	nm	Visible wavelengths for which the absorption spectrum has the highest intensity
Absorption bandwidth	Absorption Bandwidth cm^-1 (σ_abs,FWHM)	cm^-1	Bandwidth of the absorption peak position in full width at half maximum
Electron reorganization energy	Electron Reorganization Energy (eV)	eV	Energy required to relax the geometries of two molecules of the same species undergoing transfer of a negative charge
Emission peak position	Emission Emax nm (λ_emi,max)	nm	Wavelength of the fluorescence maximum
Emission bandwidth	Emission Bandwidth cm^-1 (σ_emi,FWHM)	cm^-1	Bandwidth of the emission peak position in full width at half maximum
Emission lifetime	Emission Lifetime log(ns) (τ)	log(ns)	Fluorescence or excited state lifetime
Extinction coefficient	Extinction Coefficient mol^-1dm³cm^-1 (ε_max)	log(mol^-1dm³cm^-1)	Molar extinction coefficient
Hole reorganization energy	Hole Reorganization Energy (eV)	eV	Energy required to relax the geometries of two molecules of the same species undergoing transfer of a positive charge
Photoluminescence quantum yield	Photoluminescence Quantum Yield (Φ_QY)	-	Ratio of photons emitted to photons absorbed for organic molecules
Scaled HOMO	Scaled HOMO (eV)	eV	Energy level for the highest occupied molecular orbital (HOMO), derived from oxidation calculation
Scaled LUMO	Scaled LUMO (eV)	eV	Energy level for the lowest unoccupied molecular orbital (LUMO), derived from reduction calculation
Singlet-Triplet Energy Gap	Singlet-Triplet Energy Gap (eV)	eV	Energy difference between the lowest triplet state (T1) and the first excited singlet state (S1) – corresponds to vertical transition at the T1 geometry.
Triplet Energy (E(S0T1))	Triplet Energy (eV)	eV	Energy difference between the lowest triplet state (T1) and the ground singlet state (S0)
Triplet Reorganization Energy	Triplet Reorganization Energy (eV)	eV	Energy required to relax the geometries of two molecules of the same species undergoing transitions between the lowest triplet state and the ground singlet state.

Table 2. Table of optoelectronic property models in the Machine Learning Property Prediction panel. The columns are the same as in Table 1. For optoelectronic properties, predictions can be made for organic molecules either in solvent or as a thin film.

As an alternative to the pre-built ML models, the polymer glass transition temperature, dielectric constant, and dissipation loss can be simulated using the Thermophysical Properties Calculations and Amorphous Dielectric Properties panels. The density of small molecules can be simulated using the Disordered System Builder and Molecular Dynamics Multistage Workflows panels.

Several models are included in the panel but are not covered in this tutorial: those with no entry in the Structure File column of Table 1, and all of the models in Table 2. Please see the help documentation for more information on these models.

For background on the Machine Learning Property Prediction panel, which is described in this tutorial, see the help documentation.

For more information about building and applying machine learning models in Materials Science Maestro, see the Machine Learning for Materials Science tutorial.

2. Creating Projects and Importing Structures

At the start of the session, change the file path to your chosen Working Directory in MS Maestro to make file navigation easier. Each session in MS Maestro begins with a default Scratch Project, which is not saved. An MS Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is saved, the project is automatically saved each time a change is made.

Structures can be built in MS Maestro or can be imported using File > Import Structures (or dragged and dropped), and are added to the Entry List and Project Table. The Entry List is located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.

Double-click the Materials Science icon
- (No icon? See Starting Maestro)

Figure 2-1. Change Working Directory option.

Go to File > Change Working Directory
Find your directory, and click Choose
Pre-generated files are included for running jobs or examining output. Download the zip file here: schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/ml_property_prediction.zip
After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial

Figure 2-2. Save Project panel.

Go to File > Save Project As
Change the File name to ml_property_prediction_tutorial, click Save
- The project is now named ml_property_prediction_tutorial.prj

For this tutorial, all of the input files necessary to run the Machine Learning Property Prediction panel are provided and will be imported in the next step. Each file listed in Table 1 contains 5 example structures that were used for ML model development.

If you are interested in building organic compounds, monomers, polymers, or organometallic complexes on your own, please visit the Introduction to Maestro for Materials Science, Building, Equilibrating and Analyzing Amorphous Polymers, or Organometallic Complexes tutorials, respectively.

Figure 2-3. Importing the structure files.

Go to File > Import Structures
Navigate to where you downloaded the tutorial files (presumably your working directory) and select the seven structure files available, each one ending in _examples.mae. Click Open
- Seven new entry groups are added to the entry list. Each group corresponds to a machine learning model as described in Table 1

3. Predicting Polymer Properties

In this section, we will use the Machine Learning Property Prediction panel to predict polymer properties for a sampling of monomers.

Figure 3-1. Opening the Machine Learning Property Prediction panel.

In the entry list, select the polymer_Tg group and include the first entry, Polyethylene
Go to Tasks > Materials > Informatics > Machine Learning Property Prediction
- The Machine Learning Property Prediction panel opens

Let’s learn about the settings and capabilities of the Machine Learning Property Prediction panel a bit more:

Models can be selected using the Machine learning model drop-down menu and further parameters specified in the Predict section.
The Model Performance section provides statistics about the selected machine learning model, which indicate the appropriate chemical space for the model and the typical error to expect in the results. For all machine learning models, the dataset was split such that 90% of the data (i.e. the training set) was used to train the model and 10% of the data (i.e. the testing set) was left out to evaluate whether the model can predict unseen data. The statistics are summarized below:
- Training data size: Total size of the data used for ML model training. For example, polymer Tg has 582 rows of monomer SMILES with experimental Tg values.
- Training (90%) R²: Coefficient of determination for the training set, which consists of 90% of the data. A perfect model would have R² of 1.
- Test (10%) R²: Coefficient of determination for the testing set, which consists of 10% of the data.
- RMSE (Training): Root-mean-square error (RMSE) of the training set. An ideal model would have RMSE values close to zero.
- RMSE (Test): RMSE of the test set.
- Chemical Space: Unique elements within the machine learning model. Please consider only using structures with the elements in the chemical space. Using elements outside the chemical space may lead to erroneous predictions.
A prediction can be made for a single included entry or multiple selected entries. Note that structures do not require hydrogens to be specified because they are featurized using implicit hydrogens.
- If Interactive mode is selected, predictions are made in the Results table within the panel, and up to 10 entries can be selected. This is an interactive feature and does not require running a job.
- If Batch mode is selected, a new entry group is incorporated after the job is completed, and the predicted properties can be viewed from the Project Table.
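
To make the Model Performance statistics more concrete, here is a minimal sketch of a 90/10 split and of how R² and RMSE are computed from observed and predicted values. The function names (split_90_10, rmse, r_squared) and the toy numbers are assumptions made for illustration; this is not the procedure used to build or evaluate the pre-built models.

```python
import math
import random


def split_90_10(rows, seed=0):
    """Shuffle rows and return (training, test) lists in a 90/10 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.9 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root-mean-square error; close to zero for an ideal model."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination; 1 for a perfect model."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy numbers for illustration only (not tutorial data).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

train, test = split_90_10(range(100))
print(len(train), len(test))
print(round(rmse(observed, predicted), 3), round(r_squared(observed, predicted), 3))
```

A large gap between the training and test values of either statistic is a common sign of overfitting, which is why the panel reports both.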

Visit the help documentation for a complete summary of the panel.

Here we use the Machine Learning Property Prediction panel to predict the Tg, Dk, and Df of polymers.

Figure 3-2. Representing polymers for the Machine Learning Property Prediction panel using polyethylene as an example.

Polymers are very large molecules consisting of n-repeats of a monomer unit. For ML models, we represent polymers, such as polyethylene, as monomers with connection points denoted as two [At] atoms. Hence, every polymer input to the Machine Learning Property Prediction panel should be represented as a monomer with two [At] reference atoms marking its end points.

Alternatively, the Mark Monomer Head and Tail panel can be used to prepare monomers to be used for polymer property predictions using the Machine Learning Property Prediction panel.
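
As a quick sanity check before importing your own monomers, the sketch below counts [At] connection points in a SMILES string and confirms there are exactly two. It assumes the monomer is available as a SMILES string; the helper names and the polyethylene SMILES "[At]CC[At]" are illustrative assumptions, not output from the panel.

```python
import re

# Connection points are written as bracketed astatine atoms, as described above.
CONNECTION_POINT = re.compile(r"\[At\]")


def count_connection_points(smiles):
    """Return the number of [At] connection points in a SMILES string."""
    return len(CONNECTION_POINT.findall(smiles))


def is_valid_monomer(smiles):
    """A monomer for polymer predictions needs exactly two [At] end points."""
    return count_connection_points(smiles) == 2


# Assumed SMILES for a polyethylene repeat unit capped by two [At] atoms.
polyethylene = "[At]CC[At]"
print(is_valid_monomer(polyethylene))
print(is_valid_monomer("CC"))  # plain ethane has no connection points
```

This is only a text-level check; it does not verify that the structure itself is chemically sensible.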

Figure 3-3. Downloading the ML model.

Now, let’s make our first prediction, polymer Tg:

Ensure the radio-button for Interactive mode is selected
From the Machine learning model drop-down menu, select Polymer glass transition temperature
Click Download Model
- When you click this button, an ensemble model for the selected ML model with prediction uncertainty is downloaded. Due to the large file size of the model, it is not available with your Schrödinger software installation and must be downloaded using this button. The download may take up to a few minutes and is complete when Model available is displayed in place of the button. For more information refer to the help documentation.

Figure 3-4. Predicting the Polymer glass transition temperature.

Click Predict
- The Results from selected entries section is populated with the predicted Tg for the selected structures along with their uncertainty
- For example, polyethylene has a predicted Tg of 220.3 K with a prediction uncertainty of ~14 K

Figure 3-5. Running the job.

We can also run the prediction as a job on the selected entries. To do this, ensure Use structures from shows Project Table (5 selected entries)
Select the radio-button for Batch mode
Change the Job name to ml_prop_prediction_Tg
Adjust the job settings as needed
- This job can be completed in a few minutes on a CPU host
Click Run
Close the Machine Learning Property Prediction panel

Figure 3-6. The entry list and workspace after running the Tg property prediction calculation.

When the job finishes, a new entry group entitled ml_prop_prediction_Tg-out (5) is incorporated and added to the entry list. The group contains all of the same structures as the original group, but now each entry is also associated with the predicted Tg.

Figure 3-7. Viewing the experimental and predicted Tg values in the Project Table.

We can see the Tg values by opening the Project Table in the next step

Open the Project Table

The properties may not appear by default. To add the relevant properties as columns in the Project Table:

Go to the Property Tree, expand All > Materials Science > Primary > Polymer Tg (K), All > Materials Science > Primary > Polymer Tg Uncertainty, and All > Canvas > Secondary > Exp Tg (K)

The Exp Tg (K) column corresponds to experimental values, and the Polymer Tg (K) and Polymer Tg Uncertainty (K) columns are populated by the Machine Learning Property Prediction job. We can see that for our sample of 5 structures, the predicted Tg is in close agreement with the experimental values (stored in the Exp Tg (K) column).

Note: You can export directly from the Project Table to spreadsheet form if needed by clicking Data > Export > Spreadsheet
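
Once the data is exported, the agreement between experiment and prediction can be quantified outside of MS Maestro. The sketch below computes a mean absolute error from CSV text. It assumes the export is a CSV whose headers match the Project Table column names Exp Tg (K) and Polymer Tg (K); the Title column, the row names, and all numeric values are invented placeholders, not results from this tutorial.

```python
import csv
import io

# Placeholder rows standing in for an exported spreadsheet; values are invented.
EXPORTED = """Title,Exp Tg (K),Polymer Tg (K)
polymer_a,300.0,310.0
polymer_b,400.0,395.0
polymer_c,250.0,265.0
"""


def mean_absolute_error(csv_text, exp_col, pred_col):
    """Average absolute difference between two numeric columns of a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    errors = [abs(float(row[exp_col]) - float(row[pred_col])) for row in reader]
    return sum(errors) / len(errors)


mae = mean_absolute_error(EXPORTED, "Exp Tg (K)", "Polymer Tg (K)")
print(f"MAE = {mae:.1f} K")
```

To use a real export, read the file contents into a string and pass the actual column headers found in that file.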

Close the Project Table before proceeding

Figure 3-8. The entry list and workspace after running the Dk and Df property prediction calculations.

Repeat all of the steps in this section to predict Dk and Df. Ensure you make the following changes in your procedure:

Note that predicting Dk and Df requires user input of frequency, which is the frequency used for dielectric experiments, typically in the range of 50 Hz to 10 MHz. For this example, we will use the default value of 100 Hz for both calculations.
Make sure that you download the corresponding ML model from the dropdown options.
For predicting Dk, select the polymer_Dk group and the Polymer dielectric constant Machine learning model. Change the Job name to ml_prop_prediction_Dk
For predicting Df, select the polymer_Df group and the Polymer dissipation loss Machine learning model. Change the Job name to ml_prop_prediction_Df

Figure 3-9. Viewing the experimental and predicted Dk and Df values in the Project Table.

We can see the experimental and predicted values of both Dk and Df by opening the Project Table

Once again, we see very good agreement between the experimental and predicted values.

Close the Project Table before proceeding

4. Predicting Volatility of Organic and Organometallic Molecules

In this section, we will use the Machine Learning Property Prediction panel to predict volatility (boiling/sublimation point and vapor pressure) for organic and inorganic molecules, including organometallic complexes.

Figure 4-1. Representations of organic molecules and organometallic complexes.

Organic molecules are any structures primarily made of carbon, oxygen, hydrogen, and nitrogen. An example of an organic molecule is methane or CH₄.

Organometallic or metal-organic complexes consist of a metal surrounded by organic-based ligands. An example of an organometallic molecule is trimethylaluminum or Al(CH₃)₃. This panel also works on inorganic molecules that do not contain metals, like NH₃.

Figure 4-2. Opening the Machine Learning Property Prediction panel.

Select the organic_volatility_bp group and include the first entry, methane
Go to Tasks > Materials > Informatics > Machine Learning Property Prediction
- The Machine Learning Property Prediction panel opens

Figure 4-3. Predicting the boiling point.

Now, let’s make our first volatility prediction, boiling/sublimation point:

Select the radio-button for Interactive mode
From the Machine learning model drop-down menu, select Volatility of organic molecules
From the Property drop-down menu, select Boiling point
Click Download Model
Set the pressure to Predict at 760 Torr
- The evaporation or sublimation temperature of the selected compounds will be calculated at the specified pressure - in this case, at atmospheric pressure, 760 Torr. You can also change the pressure units to bar or atm by using the drop-down menu
- Additionally, the boiling point can be predicted over a range of pressures. See the documentation to learn more.
Click Predict
- The Results from selected entries section is populated with the predicted boiling point for the selected structures
- For example, methane has a predicted boiling point of 117.8 K
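
If your reference data is reported in bar or atm, the sketch below shows the conversions behind the pressure unit options. The constants are the standard definitions (1 atm = 760 Torr = 101325 Pa, 1 bar = 100000 Pa); the function names are illustrative and are not part of the panel.

```python
PA_PER_ATM = 101325.0   # pascals per standard atmosphere, exact by definition
TORR_PER_ATM = 760.0    # Torr per standard atmosphere, exact by definition
PA_PER_BAR = 100000.0   # pascals per bar, exact by definition


def torr_to_atm(p_torr):
    """Convert a pressure from Torr to standard atmospheres."""
    return p_torr / TORR_PER_ATM


def torr_to_bar(p_torr):
    """Convert a pressure from Torr to bar."""
    return torr_to_atm(p_torr) * PA_PER_ATM / PA_PER_BAR


# Atmospheric pressure as used in this step.
print(torr_to_atm(760.0), torr_to_bar(760.0))
```

Note that 760 Torr corresponds to 1 atm but to slightly more than 1 bar, so the two units are not interchangeable when comparing against tabulated boiling points.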

Figure 4-4. Running the job.

We can also run the prediction as a job on the selected entries. To do this, ensure Use structures from shows Project Table (5 selected entries)
Select the radio-button for Batch mode
Change the Job name to ml_prop_prediction_organic_bp
Adjust the job settings as needed
- This job can be completed in a few minutes on a CPU host
Click Run
Close the Machine Learning Property Prediction panel

Figure 4-5. The entry list and workspace after running the boiling point prediction.

When the job finishes, a new entry group entitled ml_prop_prediction_organic_bp-out (5) is incorporated and added to the entry list. The group contains all of the same structures as the original group, but now each entry also has a property with the predicted boiling/sublimation point. We will compare the experimental and predicted properties after a few more predictions.

Figure 4-6. Opening the Machine Learning Property Prediction panel.

Select the organic_volatility_vp group and include the first entry, dichlorobenzene
Go to Tasks > Materials > Informatics > Machine Learning Property Prediction
- The Machine Learning Property Prediction panel opens
- Reset the panel

Figure 4-7. Predicting the vapor pressure.

Select the radio-button for Interactive mode
From the Machine learning model drop-down menu, select Volatility of organic molecules
From the Property drop-down menu, select Vapor pressure
Set the temperature to Predict at 293.15 K
- The vapor pressure of the selected compounds will be calculated at a temperature of 293.15 K (i.e. 20°C). You can also change the temperature units to Celsius by using the drop-down menu
Click Download Model
Click Predict
- The Results from selected entries section is populated with the predicted vapor pressure for the selected structures
- For example, ethyl formate has a predicted vapor pressure of 200 Torr
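
For reference, the relationship between the two temperature unit options is a fixed offset of 273.15. The sketch below is a minimal illustration with assumed function names, not part of the panel.

```python
KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in kelvin


def kelvin_to_celsius(t_kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return t_kelvin - KELVIN_OFFSET


def celsius_to_kelvin(t_celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return t_celsius + KELVIN_OFFSET


# The temperature used in this step, expressed in both units.
print(round(kelvin_to_celsius(293.15), 2), round(celsius_to_kelvin(20.0), 2))
```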

Figure 4-8. Running the job.

Next, let’s run the prediction for the selected group of 5 compounds:

Select the radio-button for Batch mode
Change the Job name to ml_prop_prediction_organic_vp
Adjust the job settings as needed
- This job can be completed in a few minutes on a CPU host
Click Run
Close the Machine Learning Property Prediction panel

Figure 4-9. The entry list and workspace after running the vapor pressure prediction.

When the job finishes, a new entry group entitled ml_prop_prediction_organic_vp-out (5) is incorporated and added to the entry list. The group contains all of the same structures as the original group, but now each entry also has the predicted vapor pressure as an associated property.

Figure 4-10. The entry list and workspace after running the organometallic boiling point prediction.

Repeat steps 1-10 in this section to predict the boiling/sublimation point for a set of organometallic complexes. Ensure you make the following changes in your procedure:

Select the organometallic_bp group
Select the Volatility of organometallic molecules Machine learning model
Make sure that you download the corresponding ML model from the dropdown options.
Set the pressure to Predict at 760 Torr
Change the Job name to ml_prop_prediction_organometallic_bp

For the sample molecules in this section, we have provided some experimentally-measured volatility values. Now, let’s compare the experimental and predicted volatility values using the Project Table:

If necessary, add the user defined (experimental) properties using the Property Tree.

Here we see that the experimental and predicted values for the three calculations run in this section are all in very good agreement. Volatility can be efficiently and accurately predicted using the Machine Learning Property Prediction panel on the appropriate chemical spaces.
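
Because predictions are only reliable inside a model's chemical space, it can help to screen candidate molecules before running them. The sketch below extracts element symbols from a plain-text molecular formula (ASCII digits, for example Al(CH3)3) and reports any that fall outside an allowed set. The allowed set shown here is hypothetical; read the actual elements from the Chemical Space entry in the Model Performance section for the model you are using. The function names are also illustrative.

```python
import re

# One uppercase letter optionally followed by one lowercase letter.
ELEMENT = re.compile(r"[A-Z][a-z]?")


def elements_in_formula(formula):
    """Return the set of element symbols found in a molecular formula."""
    return set(ELEMENT.findall(formula))


def outside_chemical_space(formula, allowed):
    """Return the elements in the formula that are not in the allowed set."""
    return elements_in_formula(formula) - set(allowed)


# Hypothetical chemical space, for illustration only.
allowed = {"C", "H", "N", "O", "Al"}

for formula in ("Al(CH3)3", "NH3", "SiCl4"):
    missing = outside_chemical_space(formula, allowed)
    status = "ok" if not missing else "outside: " + ", ".join(sorted(missing))
    print(formula, status)
```

An empty result only means the elements are covered; it does not guarantee that the molecule resembles the structures the model was trained on.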

5. Predicting Density of Molecular Liquids

In this section, we will use the Machine Learning Property Prediction panel to predict the density of molecular liquids.

Figure 5-1. Representations of organic molecules.

Organic molecules are any structures primarily made of carbon, oxygen, hydrogen, and nitrogen. An example of an organic molecule is isopentane. Inputs for the density of molecular liquids model should be small organic molecules in the liquid phase such as those shown in the Figure.

Figure 5-2. Opening the Machine Learning Property Prediction panel.

Select the molecular_liquids_density group and include the first entry, Isopentane
Go to Tasks > Materials > Informatics > Machine Learning Property Prediction
- The Machine Learning Property Prediction panel opens
- Reset the panel

Figure 5-3. Predicting the density.

Now, let’s predict the density of our structures:

Select the radio-button for Interactive mode
From the Machine learning model drop-down menu, select Density of molecular liquids
Click Download Model
- Model available is displayed in place of the button
Click Predict
- The Results from selected entries section is populated with the predicted density for the selected structures along with their uncertainty
- For example, isopentane has a predicted density of 0.66 g/cm³ with a prediction uncertainty of ~0.01 g/cm³

Figure 5-4. Running the job.

We can also run the prediction as a job on the selected entries. To do this, ensure Use structures from shows Project Table (5 selected entries)
Select the radio-button for Batch mode
Change the Job name to ml_prop_prediction_density
Adjust the job settings as needed
- This job can be completed in a few minutes on a CPU host
Click Run
Close the Machine Learning Property Prediction panel

Figure 5-5. The entry list and workspace after running the density prediction.

When the job finishes, a new entry group entitled ml_prop_prediction_density-out (5) is incorporated and added to the entry list. The group contains all of the same structures as the original group, but now each entry also has a property with the predicted density.

Figure 5-6. Viewing the experimental and predicted density values in the Project Table.

We can see the density values by opening the Project Table in the next step.

Open the Project Table

Go to the Property Tree and expand All > Materials Science > Primary > Density at 293.15 K (g/cm3), All > Materials Science > Primary > Density at 293.15 K Uncertainty (g/cm3), and All > Canvas > Secondary > Ref. Density (g/cm3)

The Ref. Density (g/cm3) column contains the experimental values, while the Density at 293.15 K (g/cm3) and Density at 293.15 K Uncertainty (g/cm3) columns are populated by the Machine Learning Property Prediction job. For our sample of 5 structures, the predicted densities agree closely with the experimental values.
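The comparison between predicted and reference densities can also be quantified outside the Project Table. The sketch below is one possible approach, not part of the tutorial workflow: it assumes the table has been exported to CSV with the three column headers unchanged, and it treats the uncertainty as a symmetric band around the prediction. The function name and the example rows are hypothetical; the numbers are placeholders for illustration only and are not results from this tutorial.

```python
import csv
import io

# Column headers as shown in the Project Table in this tutorial.
# Assumption: a CSV export keeps these headers unchanged.
PRED = "Density at 293.15 K (g/cm3)"
UNC = "Density at 293.15 K Uncertainty (g/cm3)"
REF = "Ref. Density (g/cm3)"


def summarize_agreement(csv_text):
    """Return (mean absolute error, rows whose reference value lies within
    the prediction +/- its uncertainty, total number of rows)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no rows found")
    abs_errors = []
    within = 0
    for row in rows:
        pred, unc, ref = float(row[PRED]), float(row[UNC]), float(row[REF])
        err = abs(pred - ref)
        abs_errors.append(err)
        if err <= unc:
            within += 1
    return sum(abs_errors) / len(abs_errors), within, len(rows)


# Placeholder numbers for illustration only; they are NOT tutorial results.
EXAMPLE_CSV = """Title,Density at 293.15 K (g/cm3),Density at 293.15 K Uncertainty (g/cm3),Ref. Density (g/cm3)
molecule_a,0.80,0.02,0.81
molecule_b,1.00,0.01,1.03
"""

if __name__ == "__main__":
    mae, within, total = summarize_agreement(EXAMPLE_CSV)
    print(f"MAE = {mae:.3f} g/cm3; {within}/{total} references within uncertainty")
```

Reporting both the mean absolute error and the fraction of reference values that fall inside the uncertainty band gives a slightly fuller picture than a visual check of the table alone.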

Close the Project Table

6. Conclusion and References

In this tutorial, we learned how to use pre-built machine learning models to predict properties of polymers, organic molecules, and organometallic molecules. Although pre-built machine learning models provide a fast way to obtain material properties, their predictions are best treated as first estimates and should be validated with experiments or physics-based methods.


For further learning:

For introductory content, focused on navigating the Schrödinger Materials Science interface, an Introduction to Maestro for Materials Science tutorial is available. Please visit the materials science training website for access to 70+ tutorials. For scientific inquiries or technical troubleshooting, submit a ticket to our Technical Support Scientists at help@schrodinger.com.

For self-paced, asynchronous, online courses in Materials Science modeling, including access to Schrödinger software, please visit the Schrödinger Online Learning portal on our website.

For some related practice, proceed to explore other relevant tutorials:

For more machine learning:

Optimizing Viscosity and Cost in Formulations with Missing Structural Data

For general polymer workflows:
For general organic and organometallic molecule workflows:


For further reading:

See the help documentation on the Machine Learning Property Prediction panel
For the polymer Tg, Dk, and Df data sets, see Bicerano, J. Prediction of Polymer Properties (3rd ed.). CRC Press. DOI:10.1201/9780203910115. The polymer Tg data set is also available in Afzal, Mohammad Atif Faiz, et al. High-throughput molecular dynamics simulations and validation of thermophysical properties of polymers for various applications. DOI:10.1021/acsapm.0c00524
For sample volatility data, see Vapor Pressure of Pure Substances. Organic and Inorganic Compounds. DOI:10.1021/ie50448a022
For the density data, see CRC Handbook of Chemistry and Physics, 104th edition.

7. Glossary of Terms

Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion

Included - the entry is represented in the Workspace, the circle in the In column is blue

Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data

Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)

Scratch Project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project

Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries

Working Directory - the location where files are saved

Workspace - the 3D display area in the center of the main window, where molecular structures are displayed