Machine Learning Property Prediction

Tutorial Created with Software Release: 2025-3
Topics: Catalysis & Reactivity, Consumer Packaged Goods, Energy Capture & Storage, Informatics and Team Collaboration, Organic Electronics, Polymeric Materials, Thin Film Processing
Methodology: Machine Learning
Products Used: MS Informatics, MS Maestro


This tutorial is written for use with a 3-button mouse with a scroll wheel.
Abstract:

 

In this tutorial, we will learn to use pre-built machine learning models to predict polymer glass transition temperature, polymer dielectric constant, polymer dissipation loss, density of molecular liquids, and volatility of organic and organometallic molecules.

 

Tutorial Content
  1. Introduction to Machine Learning for Property Predictions
  2. Creating Projects and Importing Structures
  3. Predicting Polymer Properties
  4. Predicting Volatility of Organic and Organometallic Molecules
  5. Predicting Density of Molecular Liquids
  6. Conclusion and References
  7. Glossary of Terms

1. Introduction to Machine Learning for Property Predictions

Machine learning (ML) methods can accelerate the design of new materials by rapidly and accurately predicting material properties at a low computational cost. However, training ML models to predict a property of interest often requires extensive data curation and model development, which hinders the usage of ML models for designing new materials. A possible workaround is to use accurate pre-built ML models developed by experts and leverage them to rapidly predict properties of new materials.

In this tutorial, we will use pre-built ML models to predict the properties of materials such as polymers, organic compounds, and organometallic complexes. Figure 1 summarizes this tutorial. First, we will show how to use pre-built ML models to generate predictions for polymers, which are generally challenging to model because the arbitrary number of monomers affects the chain length, molecular mass, and large-scale properties. As an alternative to modeling the entire polymer, we focus on predicting polymer properties from the monomer structure alone, which can accelerate the design of new polymers. Next, we will show how to use the pre-built ML models to predict the volatility of an organic or inorganic compound (including organometallic complexes), which indicates how easily a molecule evaporates or sublimes from the liquid or solid phase to the vapor phase. Since evaporation and sublimation cannot be simulated explicitly at the atomic scale (even with today's computing power), the statistical ML approach is very valuable. Then, we will show how to use pre-built ML models to predict the density of small organic molecules in the liquid phase, which is a low-cost alternative to obtaining density from molecular dynamics simulations or experiments. Hence, pre-built ML models can provide property estimates of bulk polymer, organic, and organometallic systems without having to perform experiments, in some cases predicting properties that cannot currently be simulated with complex ab initio or molecular dynamics calculations (an example is the volatility of a compound).

Figure 1. Overview of using pre-built machine learning models to predict properties for polymeric, organic, and organometallic molecules using the Machine Learning Property Prediction panel.

Schrödinger’s Machine Learning Property Prediction panel uses pre-built ML models to predict polymer properties: glass transition temperature, dielectric constant, and dissipation loss; density of molecular liquids; as well as boiling/sublimation point and vapor pressure for organic and organometallic compounds. This tutorial provides step-by-step instructions to use these pre-built ML models to calculate material properties using the Materials Science Maestro interface. The available models are summarized in the table below:

| Machine Learning Model | Property Label | Units | Description | Structure File | Additional User Input |
|---|---|---|---|---|---|
| Aqueous Solubility | Aqueous Solubility (log10(mol/L)) | log10(mol/L) | Aqueous solubility of small molecules | - | - |
| Density of molecular liquids | Density (g/cm3) | g/cm3 | Density of small organic molecules in the liquid phase | molecular_liquids_density_examples.mae | - |
| Melting Point | Melting Point (K) | K | Melting point of small organic molecules | - | - |
| Non-aqueous solubility | Non-aqueous Solubility (log10(mol/L)) | log10(mol/L) | Solubility of small molecules in non-aqueous solutions | - | Temperature and Solvent |
| Optoelectronic properties of molecules | see Table 2 below | | | | |
| Oxidation potential | Oxidation Potential (V) | V | Oxidation potential of molecules in acetonitrile at 298 K | - | - |
| Polymer dielectric constant | Dk | - | Dielectric constant of polymers | polymer_Dk_examples.mae | Frequency |
| Polymer dissipation loss | Df | - | Polymer dissipation factor (tan delta) | polymer_Df_examples.mae | Frequency |
| Polymer glass transition temperature | Tg (K) | K | Glass transition temperature of polymers | polymer_Tg_examples.mae | - |
| Reduction potential | Reduction Potential (V) | V | Reduction potential of molecules in acetonitrile at 298 K | - | - |
| Singlet-triplet energy gap | ∆E(S1-T1) (eV) | eV | Singlet-triplet energy gap of organic light emitting diodes | - | - |
| Viscosity of organic liquids | Viscosity (cP) | cP | Viscosity of organic molecules in the liquid phase | - | Temperature |
| Volatility of organic molecules (boiling point) | Temperature (K) | K | Boiling/sublimation point of organic molecules | boiling_point_organic_molecules_examples.mae | Pressure |
| Volatility of organic molecules (vapor pressure) | Vapor Pressure (Torr) | Torr | Vapor pressure of organic molecules | vapor_pressure_organic_molecules_examples.mae | Temperature |
| Volatility of organometallic molecules (boiling point) | Temperature (K) | K | Boiling/sublimation point of organometallic or inorganic molecules | boiling_point_organometallic_molecules_examples.mae | Pressure |

Table 1. Table of models in the Machine Learning Property Prediction panel, which contains the machine learning model name, property label, units, a brief description, example structure file used for the model, and any extra user input necessary for the model to run predictions.

| Machine Learning Model | Property Label | Units | Description |
|---|---|---|---|
| Absorption peak position | Absorption Lmax nm (λabs,max) | nm | Visible wavelengths for which the absorption spectrum has the highest intensity |
| Absorption bandwidth | Absorption Bandwidth cm-1 (abs,FWHM) | cm-1 | Bandwidth of the absorption peak position in full width at half maximum |
| Electron reorganization energy | Electron Reorganization Energy (eV) | eV | Energy required to relax the geometries of two molecules of the same species undergoing transfer of a negative charge |
| Emission peak position | Emission Emax nm (λemi,max) | nm | Wavelength of the fluorescence maximum |
| Emission bandwidth | Emission Bandwidth cm-1 (emi,FWHM) | cm-1 | Bandwidth of the emission peak position in full width at half maximum |
| Emission lifetime | Emission Lifetime log(ns) (τ) | log(ns) | Fluorescence or excited state lifetime |
| Extinction coefficient | Extinction Coefficient mol-1dm3cm-1 (max) | log(mol-1dm3cm-1) | Molar extinction coefficient |
| Hole reorganization energy | Hole Reorganization Energy (eV) | eV | Energy required to relax the geometries of two molecules of the same species undergoing transfer of a positive charge |
| Photoluminescence quantum yield | Photoluminescence Quantum Yield (QY) | - | Ratio of photons emitted to photons absorbed for organic molecules |
| Scaled HOMO | Scaled HOMO (eV) | eV | Energy level for the highest occupied molecular orbital (HOMO), derived from oxidation calculation |
| Scaled LUMO | Scaled LUMO (eV) | eV | Energy level for the lowest unoccupied molecular orbital (LUMO), derived from reduction calculation |
| Singlet-Triplet Energy Gap | Singlet-Triplet Energy Gap (eV) | eV | Energy difference between the lowest triplet state (T1) and the first excited singlet state (S1); corresponds to the vertical transition at the T1 geometry |
| Triplet Energy (E(S0T1)) | Triplet Energy (eV) | eV | Energy difference between the lowest triplet state (T1) and the ground singlet state (S0) |
| Triplet Reorganization Energy | Triplet Reorganization Energy (eV) | eV | Energy required to relax the geometries of two molecules of the same species undergoing transitions between the lowest triplet state and the ground singlet state |


Table 2. Table of optoelectronic property models in the Machine Learning Property Prediction panel. The columns are described as in Table 1. For optoelectronic properties, predictions can be made for organic molecules either in solvent or as a thin film.

As an alternative to the pre-built ML models, the polymer glass transition temperature, dielectric constant, and dissipation loss can be simulated using the Thermophysical Properties Calculations and Amorphous Dielectric Properties panels. The density of small molecules can be simulated using the Disordered System Builder and Molecular Dynamics Multistage Workflows panels.

Several models are included in the panel but are not covered in this tutorial: those with no Structure File listed in Table 1, and all of the models in Table 2. Please see the help documentation for more information on these models.

For background on the Machine Learning Property Prediction panel which will be described in this tutorial, see the help documentation.

For more information about building and applying machine learning models in Materials Science Maestro, see the Machine Learning for Materials Science tutorial.

2. Creating Projects and Importing Structures

At the start of the session, change the file path to your chosen Working Directory in MS Maestro to make file navigation easier. Each session in MS Maestro begins with a default Scratch Project, which is not saved. An MS Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is saved, the project is automatically saved each time a change is made.

Structures can be built in MS Maestro or can be imported using File > Import Structures (or dragged and dropped), and are added to the Entry List and Project Table. The Entry List is located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.

  1. Double-click the Materials Science icon

Figure 2-1. Change Working Directory option.

  1. Go to File > Change Working Directory
  2. Find your directory, and click Choose
  3. Pre-generated files are included for running jobs or examining output. Download the zip file here: schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/ml_property_prediction.zip
  4. After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial
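If you prefer to unpack the archive from a script rather than a file manager, the unzip step can be sketched with the Python standard library. This is only an illustration: the function name and the example paths are hypothetical, and only the archive name ml_property_prediction.zip comes from this tutorial.

```python
import zipfile
from pathlib import Path


def unpack_tutorial_files(zip_path, working_dir):
    """Extract every file in zip_path into working_dir; return the member names."""
    working_dir = Path(working_dir)
    # Create the target directory if it does not exist yet.
    working_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(working_dir)
        return sorted(archive.namelist())


# Example call (hypothetical paths; adjust both to match your system):
# unpack_tutorial_files(Path.home() / "Downloads" / "ml_property_prediction.zip",
#                       Path.cwd())
```

Extracting directly into the Working Directory keeps the example structure files where the File > Import Structures dialog opens by default.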

Figure 2-2. Save Project panel.

  1. Go to File > Save Project As
  2. Change the File name to ml_property_prediction_tutorial, click Save
    • The project is now named ml_property_prediction_tutorial.prj

For this tutorial, all of the input files necessary to run the Machine Learning Property Prediction panel are provided and will be imported in the next step. Each file listed in Table 1 contains 5 example structures that were used for ML model development.

 

If you are interested in building organic compounds, monomers, polymers, or organometallic complexes on your own, please visit the Introduction to Maestro for Materials Science, Building, Equilibrating and Analyzing Amorphous Polymers, or Organometallic Complexes tutorials, respectively.

Figure 2-3. Importing the structure files.

  1. Go to File > Import Structures
  2. Navigate to where you downloaded the tutorial files (presumably your working directory) and select the seven structure files available, each one ending in _examples.mae. Click Open
    • Seven new entry groups are added to the entry list. Each group corresponds to a machine learning model as described in Table 1
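Before importing, it can be useful to confirm that all of the example files were unpacked. The sketch below simply lists files whose names end in _examples.mae; the function name is hypothetical, and the expected count of seven is taken from the import step above.

```python
from pathlib import Path


def find_example_files(directory):
    """Return the sorted names of all files in directory ending in _examples.mae."""
    return sorted(p.name for p in Path(directory).glob("*_examples.mae"))


# Example call, assuming the files were unzipped into the current directory:
# files = find_example_files(".")
# print(len(files), "example files found")  # this tutorial expects seven
```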

3. Predicting Polymer Properties

In this section, we will use the Machine Learning Property Prediction panel to predict polymer properties for a sampling of monomers.

Figure 3-1. Opening the Machine Learning Property Prediction panel.

  1. In the entry list, select the polymer_Tg group and include the first entry, Polyethylene
  2. Go to Tasks > Materials > Informatics > Machine Learning Property Prediction

Let’s learn about the settings and capabilities of the Machine Learning Property Prediction panel a bit more:

  • Models can be selected using the Machine learning model drop-down menu and further parameters specified in the Predict section.
  • The Model Performance section provides statistics about the selected machine learning model, which indicate the appropriate chemical space for the model and the typical error to expect in the results. For all machine learning models, the dataset was split such that 90% of the data (i.e. the training set) was used to train the model and 10% of the data (i.e. the testing set) was left out to evaluate whether the model can predict unseen data. The statistics are summarized below:
    • Training data size: Total size of the data used for ML model training. For example, polymer Tg has 582 rows of monomer SMILES with experimental Tg values.
    • Training (90%) R2: Coefficient of determination for the training set, which consists of 90% of the data. A perfect model would have R2 of 1.
    • Test (10%) R2: Coefficient of determination for the testing set, which consists of 10% of the data.
    • RMSE (Training): Root-mean-square error (RMSE) of the training set. An ideal model would have RMSE values close to zero.
    • RMSE (Test): RMSE of the test set.
    • Chemical Space: Unique elements within the machine learning model. Please consider only using structures with the elements in the chemical space. Using elements outside the chemical space may lead to erroneous predictions.
  • A prediction can be made for a single included entry or multiple selected entries. Note that structures do not require hydrogens to be specified because they are featurized using implicit hydrogens.
    • If Interactive mode is selected, predictions are made in the Results table within the panel, and up to 10 entries can be selected. This is an interactive feature and does not require running a job.
    • If Batch mode is selected, a new entry group is incorporated after the job is completed, and the predicted properties can be viewed from the Project Table.
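To make the Model Performance statistics concrete, the sketch below shows one way a 90/10 split, the coefficient of determination (R2), and the RMSE can be computed. It is a minimal illustration and not the panel's implementation: the random shuffle, the seed, and all function names are assumptions, since the tutorial does not state how the split was drawn.

```python
import math
import random


def split_90_10(rows, seed=0):
    """Shuffle rows and return (training, test) with about 10% held out.

    Assumption: a plain random split; the panel's actual procedure is not documented here.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(0.1 * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(observed, predicted):
    """Coefficient of determination; 1.0 means a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error; 0.0 means a perfect fit."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

With the 582 rows quoted for the polymer Tg model, this split would hold out 58 rows for testing and keep 524 for training. A large gap between the training and test values of R2 or RMSE is the usual sign that a model does not generalize well to unseen structures.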

 

Visit the help documentation for a complete summary of the panel.

 

Here we use the Machine Learning Property Prediction panel to predict the Tg, Dk, and Df of polymers.

Figure 3-2. Representing polymers for the Machine Learning Property Prediction panel using polyethylene as an example.

Polymers are very large molecules consisting of n repeats of a monomer unit. For ML models, we represent polymers, such as polyethylene, as monomers with connection points denoted as two [At] atoms. Hence, every polymer input to the Machine Learning Property Prediction panel should be represented as a monomer with two [At] reference atoms marking the end points of the monomer.
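As a quick sanity check on this representation, the sketch below counts the [At] connection points in a monomer SMILES string. It is a string-level check only and does not parse the chemistry; the function names and the example SMILES "[At]CC[At]" for the polyethylene repeat unit are illustrative assumptions rather than values taken from the tutorial files.

```python
def count_connection_points(monomer_smiles):
    """Count the [At] placeholder atoms in a monomer SMILES string."""
    return monomer_smiles.count("[At]")


def has_two_connection_points(monomer_smiles):
    """True when the string contains exactly two [At] connection points."""
    return count_connection_points(monomer_smiles) == 2


# Assumed SMILES for the polyethylene repeat unit with both end points marked.
print(has_two_connection_points("[At]CC[At]"))
# The same unit without marked end points would fail the check.
print(has_two_connection_points("CC"))
```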

 

Alternatively, the Mark Monomer Head and Tail panel can be used to prepare monomers to be used for polymer property predictions using the Machine Learning Property Prediction panel.

 

Figure 3-3. Downloading the ML model.

Now, let’s make our first prediction, polymer Tg:

 

  1. Ensure the radio-button for Interactive mode is selected
  2. From the Machine learning model drop-down menu, select Polymer glass transition temperature
  3. Click Download Model
    • When you click this button, an ensemble model for the selected ML model with prediction uncertainty is downloaded. Due to the large file size of the model, it is not available with your Schrödinger software installation and must be downloaded using this button. The download may take up to a few minutes and is complete when Model available is displayed in place of the button. For more information refer to the help documentation.

Figure 3-4. Predicting the Polymer glass transition temperature.

  1. Click Predict
    • The Results from selected entries section is populated with the predicted Tg for the selected structures along with their uncertainty
    • For example, polyethylene has a predicted Tg of 220.3 K with a prediction uncertainty of ~14 K

 

Figure 3-5. Running the job.

  1. We can also run the prediction as a job on the selected entries. To do this, ensure Use structures from shows Project Table (5 selected entries)
  2. Select the radio-button for Batch mode
  3. Change the Job name to ml_prop_prediction_Tg
  4. Adjust the job settings as needed
    • This job can be completed in a few minutes on a CPU host
  5. Click Run
  6. Close the Machine Learning Property Prediction panel

Figure 3-6. The entry list and workspace after running the Tg property prediction calculation.

When the job finishes, a new entry group entitled ml_prop_prediction_Tg-out (5) is incorporated and added to the entry list. The group contains all of the same structures as the original group, but now each entry is also associated with the predicted Tg.

Figure 3-7. Viewing the experimental and predicted Tg values in the Property Table.

We can see the Tg values by opening the Project Table in the next step.

  1. Open the Project Table

The properties may not appear by default. To add the relevant properties as columns in the Project Table:

  1. Go to the Property Tree, expand All > Materials Science > Primary > Polymer Tg (K), All > Materials Science > Primary > Polymer Tg Uncertainty, and All > Canvas > Secondary > Exp Tg (K)

The Exp Tg (K) column corresponds to experimental values, and the Polymer Tg (K) and Polymer Tg Uncertainty (K) columns are populated by the Machine Learning Property Prediction job. We can see that for our sample of 5 structures, the predicted Tg is in close agreement with the experimental values (stored in the Exp Tg (K) column).

Note: You can export directly from the Project Table to spreadsheet form if needed by clicking Data > Export > Spreadsheet
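Once the values are in a spreadsheet, or simply read off the Project Table, the agreement between prediction and experiment can be quantified per entry. The sketch below is one simple way to do that; the function name is hypothetical, and treating the reported uncertainty as a plain plus-or-minus window is an assumption, since the tutorial does not define how the uncertainty is computed.

```python
def compare_with_experiment(predicted, uncertainty, experimental):
    """Return (absolute error, True if the error is within the uncertainty)."""
    error = abs(predicted - experimental)
    return error, error <= uncertainty


# The predicted Tg (220.3 K) and uncertainty (~14 K) for polyethylene are quoted
# from this tutorial; the experimental value below is a made-up placeholder,
# not a measured result. Substitute the Exp Tg (K) value from your Project Table.
error, within = compare_with_experiment(220.3, 14.0, 230.0)
print(f"absolute error = {error:.1f} K, within uncertainty: {within}")
```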

  1. Close the Project Table before proceeding

 

Figure 3-8. The entry list and workspace after running the Dk and Df property prediction calculations.

Repeat all of the steps in this section to predict Dk and Df. Ensure you make the following changes in your procedure:

  • Note that predicting Dk and Df requires user input of the frequency used for dielectric experiments, typically in the range of 50 Hz to 10 MHz. For this example, we will use the default value of 100 Hz for both calculations.
  • Make sure that you download the corresponding ML model from the dropdown options.
  • For predicting Dk, select the polymer_Dk group and the Polymer dielectric constant Machine learning model. Change the Job name to ml_prop_prediction_Dk
  • For predicting Df, select the polymer_Df group and the Polymer dissipation loss Machine learning model. Change the Job name to ml_prop_prediction_Df

Figure 3-9. Viewing the experimental and predicted Dk and Df values in the Property Table.

We can see the experimental and predicted values of both Dk and Df by opening the Project Table.

Once again, we see very good agreement between the experimental and predicted values.

  1. Close the Project Table before proceeding

4. Predicting Volatility of Organic and Organometallic Molecules

In this section, we will use the Machine Learning Property Prediction panel to predict volatility (boiling/sublimation point and vapor pressure) for organic and inorganic molecules, including organometallic complexes.

Figure 4-1. Representations of organic molecules and organometallic complexes.

 

Organic molecules are any structures primarily made of carbon, oxygen, hydrogen, and nitrogen. An example of an organic molecule is methane or CH4.

 

Organometallic or metal-organic complexes consist of a metal surrounded by organic-based ligands. An example of an organometallic molecule is trimethylaluminum or Al(CH3)3. This panel also works on inorganic molecules that do not contain metals, like NH3.

Figure 4-2. Opening the Machine Learning Property Prediction panel.

  1. Select the organic_volatility_bp group and include the first entry, methane
  2. Go to Tasks > Materials > Informatics > Machine Learning Property Prediction

Figure 4-3. Predicting the boiling point.

Now, let’s make our first volatility prediction, boiling/sublimation point:

 

  1. Select the radio-button for Interactive mode
  2. From the Machine learning model drop-down menu, select Volatility of organic molecules
  3. From the Property drop-down menu, select Boiling point
  4. Click Download Model
  5. Set the pressure to Predict at 760 Torr
    • The evaporation or sublimation temperature of the selected compounds will be calculated at the specified pressure - in this case, at atmospheric pressure, 760 Torr. You can also change the pressure units to bar or atm by using the drop-down menu
    • Additionally, the boiling point can be predicted over a range of pressures. See the documentation to learn more.
  6. Click Predict
    • The Results from selected entries section is populated with the predicted boiling point for the selected structures
    • For example, methane has a predicted boiling point of 117.8 K
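Because the panel accepts the pressure in Torr, bar, or atm, it can help to see how the units relate. The sketch below uses the standard definitions 1 atm = 760 Torr and 1 atm = 1.01325 bar; the function names are hypothetical and the code is independent of the panel.

```python
TORR_PER_ATM = 760.0   # 1 atm = 760 Torr by definition
BAR_PER_ATM = 1.01325  # 1 atm = 101325 Pa = 1.01325 bar by definition


def torr_to_atm(pressure_torr):
    """Convert a pressure from Torr to atm."""
    return pressure_torr / TORR_PER_ATM


def torr_to_bar(pressure_torr):
    """Convert a pressure from Torr to bar."""
    return torr_to_atm(pressure_torr) * BAR_PER_ATM


# The 760 Torr used in this section expressed in the other two units.
print(torr_to_atm(760.0), "atm")
print(torr_to_bar(760.0), "bar")
```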

Figure 4-4. Running the job.

  1. We can also run the prediction as a job on the selected entries. To do this, ensure Use structures from shows Project Table (5 selected entries)
  2. Select the radio-button for Batch mode
  3. Change the Job name to ml_prop_prediction_organic_bp
  4. Adjust the job settings as needed
    • This job can be completed in a few minutes on a CPU host
  5. Click Run
  6. Close the Machine Learning Property Prediction panel

Figure 4-5. The entry list and workspace after running the boiling point prediction.

When the job finishes, a new entry group entitled ml_prop_prediction_organic_bp-out (5) is incorporated and added to the entry list. The group contains all of the same structures as the original group, but now each entry also has a property with the predicted boiling/sublimation point. We will compare the experimental and predicted properties after a few more predictions.

Figure 4-6. Opening the Machine Learning Property Prediction panel.

  1. Select the organic_volatility_vp group and include the first entry, dichlorobenzene
  2. Go to Tasks > Materials > Informatics > Machine Learning Property Prediction

Figure 4-7. Predicting the vapor pressure.

  1. Select the radio-button for Interactive mode
  2. From the Machine learning model drop-down menu, select Volatility of organic molecules
  3. From the Property drop-down menu, select Vapor pressure
  4. Set the temperature to Predict at 293.15 K
    • The vapor pressure of the selected compounds will be calculated at a temperature of 293.15 K (i.e. 20°C). You can also change the temperature units to Celsius by using the drop-down menu
  5. Click Download Model
  6. Click Predict
    • The Results from selected entries section is populated with the predicted vapor pressure for the selected structures
    • For example, ethyl formate has a predicted vapor pressure of 200 Torr
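The relationship between the two temperature units offered in the panel is a fixed offset of 273.15. The short sketch below shows the conversion in both directions; the function names are hypothetical.

```python
def kelvin_to_celsius(temperature_k):
    """Convert a temperature from Kelvin to degrees Celsius."""
    return temperature_k - 273.15


def celsius_to_kelvin(temperature_c):
    """Convert a temperature from degrees Celsius to Kelvin."""
    return temperature_c + 273.15


# The 293.15 K used in this section, rounded to avoid floating-point noise.
print(round(kelvin_to_celsius(293.15), 2), "degrees Celsius")
```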

Figure 4-8. Running the job.

Next, let’s run the prediction for the selected group of 5 compounds:

  1. Select the radio-button for Batch mode
  2. Change the Job name to ml_prop_prediction_organic_vp
  3. Adjust the job settings as needed
    • This job can be completed in a few minutes on a CPU host
  4. Click Run
  5. Close the Machine Learning Property Prediction panel

Figure 4-9. The entry list and workspace after running the vapor pressure prediction.

When the job finishes, a new entry group entitled ml_prop_prediction_organic_vp-out (5) is incorporated and added to the entry list. The group contains all of the same structures as the original group, but now each entry also has the predicted vapor pressure as an associated property.

Figure 4-10. The entry list and workspace after running the organometallic boiling point prediction.

Repeat steps 1-10 in this section to predict the boiling/sublimation point for a set of organometallic complexes. Ensure you make the following changes in your procedure:

  • Select the organometallic_bp group
  • Select the Volatility of organometallic molecules Machine learning model
  • Make sure that you download the corresponding ML model from the dropdown options.
  • Set the pressure to Predict at 760 Torr
  • Change the Job name to ml_prop_prediction_organometallic_bp

For the sample molecules in this section, we have provided some experimentally measured volatility values. Now, let’s compare the experimental and predicted volatility values using the Project Table:

 

If necessary, add the user-defined (experimental) properties using the Property Tree.

Here we see that the experimental and predicted values for the three calculations run in this section are all in very good agreement. Volatility can be efficiently and accurately predicted using the Machine Learning Property Prediction panel on the appropriate chemical spaces.
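The Chemical Space statistic described in Section 3 lends itself to a simple pre-check before running a prediction. The sketch below flags any elements in a structure that fall outside a model's chemical space. The function name is hypothetical, and the element set shown is a placeholder only; read the actual chemical space for each model from the Model Performance section of the panel.

```python
def elements_outside_chemical_space(structure_elements, chemical_space):
    """Return the sorted element symbols that are not covered by the model."""
    return sorted(set(structure_elements) - set(chemical_space))


# Placeholder chemical space for illustration only; not taken from any model.
example_space = {"C", "H", "O", "N"}

# Element symbols in trimethylaluminum, Al(CH3)3.
outside = elements_outside_chemical_space(["Al", "C", "H"], example_space)
if outside:
    print("Elements outside the chemical space:", ", ".join(outside))
```

An empty result means every element in the structure is covered, which is a necessary but not sufficient condition for a reliable prediction.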

5. Predicting Density of Molecular Liquids

In this section, we will use the Machine Learning Property Prediction panel to predict the density of molecular liquids. 

Figure 5-1. Representations of organic molecules.

Organic molecules are any structures primarily made of carbon, oxygen, hydrogen, and nitrogen. An example of an organic molecule is isopentane. Inputs for the density of molecular liquids model should be small organic molecules in the liquid phase, such as those shown in Figure 5-1.

Figure 5-2. Opening the Machine Learning Property Prediction panel.

  1. Select the molecular_liquids_density group and include the first entry, Isopentane
  2. Go to Tasks > Materials > Informatics > Machine Learning Property Prediction

Figure 5-3. Predicting the density.

Now, let’s predict the density of our structures:

 

  1. Select the radio-button for Interactive mode
  2. From the Machine learning model drop-down menu, select Density of molecular liquids
  3. Click Download Model
    • Model available is displayed in place of the button
  4. Click Predict
    • The Results from selected entries section is populated with the predicted density for the selected structures along with their uncertainty
    • For example, isopentane has a predicted density of 0.66 g/cm3 with a prediction uncertainty of ~0.01 g/cm3

Figure 5-4. Running the job.

  1. We can also run the prediction as a job on the selected entries. To do this, ensure Use structures from shows Project Table (5 selected entries)
  2. Select the radio-button for Batch mode
  3. Change the Job name to ml_prop_prediction_density
  4. Adjust the job settings as needed
    • This job can be completed in a few minutes on a CPU host
  5. Click Run
  6. Close the Machine Learning Property Prediction panel

Figure 5-5. The entry list and workspace after running the density prediction.

When the job finishes, a new entry group entitled ml_prop_prediction_density-out (5) is incorporated and added to the Entry List. The group contains all of the same structures as the original group, but now each entry also has a property with the predicted density.

Figure 5-6. Viewing the experimental and predicted density values in the Project Table.

We can see the density values by opening the Project Table in the next step:

  1. Open the Project Table

The properties may not appear by default. To add the relevant properties as columns in the Project Table:

  1. Go to the Property Tree and expand All > Materials Science > Primary > Density at 293.15 K (g/cm3), All > Materials Science > Primary > Density at 293.15 K Uncertainty (g/cm3), and All > Canvas > Secondary > Ref. Density (g/cm3)

The Ref. Density (g/cm3) column contains the experimental values, and the Density at 293.15 K (g/cm3) and Density at 293.15 K Uncertainty (g/cm3) columns are populated by the Machine Learning Property Prediction job. We can see that for our sample of 5 structures, the predicted density is in close agreement with the experimental values.
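To put a number on this comparison, the Project Table columns could be exported and checked outside of Maestro. The following is a minimal Python sketch, not part of the tutorial workflow. It assumes the table has been exported to a CSV file whose headers match the three column names above, plus a Title column for the entry name. The function name compare_densities, the CSV layout, and reading the uncertainty as a symmetric +/- band are all assumptions made for illustration. The numbers in the example are placeholders, not results from this tutorial.

```python
import csv
import io

# Column headers as they are named in the Project Table in this tutorial.
PRED = "Density at 293.15 K (g/cm3)"
UNC = "Density at 293.15 K Uncertainty (g/cm3)"
REF = "Ref. Density (g/cm3)"
TITLE = "Title"  # assumed name of the entry-title column in the export


def compare_densities(csv_file):
    """Return (per-entry results, mean absolute error) from a CSV file object."""
    results = []
    for row in csv.DictReader(csv_file):
        pred, unc, ref = float(row[PRED]), float(row[UNC]), float(row[REF])
        abs_err = abs(pred - ref)
        results.append({
            "title": row[TITLE],
            "abs_error": abs_err,
            # Assumption: the uncertainty is read as a symmetric +/- band.
            "within_uncertainty": abs_err <= unc,
        })
    if not results:
        raise ValueError("no entries found in the CSV data")
    mae = sum(r["abs_error"] for r in results) / len(results)
    return results, mae


# Placeholder values for illustration only; NOT results from this tutorial.
EXAMPLE_CSV = (
    f"{TITLE},{PRED},{UNC},{REF}\n"
    "molecule_a,0.80,0.02,0.79\n"
    "molecule_b,1.10,0.01,1.13\n"
)

if __name__ == "__main__":
    rows, mae = compare_densities(io.StringIO(EXAMPLE_CSV))
    for r in rows:
        print(f"{r['title']}: |error| = {r['abs_error']:.3f} g/cm3, "
              f"within uncertainty: {r['within_uncertainty']}")
    print(f"Mean absolute error = {mae:.3f} g/cm3")
```

To use the sketch on real data, replace io.StringIO(EXAMPLE_CSV) with an opened export file, for example open("density_export.csv", newline=""), where the file name is hypothetical. An entry flagged as outside its uncertainty band is a good candidate for further validation with experiments or physics-based methods.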

  1. Close the Project Table

6. Conclusion and References

In this tutorial, we learned how to use pre-built machine learning models to predict properties of polymers, organic molecules, and organometallic molecules. While pre-built machine learning models provide a fast way to obtain material properties, we caution that the predictions are best treated as a first estimate and should be further validated with experiments or physics-based methods.

For further learning:

For introductory content, focused on navigating the Schrödinger Materials Science interface, an Introduction to Maestro for Materials Science tutorial is available. Please visit the materials science training website for access to 70+ tutorials. For scientific inquiries or technical troubleshooting, submit a ticket to our Technical Support Scientists at help@schrodinger.com.

For self-paced, asynchronous, online courses in Materials Science modeling, including access to Schrödinger software, please visit the Schrödinger Online Learning portal on our website.

For some related practice, proceed to explore other relevant tutorials:

For further reading:

7. Glossary of Terms

Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion

Included - the entry is represented in the Workspace, the circle in the In column is blue

Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data

Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)

Scratch Project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project

Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries

Working Directory - the location where files are saved

Workspace - the 3D display area in the center of the main window, where molecular structures are displayed