Genetic Optimization

Tutorial Created with Software Release: 2026-1
Topics: Organic Electronics
Methodology: Machine Learning
Products Used: GA Optoelectronics, MS Maestro

Tutorial files

60 MB

This tutorial is written for use with a 3-button mouse with a scroll wheel.
Terms such as Workspace, Entry List, and Project Table are defined in the Glossary of Terms at the end of this tutorial.

 

Abstract:

 

In this tutorial, we will learn to generate new structures for which a chosen set of optoelectronic properties is optimized by mutating the structures with a genetic algorithm.

 

Tutorial Content
  1. Introduction to Genetic Optimization
  2. Creating Projects and Importing Structures
  3. Performing an Optoelectronics Genetic Optimization
  4. Viewing the Optoelectronics Genetic Optimization Results
  5. Conclusion and References
  6. Glossary of Terms

1. Introduction to Genetic Optimization

A genetic optimization takes two or more input structures and applies two basic operations, crossover and mutation, in each generation. A crossover takes two parent structures and creates two child structures by cutting the molecules into pieces and joining the pieces back together. This operation does not introduce any new information into the population; it only shuffles existing genes. Mutation, on the other hand, introduces external information into the population.
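The two operations can be illustrated with a minimal sketch. It assumes a toy representation in which each structure is a list of named fragments; the fragment names, the single-point cut, and the helper functions are hypothetical and are not how the panel represents molecules.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: cut both parents at one index and swap the tails."""
    cut = rng.randint(1, min(len(parent_a), len(parent_b)) - 1)
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


def mutate(individual, library, rng):
    """Replace one randomly chosen gene with a gene drawn from an external library."""
    position = rng.randrange(len(individual))
    new_gene = rng.choice(library)
    return individual[:position] + [new_gene] + individual[position + 1:]


rng = random.Random(0)
parent_a = ["donor_A", "linker_A", "acceptor_A"]  # toy "genes", not real chemistry
parent_b = ["donor_B", "linker_B", "acceptor_B"]

child_1, child_2 = crossover(parent_a, parent_b, rng)
mutant = mutate(child_1, ["new_fragment"], rng)

print(child_1, child_2)  # only genes that already existed in the parents
print(mutant)            # contains a gene from outside the population
```

Taken together, the two children contain exactly the genes of the two parents, whereas the mutant carries a gene that no parent had.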

The Optoelectronics Genetic Optimization panel in Materials Science (MS) Maestro identifies new structures for which a chosen set of optoelectronic properties is optimized by mutating the structures with a genetic algorithm. Each structure is considered an individual in a population, and is mutated by changing aspects of its chemistry into another structure with more desirable properties. In this sense, it is a discovery tool rather than an optimization tool. In addition to the standard optimization (for molecules), the same approach can be applied to monomers which can then be used to construct polymers.

In the Machine Learning for Materials Science tutorial, we learned about AutoQSAR, a tool for automated creation, validation, and application of QSPR models following a best practices approach. We used AutoQSAR to build and rank order numerical QSPR models, visualize atomic contributions to property predictions, and use these models to make predictions on new, unseen datasets.

This tutorial leverages the dataset from the Machine Learning for Materials Science tutorial as our initial population in order to optimize the singlet-triplet splitting energy (ΔEST) for a set of thermally activated delayed fluorescence (TADF) molecules.

In this tutorial, we will learn how to perform a genetic optimization using the Optoelectronics Genetic Optimization panel to generate a new, diverse set of structures. We will perform the calculation twice: first using the AutoQSAR results from the Machine Learning for Materials Science tutorial, then using a pre-defined machine learning (ML) model. The first case demonstrates the process of model training and utilization, while the second illustrates how to use the tools rapidly without training a model. We will then examine the results using the Optoelectronics Genetic Optimization Viewer panel.

Here is a schematic of the overall workflow:

2. Creating Projects and Importing Structures

At the start of the session, change the file path to your chosen Working Directory in MS Maestro to make file navigation easier. Each session in MS Maestro begins with a default Scratch Project, which is not saved. An MS Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is saved, it is automatically saved again each time a change is made.

Structures can be built in MS Maestro or imported using File > Import Structures (or by drag and drop), and are added to the Entry List and Project Table. The Entry List is located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.

  1. Double-click the Materials Science icon

Figure 2-1. Change Working Directory option.

  1. Go to File > Change Working Directory
  2. Find your directory, and click Choose
  3. Pre-generated files are included for running jobs or examining output. Download the zip file here: schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/genetic_optimization.zip
  4. After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial
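If you prefer to unpack the archive from a script rather than a file manager, a minimal sketch using only the Python standard library is shown here. The function name and the local file name in the usage comment are assumptions; adjust the paths to wherever you saved the download.

```python
import zipfile
from pathlib import Path


def unpack_tutorial_files(zip_path, working_directory):
    """Extract every member of zip_path into working_directory and return the member names."""
    destination = Path(working_directory)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
        return archive.namelist()


# Example call (assumes the zip file was saved in the current directory):
# unpack_tutorial_files("genetic_optimization.zip", ".")
```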

Figure 2-2. Save Project panel.

  1. Go to File > Save Project As
  2. Change the File name to genetic_optimization_tutorial, click Save
    • The project is now named genetic_optimization_tutorial.prj

Figure 2-3. The entry list after importing.

 

We will import a library of 230 TADF molecules:

  1. Go to File > Import Structures
  2. Navigate to where you downloaded the provided tutorial files, choose TADF_train_set.mae and click Open
    • A new entry group containing 230 entries is added to the Entry List

 

The 230 TADFs comprising the dataset have experimental ΔEST values in the range of 0.0-1.1. If interested, view these values in the Project Table.

3. Performing an Optoelectronics Genetic Optimization

In this section, we will use the series of TADF molecules with known ΔEST values as the initial population to generate hundreds of possible structures with the Optoelectronics Genetic Optimization panel.

Figure 3-1. Selecting the entire entry group.

  1. Select the entire train_set (230) entry group by clicking on the group header
    • Recall that selecting means highlighting the entries in the Entry List

Figure 3-2. The Optoelectronics Genetic Optimization panel.

  1. Go to Tasks > Materials > Quantum Mechanics > Optoelectronics > Genetic Optimization

 

Let’s explore the panel.

 

Generations section: specify the initial population and the number of generations.

 

Properties section: choose the properties to be optimized, and set optimization criteria. More than one property can be selected and a weight in the optimization can be specified. The properties include the usual optoelectronic properties calculated with the Optoelectronics Calculations panel, and some structure-based properties (molecular weight, number of atoms, number of elements). SMARTS patterns can also be specified, for which the property is the number of occurrences of matches to a pattern in a structure.
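To make the role of the weights concrete, here is a minimal sketch of one way several per-property scores could be combined into a single value. The weighted-average rule, the property names, and the numbers are illustrative assumptions; the panel's actual scoring formula is not described in this tutorial.

```python
def combined_score(scores, weights):
    """Weighted average of per-property scores (a hypothetical combination rule)."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Illustrative values only: E_st is weighted three times as heavily as molecular weight.
scores = {"E_st": 0.8, "molecular_weight": 0.5}
weights = {"E_st": 3.0, "molecular_weight": 1.0}
overall = combined_score(scores, weights)
print(overall)
```

With a rule like this, raising the weight of a property pulls the overall score toward that property's individual score.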

 

Genetic diversity section: select the actions taken to diversify the population by creating new individuals, and set how often they occur. The options are bond crossover, crossover rate, element mutation, isoelectronic mutation, fragment mutation, and mutation rate.

 

Bond crossover: Swap fragments from two structures.

 

Crossover rate: Specify the crossover rate as a percentage, which defines the frequency of crossover events.

 

Element mutation: Mutate an element in the structure to another element in the same group of the periodic table.
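As a small illustration of the same-group rule, the sketch below lists the candidate replacements for an element. The group table is limited to a few main-group columns and the function names are hypothetical; which elements the panel actually allows is not specified here.

```python
import random

# A few main-group columns of the periodic table (group number -> members).
SAME_GROUP = {
    14: ["C", "Si", "Ge"],
    15: ["N", "P", "As"],
    16: ["O", "S", "Se"],
    17: ["F", "Cl", "Br", "I"],
}


def element_mutation_candidates(element):
    """Return the other elements that share a group with `element` in the table above."""
    for members in SAME_GROUP.values():
        if element in members:
            return [other for other in members if other != element]
    return []


def mutate_element(element, rng):
    """Pick a random same-group replacement, or return the element unchanged if none exists."""
    candidates = element_mutation_candidates(element)
    return rng.choice(candidates) if candidates else element


print(element_mutation_candidates("O"))
print(mutate_element("N", random.Random(1)))
```

For example, an ether oxygen could become sulfur or selenium, while hydrogen has no candidates in this table and is left alone.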

 

Isoelectronic mutation: Mutate an element in the structure to another element in the same row of the periodic table, with addition or deletion of hydrogens to maintain the same number of electrons.
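The hydrogen bookkeeping can be sketched as follows, assuming a neutral heavy atom from the second row with a given number of attached hydrogens. The function is a hypothetical illustration that only conserves the electron count; it does not check valence or ring membership.

```python
# Atomic numbers of the second-row elements used in this sketch.
ATOMIC_NUMBER = {"C": 6, "N": 7, "O": 8, "F": 9}


def isoelectronic_mutation(element, n_hydrogens, new_element):
    """Swap the heavy atom and adjust the hydrogen count so that Z + n_H stays constant."""
    new_h = n_hydrogens + ATOMIC_NUMBER[element] - ATOMIC_NUMBER[new_element]
    if new_h < 0:
        raise ValueError(
            f"{element}H{n_hydrogens} has too few hydrogens to become {new_element}"
        )
    return new_element, new_h


print(isoelectronic_mutation("C", 2, "O"))  # a CH2 group mapped onto an oxygen atom
print(isoelectronic_mutation("N", 0, "C"))  # a bare nitrogen mapped onto a CH group
```

The familiar series CH2, NH, O (eight electrons each) falls out of this rule directly.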

 

Fragment mutation: Replace a fragment on a structure with a fragment selected at random from the specified fragment libraries. The fragment that is replaced is one that has a single acyclic bond that is not to a hydrogen atom.

 

Mutation rate: Specify the mutation rate as a percentage, which defines the frequency of mutations.
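One common way to interpret a rate given as a percentage is as the probability that the event is applied to a given candidate. The sketch below assumes that interpretation; the function name and the 80% value are illustrative, not panel defaults.

```python
import random


def event_occurs(rate_percent, rng):
    """Return True with probability rate_percent / 100."""
    return rng.random() < rate_percent / 100.0


rng = random.Random(42)
trials = 10_000
crossovers = sum(event_occurs(80, rng) for _ in range(trials))
print(crossovers / trials)  # fraction of trials in which a crossover was applied
```

Under this reading, a rate of 0 disables the operation entirely and a rate of 100 applies it every time.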

Figure 3-3. Selecting the property.

  1. Ensure that Project Table (230 selected entries) is selected for Use structures from
  2. Change the Maximum generations to 5
    • This specifies the maximum number of generations, after which the optimization stops
  3. Change the consecutive unproductive generations to 3
  4. Select Statistical > Define new AutoQSAR property in the Property option menu
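The two stopping settings can be pictured with a short sketch. It assumes that a generation counts as "unproductive" when it does not improve on the best score seen so far; that definition, the function name, and the score lists are assumptions made for illustration.

```python
def generations_run(best_scores, max_generations=5, max_unproductive=3):
    """Return how many generations are evaluated before either stopping rule triggers."""
    best = float("-inf")
    unproductive = 0
    for generation, score in enumerate(best_scores[:max_generations], start=1):
        if score > best:
            best = score
            unproductive = 0  # progress was made, so reset the counter
        else:
            unproductive += 1
        if unproductive >= max_unproductive:
            return generation  # stopped early: too many generations without improvement
    return min(len(best_scores), max_generations)  # stopped at the generation limit


print(generations_run([0.5, 0.4, 0.4, 0.3, 0.9]))            # stalls before the limit
print(generations_run([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]))  # reaches the limit
```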

Figure 3-4. Defining the AutoQSAR property.

  1. Change the Name to E_st
  2. Change the Units to eV
  3. Click Browse
  4. Navigate to where you downloaded the provided tutorial files, choose Section_03 > qsar_build_TADF > qsar_build_TADF.qzip

Figure 3-5. Closing the panel.

qsar_build_TADF.qzip is now the selected AutoQSAR model file. This qzip file contains the results from the Machine Learning for Materials Science tutorial.

 

  1. Click OK to close the panel

Figure 3-6. Starting the genetic optimization calculation.

  1. Set Evolution to Equals
    • The Evolution is the criterion on which the success of the mutation is assessed
  2. Set Target to 0
    • In this case the objective is to minimize the E_st value
  3. Uncheck the Isoelectronic mutation option
  4. Change the Job name to opto_ga_TADF
  5. Adjust the job settings as needed
    • This job requires a CPU host. The job can be completed in about 12 hours
  6. To run the job, click Run. To proceed with the provided output files instead, continue to the next steps.
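To see why an Equals criterion with a Target of 0 favors small E_st values, consider a score that grows as the predicted value approaches the target. The functional form below is a hypothetical stand-in chosen for illustration; the panel's actual scoring function is not given in this tutorial, and the candidate values are made up.

```python
def equals_score(value, target):
    """Hypothetical score: 1.0 at the target, decreasing as |value - target| grows."""
    return 1.0 / (1.0 + abs(value - target))


# Made-up E_st predictions (eV) for three candidate structures.
candidates = {"mol_a": 0.45, "mol_b": 0.05, "mol_c": 0.20}
ranked = sorted(candidates, key=lambda name: equals_score(candidates[name], 0.0), reverse=True)
print(ranked)  # candidates ordered from best to worst under this score
```

Any score with this monotonic behavior ranks candidates the same way: the closer E_st is to 0, the higher the candidate ranks.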

Figure 3-7. Selecting the property using the pre-defined ML model.

Now let’s set up the calculation using a pre-defined ML model:

  1. Select ML > Singlet-triplet energy gap in the Property option menu

Figure 3-8. Starting the genetic optimization calculation.

  1. Set Evolution to Equals
  2. Set Target to 0
    • All other settings will remain the same as before
  3. Change the Job name to opto_ga_ML_TADF
  4. Adjust the job settings as needed
    • This job requires a CPU host. The job can be completed in about 2 hours
  5. To run the job, click Run. To proceed with the provided output files instead, continue to the next steps.
  6. Close the Optoelectronics Genetic Optimization panel

Figure 3-9. Viewing the newly generated structures.

Let’s import the two result files:

  1. Go to File > Import Structures
  2. Navigate to where you downloaded the provided tutorial files, choose Section_03 > opto_ga_TADF > opto_ga_TADF-out.mae and Section_03 > opto_ga_ML_TADF > opto_ga_ML_TADF-out.mae

The entry list is updated with two new entry groups, each containing 1380 new structures.

4. Viewing the Optoelectronics Genetic Optimization Results

In this section, we will view the newly generated TADF molecules and plot their ΔEST values with the Optoelectronics Genetic Optimization Viewer panel.

Figure 4-1. Opening the Optoelectronics Genetic Optimization Viewer panel.

The steps below show the results of the opto_ga_TADF calculation. Repeat them for opto_ga_ML_TADF to compare the results of the two setups.

  1. In the Entry List, select and include opto_ga_TADF-out
  2. Go to Tasks > Materials > Quantum Mechanics > Optoelectronics > Genetic Optimization Monitoring or use the Workflow Action Menu (WAM) button which appears next to the Entry List
    • The Optoelectronics Genetic Optimization Viewer panel opens
    • This panel is used to view the results of a completed genetic optimization job or the progress of a running job. Because these jobs can take a long time, it may be useful to examine the results while a job is running, for example to assess the progress or success of the optimization, or to make use of structures that meet some criteria before the job finishes. The panel offers several ways of assessing the progress of the optimization.
  3. Click the Min/Max Plot tab

Figure 4-2. The Min/Max Plot.

In the Min/Max Plot tab, the minimum and maximum values of any property are displayed as a function of generation. For the scores, if the maximum is not increasing, the optimization is not making progress. If the maximum is not increasing much while the spread between the maximum and the minimum is increasing, the process is generating structures whose properties are less desirable.
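The quantities behind such a plot are simple to compute. The sketch below assumes the results are available as (generation, score) pairs; the record layout, function name, and values are hypothetical and do not reflect the job's actual output format.

```python
from collections import defaultdict


def min_max_by_generation(records):
    """Group (generation, value) pairs and return {generation: (minimum, maximum)}."""
    grouped = defaultdict(list)
    for generation, value in records:
        grouped[generation].append(value)
    return {gen: (min(values), max(values)) for gen, values in sorted(grouped.items())}


# Made-up scores for two generations.
records = [(0, 0.2), (0, 0.5), (1, 0.1), (1, 0.7)]
summary = min_max_by_generation(records)
for generation, (low, high) in summary.items():
    print(generation, low, high, high - low)  # generation, min, max, spread
```

A maximum that rises from one generation to the next indicates progress; a growing spread with a flat maximum indicates that the new structures are mostly worse.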

 

  1. Click the Histogram tab

Figure 4-3. The histogram results.

The Histogram tab shows the number of structures whose values fall in each range, color coded by generation. From this view, one can assess whether the number of structures in the desirable range is increasing with each generation.
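The counting behind such a histogram can be sketched as follows, again assuming hypothetical (generation, value) pairs and a fixed bin width; the values are made up for illustration.

```python
from collections import Counter


def histogram_by_generation(records, bin_width):
    """Count structures per (generation, bin index), where bin i covers [i*w, (i+1)*w)."""
    counts = Counter()
    for generation, value in records:
        bin_index = int(value // bin_width)
        counts[(generation, bin_index)] += 1
    return counts


# Made-up E_st values (eV); bin 0 covers 0.00 to 0.25, the desirable range here.
records = [(0, 0.6), (0, 0.3), (1, 0.1), (1, 0.05), (1, 0.3)]
counts = histogram_by_generation(records, bin_width=0.25)
print(counts[(0, 0)], counts[(1, 0)])  # structures in the lowest bin, per generation
```

Comparing the lowest-bin count across generations answers the question posed above: is the desirable range filling up as the optimization proceeds?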

 

  1. Click the Evolution tab

Figure 4-4. The evolution results.

In the Evolution tab, the family tree of any particular individual in any generation can be viewed. The tree is marked with green lines to the parents and blue lines to the children, and the individuals are colored by a property value. This shows how the property value is changing between generations. The properties of the family (parents, individual, children) are shown in the Structure Window so all properties can be examined, not just the one used for coloring.
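A family tree of this kind can be stored as a mapping from each individual to its parents, from which the children follow by inverting the mapping. The identifiers and data layout below are hypothetical and are not the panel's internal format.

```python
def children_of(parents_by_child):
    """Invert a child -> parents mapping into a parent -> children mapping."""
    children = {}
    for child, parents in parents_by_child.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    return children


def family(individual, parents_by_child):
    """Return the parents and children of one individual."""
    children = children_of(parents_by_child)
    return {
        "parents": list(parents_by_child.get(individual, ())),
        "individual": individual,
        "children": children.get(individual, []),
    }


# Made-up lineage: g1_x comes from a crossover of two parents, g2_y from a mutation of g1_x.
parents_by_child = {"g1_x": ["g0_a", "g0_b"], "g2_y": ["g1_x"]}
print(family("g1_x", parents_by_child))
```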

  1. Close the Optoelectronics Genetic Optimization Viewer panel

Figure 4-5. Opening the Project Table.

The ΔEST values for the new structures, as predicted by AutoQSAR, can be viewed in the Project Table.

 

  1. Open the Project Table by clicking on the Table icon in the top right corner of the toolbar

Figure 4-6. Viewing the Project Table.

  1. Use the Property Tree to navigate to Materials Science > Secondary > Auto QSAR E st/eV and select the checkbox

 

Feel free to include any other property of interest or to generate various plots using Window > Manage Charts.

 

 

Using the plotting tools in the Project Table, we plot the Auto QSAR E st vs. the raw score. We see that the raw score increases as the ΔEST values approach 0.0, which gives us confidence that the optimization procedure was successful.

5. Conclusion and References

In this tutorial, we learned how to generate new virtual TADF molecules by mutating an initial TADF input population with a genetic algorithm calculation.

For further learning:

For introductory content, focused on navigating the Schrödinger Materials Science interface, an Introduction to Materials Science Maestro tutorial is available. Please visit the materials science training website for access to 100+ tutorials. For scientific inquiries or technical troubleshooting, submit a ticket to our Technical Support Scientists at help@schrodinger.com.

For self-paced, asynchronous, online courses in Materials Science modeling, including access to Schrödinger software, please visit the Schrödinger Online Learning portal on our website.


6. Glossary of Terms

Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion

Included - the entry is represented in the Workspace, the circle in the In column is blue

Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data

Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)

Scratch Project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project

Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries

Working Directory - the location where files are saved

Workspace - the 3D display area in the center of the main window, where molecular structures are displayed