Genetic Optimization
Tutorial Created with Software Release: 2026-1
Topics: Organic Electronics
Methodology: Machine Learning
Products Used: GA Optoelectronics , MS Maestro
This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms are shown like this: Workspace
Abstract:
In this tutorial, we will learn to generate new structures for which a chosen set of optoelectronic properties is optimized by mutating the structures with a genetic algorithm.
Tutorial Content
1. Introduction to Genetic Optimization
A genetic optimization takes two or more input structures and performs two basic operations (crossover and mutation) on each generation. A crossover takes two parent structures and creates two child structures by cutting up the parent molecules and piecing the fragments back together. This operation does not introduce any new information into the population; it only shuffles the existing genes. Mutation, on the other hand, introduces external information into the population.
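To make the two operations concrete, here is a minimal Python sketch. It is not the implementation used by the panel: each individual is represented as a plain list of hypothetical fragment labels, and the function names `crossover` and `mutate` are chosen for illustration only.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Cut both parents at a random point and swap the tails."""
    cut_a = rng.randint(1, len(parent_a) - 1)
    cut_b = rng.randint(1, len(parent_b) - 1)
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    return child_1, child_2

def mutate(individual, gene_pool, rng):
    """Replace one randomly chosen gene with a gene from an external pool."""
    position = rng.randrange(len(individual))
    mutant = list(individual)  # copy so the parent is left unchanged
    mutant[position] = rng.choice(gene_pool)
    return mutant

rng = random.Random(0)
parent_a = ["A1", "A2", "A3", "A4"]  # hypothetical fragment labels
parent_b = ["B1", "B2", "B3", "B4"]
external_pool = ["X", "Y", "Z"]      # genes not present in the population

child_1, child_2 = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, external_pool, rng)
print(child_1, child_2, mutant)
```

The two children together contain exactly the genes of the two parents, which is why crossover alone cannot introduce new chemistry, whereas the mutant carries one gene drawn from outside the population.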
The Optoelectronics Genetic Optimization panel in Materials Science (MS) Maestro identifies new structures for which a chosen set of optoelectronic properties is optimized by mutating the structures with a genetic algorithm. Each structure is considered an individual in a population, and is mutated by changing aspects of its chemistry into another structure with more desirable properties. In this sense, it is a discovery tool rather than an optimization tool. In addition to the standard optimization (for molecules), the same approach can be applied to monomers which can then be used to construct polymers.
In the Machine Learning for Materials Science tutorial, we learned about AutoQSAR, a tool for automated creation, validation, and application of QSPR models following a best practices approach. We used AutoQSAR to build and rank order numerical QSPR models, visualize atomic contributions to property predictions, and use these models to make predictions on new, unseen datasets.
This tutorial leverages the dataset from the Machine Learning for Materials Science tutorial as our initial population in order to optimize the singlet-triplet splitting energy (ΔEST) for a set of thermally activated delayed fluorescence (TADF) molecules.
In this tutorial, we will learn how to perform a genetic optimization using the Optoelectronics Genetic Optimization panel to generate a new, diverse set of structures. We will perform the calculation twice: first using the AutoQSAR results from the Machine Learning for Materials Science tutorial, then using a pre-defined machine learning (ML) model. The first case demonstrates the process of model training and utilization, while the second illustrates how to use the tools rapidly without training a model. We will then examine the results using the Optoelectronics Genetic Optimization Viewer panel.
Here is a schematic of the overall workflow:
2. Creating Projects and Importing Structures
At the start of the session, change the file path to your chosen Working Directory to make file navigation easier. Each session in MS Maestro begins with a default Scratch Project, which is not saved. An MS Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is saved, it is automatically saved again each time a change is made.
Structures can be built in MS Maestro or imported using File > Import Structures (or by drag and drop), and are added to the Entry List and Project Table. The Entry List is located to the left of the Workspace. The Project Table can be accessed with Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.
- Double-click the Materials Science icon
- (No icon? See Starting Maestro)
- Go to File > Change Working Directory
- Find your directory, and click Choose
- Pre-generated files are included for running jobs or examining output. Download the zip file here: schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/genetic_optimization.zip
- After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial
- Go to File > Save Project As
- Change the File name to genetic_optimization_tutorial, click Save
- The project is now named genetic_optimization_tutorial.prj
We will import a library of 230 TADF molecules:
- Go to File > Import Structures
- Navigate to where you downloaded the provided tutorial files, choose TADF_train_set.mae and click Open
- A new entry group containing 230 entries is added to the entry list
The 230 TADFs comprising the dataset have experimental ΔEST values in the range of 0.0-1.1 eV. If interested, view these values in the Project Table.
3. Performing an Optoelectronics Genetic Optimization
In this section, we will use the series of TADF molecules with known ΔEST values as the initial population to generate hundreds of possible structures with the Optoelectronics Genetic Optimization panel.
- Select the entire train_set (230) entry group by clicking on the group header
- Recall that selecting means highlighting the entries in the entry list
- Go to Tasks > Materials > Quantum Mechanics > Optoelectronics > Genetic Optimization
- The Optoelectronics Genetic Optimization panel opens
Let’s explore the panel.
Generations section: specify the initial population and the number of generations.
Properties section: choose the properties to be optimized, and set optimization criteria. More than one property can be selected and a weight in the optimization can be specified. The properties include the usual optoelectronic properties calculated with the Optoelectronics Calculations panel, and some structure-based properties (molecular weight, number of atoms, number of elements). SMARTS patterns can also be specified, for which the property is the number of occurrences of matches to a pattern in a structure.
Genetic diversity section: select the actions that are taken to diversify the population, creating new individuals. These actions are: bond crossover, crossover rate, element mutation, isoelectronic mutation, fragment mutation, and mutation rate.
Bond crossover: Swap fragments from two structures.
Crossover rate: Specify the crossover rate as a percentage, which defines the frequency of crossover events.
Element mutation: Mutate an element in the structure to another element in the same group of the periodic table.
Isoelectronic mutation: Mutate an element in the structure to another element in the same row of the periodic table, with addition or deletion of hydrogens to maintain the same number of electrons.
Fragment mutation: Replace a fragment on a structure with a fragment selected at random from the specified fragment libraries. The fragment that is replaced is one that has a single acyclic bond that is not to a hydrogen atom.
Mutation rate: Specify the mutation rate as a percentage, which defines the frequency of mutations.
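The difference between element mutation and isoelectronic mutation can be illustrated with a small Python sketch. This is a simplified picture, not the panel's implementation: an atom is described only by its element symbol and attached hydrogen count, the lookup tables are deliberately partial, and the function names are hypothetical.

```python
# Partial periodic-table lookups, for illustration only.
GROUPS = {
    14: ["C", "Si", "Ge"],
    15: ["N", "P", "As"],
    16: ["O", "S", "Se"],
    17: ["F", "Cl", "Br"],
}
ATOMIC_NUMBER = {"C": 6, "N": 7, "O": 8, "F": 9}  # second-row elements only

def element_mutations(symbol):
    """Return the other elements in the same group of the periodic table."""
    for members in GROUPS.values():
        if symbol in members:
            return [m for m in members if m != symbol]
    return []

def isoelectronic_mutations(symbol, n_hydrogens):
    """Return (element, hydrogen count) pairs from the same row that keep
    the total electron count of the atom plus its hydrogens unchanged."""
    total_electrons = ATOMIC_NUMBER[symbol] + n_hydrogens
    results = []
    for other, z in ATOMIC_NUMBER.items():
        hydrogens_needed = total_electrons - z
        if other != symbol and hydrogens_needed >= 0:
            results.append((other, hydrogens_needed))
    return results

print(element_mutations("O"))           # same group: sulfur, selenium
print(isoelectronic_mutations("C", 2))  # CH2 -> NH or O
print(isoelectronic_mutations("N", 0))  # N -> CH
```

In this picture, an element mutation turns an ether oxygen into a thioether sulfur, while an isoelectronic mutation turns a CH2 unit into NH or O by removing hydrogens as the atomic number increases.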
- Ensure that Project Table (230 selected entries) is selected for Use structures from
- Change the Maximum generations to 5
- This specifies the maximum number of generations, after which the optimization stops
- Change the consecutive unproductive generations to 3
- Select Statistical > Define new AutoQSAR property in the Property option menu
- Change the Name to E_st
- Change the Units to eV
- Click Browse
- Navigate to where you downloaded the provided tutorial files, choose Section_03 > qsar_build_TADF > qsar_build_TADF.qzip
- qsar_build_TADF.qzip is the selected AutoQSAR model file. This qzip file contains the results from the Machine Learning for Materials Science tutorial
- Click OK to close the panel
- Set Evolution to Equals
- The Evolution is the criterion on which the success of the mutation is assessed
- Set Target to 0
- In this case the objective is to minimize the E_st value
- Uncheck the Isoelectronic mutation option
- Change the Job name to opto_ga_TADF
- Adjust the job settings as needed
- This job requires a CPU host. The job can be completed in about 12 hours
- If running the job, click Run. If you would prefer to continue with the provided files, proceed to the next steps.
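The settings above (an Equals criterion with a Target of 0, a maximum of 5 generations, and 3 consecutive unproductive generations) can be pictured with the following Python sketch. It rests on assumptions that the tutorial does not state: that an individual is ranked by its absolute distance from the target, and that a generation counts as unproductive when the best distance found so far does not improve. The function names are hypothetical.

```python
def distance_to_target(value, target=0.0):
    """Smaller is better: how far a predicted property is from the target."""
    return abs(value - target)

def stopping_generation(best_values, max_generations=5, max_unproductive=3, target=0.0):
    """Given the best property value of each generation, return the generation
    at which the run would stop and the reason for stopping.

    Assumes best_values holds at least max_generations entries."""
    best_distance = None
    unproductive = 0
    generation = 0
    for generation, value in enumerate(best_values[:max_generations], start=1):
        distance = distance_to_target(value, target)
        if best_distance is None or distance < best_distance:
            best_distance = distance
            unproductive = 0  # progress was made, so reset the counter
        else:
            unproductive += 1
        if unproductive >= max_unproductive:
            return generation, "unproductive"
    return generation, "max_generations"

# Hypothetical best E_st values (eV) per generation.
improving = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
stalled = [0.3, 0.3, 0.3, 0.3, 0.1]
print(stopping_generation(improving))
print(stopping_generation(stalled))
```

Under these assumptions, a value of -0.2 eV is no better than +0.2 eV, which is the practical difference between targeting a value of 0 and simply pushing the property as low as possible.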
Now let’s set up the calculation using a pre-defined ML model:
- Select ML > Singlet-triplet energy gap in the Property option menu
- Set Evolution to Equals
- Set Target to 0
- All other settings will remain the same as before
- Change the Job name to opto_ga_ML_TADF
- Adjust the job settings as needed
- This job requires a CPU host. The job can be completed in about 2 hours
- If running the job, click Run. If you would prefer to continue with the provided files, proceed to the next steps.
- Close the Optoelectronics Genetic Optimization panel
Let’s import the two result files:
- Go to File > Import Structures
- Navigate to where you downloaded the provided tutorial files, choose Section_03 > opto_ga_TADF > opto_ga_TADF-out.mae and Section_03 > opto_ga_ML_TADF > opto_ga_ML_TADF-out.mae
The entry list is updated with two new entry groups, each containing 1380 new structures.
4. Viewing the Optoelectronics Genetic Optimization Results
In this section, we will view the newly generated TADF molecules and plot their ΔEST values with the Optoelectronics Genetic Optimization Viewer panel.
Only the opto_ga_TADF results are shown in this section, but we encourage you to compare the results of both setups.
- In the entry list, select and include opto_ga_TADF-out
- Go to Tasks > Materials > Quantum Mechanics > Optoelectronics > Genetic Optimization Monitoring or use the Workflow Action Menu (WAM) button, which appears next to the entry list
- The Optoelectronics Genetic Optimization Viewer panel opens
- This panel is used to view the results of a completed genetic optimization job or the progress of a running job. As the jobs can take a long time, it may be useful to examine the results while a job is running, for example to assess the progress or success of the optimization, or to make use of structures that meet some criteria before the job finishes. The panel offers several ways of assessing the progress of the optimization.
- Click the Min/Max Plot tab
The Min/Max Plot tab displays the minimum and maximum values of any property as a function of generation. For the scores, if the maximum is not increasing, then the optimization is not making progress. If the maximum is not increasing much and the spread between the maximum and the minimum is increasing, the process is generating structures whose properties are less desirable.
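The quantity behind this kind of plot is simple to reproduce outside the panel. The Python sketch below groups hypothetical (generation, score) pairs and reports the minimum, maximum, and spread for each generation; the data and the function name `min_max_by_generation` are illustrative and are not taken from the job output.

```python
from collections import defaultdict

def min_max_by_generation(records):
    """Group (generation, value) pairs and return {generation: (min, max)}."""
    grouped = defaultdict(list)
    for generation, value in records:
        grouped[generation].append(value)
    return {g: (min(v), max(v)) for g, v in sorted(grouped.items())}

# Hypothetical (generation, score) pairs.
records = [(1, 0.42), (1, 0.10), (2, 0.35), (2, 0.05), (3, 0.50), (3, 0.05)]
summary = min_max_by_generation(records)
for generation, (lowest, highest) in summary.items():
    spread = round(highest - lowest, 2)
    print(generation, lowest, highest, spread)
```

A maximum that stalls while the spread keeps growing corresponds to the situation described above, where new generations mostly add less desirable structures.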
- Click the Histogram tab
The Histogram tab displays the number of structures that fall within each range of values, color coded by generation. From this view, one can assess whether the number of structures in the desirable range is increasing with each generation.
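As a rough sketch of the counting behind such a histogram, the Python snippet below bins hypothetical E_st values per generation. The bin edges, data, and function name are assumptions made for illustration, not values used by the panel.

```python
from bisect import bisect_right
from collections import Counter

BIN_EDGES = [0.0, 0.1, 0.2, 0.5, 1.1]  # hypothetical bin edges in eV

def histogram_by_generation(records, edges=BIN_EDGES):
    """Count structures per (generation, bin index).

    Bin i spans edges[i] <= value < edges[i + 1]. Values below the first
    edge get index -1 and values at or above the last edge get the final index."""
    counts = Counter()
    for generation, value in records:
        bin_index = bisect_right(edges, value) - 1
        counts[(generation, bin_index)] += 1
    return counts

# Hypothetical (generation, E_st) pairs.
records = [(1, 0.42), (1, 0.15), (1, 0.07), (2, 0.08), (2, 0.03), (2, 0.31)]
counts = histogram_by_generation(records)
in_lowest_bin = {g: counts[(g, 0)] for g in (1, 2)}
print(in_lowest_bin)
```

Here the lowest bin (0.0 to 0.1 eV) is the desirable range for a target of 0, so a count that grows from one generation to the next indicates that the optimization is moving in the right direction.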
- Click the Evolution tab
In the Evolution tab, the family tree of any particular individual in any generation can be viewed. The tree is marked with green lines to the parents and blue lines to the children, and the individuals are colored by a property value. This shows how the property value is changing between generations. The properties of the family (parents, individual, children) are shown in the Structure Window so all properties can be examined, not just the one used for coloring.
- Close the Optoelectronics Genetic Optimization Viewer panel
The ΔEST values for the new structures determined by AutoQSAR can be viewed in the project table.
- Open the Project Table by clicking on the Table icon in the top right corner of the toolbar
5. Conclusion and References
In this tutorial, we learned how to generate new virtual TADF molecules by mutating an initial TADF input population with a genetic algorithm calculation.
For further learning:
For introductory content, focused on navigating the Schrödinger Materials Science interface, an Introduction to Materials Science Maestro tutorial is available. Please visit the materials science training website for access to 100+ tutorials. For scientific inquiries or technical troubleshooting, submit a ticket to our Technical Support Scientists at help@schrodinger.com.
For self-paced, asynchronous, online courses in Materials Science modeling, including access to Schrödinger software, please visit the Schrödinger Online Learning portal on our website.
For some related practice, proceed to explore other relevant tutorials:
- Machine Learning for Materials Science
- Optoelectronics Active Learning
- Machine Learning Property Prediction
- Molecular Dynamics Descriptors for Machine Learning
- Optoelectronics
- Kinetic Monte Carlo (KMC) Charge Mobility
- Band Shape
- Excited State Analysis
- Calculating Transition Dipole Moments (TDM), TDM Distributions, and Order Parameter
- Singlet Excitation Energy Transfer
For further reading:
- See the help documentation on the Optoelectronics Genetic Optimization and Optoelectronics Genetic Optimization Viewer panels
- AutoQSAR help documentation
- Design of Organic Electronic Materials With a Goal-Directed Generative Model Powered by Deep Neural Networks and High-Throughput Molecular Simulations. DOI:10.3389/fchem.2021.800370
- Active Learning Accelerates Design and Optimization of Hole-Transporting Materials for Organic Electronics. DOI:10.3389/fchem.2021.800371
- Accelerated design and optimization of OLED materials via active learning. DOI:10.1117/12.2598140
- DeepAutoQSAR Hardware Benchmark (Schrödinger white paper)
6. Glossary of Terms
Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion
Included - the entry is represented in the Workspace, the circle in the In column is blue
Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data
Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)
Scratch Project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project
Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries
Working Directory - the location where files are saved
Workspace - the 3D display area in the center of the main window, where molecular structures are displayed