qm_descriptors.py: Generate and Extract Descriptor Sets
The script qm_descriptors.py provides a set of descriptors for one or more structures in the form of a .mae, .sdf, or .csv file. The descriptors are obtained either by parsing existing Jaguar .out files or by generating a set of Jaguar jobs and parsing the resulting .out files. These two methods of operation are invoked with the command line options -outs and -maes respectively and are detailed below. The descriptors are written to one file per .out file parsed as well as an aggregate file with all structures in it called all_calcs.props.ext.
The command syntax for -outs is:
jaguar run qm_descriptors.py -outs outfile1 outfile2 ... [options]
There are two available options when using -outs: -props and -formats. The -props option specifies the properties to harvest from the passed Jaguar .out files; the syntax is -props prop1 prop2 ... The properties are listed in the tables below. The -formats option selects the format or formats for writing out the properties (one or more of Maestro, SDF, and CSV), with the syntax -formats {mae|sdf|csv} [{mae|sdf|csv}...]. If neither -props nor -formats is provided, reasonable defaults are used (see the OutputFormats and Properties sections of the example config file provided below).
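The -outs syntax described above can be assembled programmatically, for example when many output files are processed in a batch. The following is a minimal sketch, not part of the script itself: the helper name build_outs_command and the file names are hypothetical, and the sketch only builds the argument list without running anything.

```python
import shlex

# Output formats named in the text above.
ALLOWED_FORMATS = ("mae", "sdf", "csv")


def build_outs_command(out_files, props=None, formats=None):
    """Return the argument list for the -outs mode (hypothetical helper)."""
    if not out_files:
        raise ValueError("at least one Jaguar .out file is required")
    cmd = ["jaguar", "run", "qm_descriptors.py", "-outs", *out_files]
    if props:
        # Property keywords come from the tables in this document.
        cmd += ["-props", *props]
    if formats:
        bad = [f for f in formats if f not in ALLOWED_FORMATS]
        if bad:
            raise ValueError(f"unsupported format(s): {bad}")
        cmd += ["-formats", *formats]
    return cmd


cmd = build_outs_command(
    ["water.out", "methanol.out"],  # hypothetical file names
    props=["charge_esp", "homo", "lumo"],
    formats=["sdf", "csv"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run in an environment where the jaguar launcher is on the path. Omitting props and formats leaves the script to fall back on its defaults.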
The command syntax for -maes is:
jaguar run qm_descriptors.py -maes maefile [options]
where maefile is a single Maestro file. This file can have multiple structures, all of which are used to generate descriptors. The only available option when using -maes is -configs, which controls the settings of the generated Jaguar jobs as well as the properties that are harvested and the format used for output. If -configs is not provided, reasonable defaults are used (given below). The syntax of the option is -config configfile. The config file format is a Python-digestible YAML file. An example is given below, showing the default settings. The allowed settings for the OutputFormats and Properties sections are the same as in the -formats and -props options when using -outs.
The BasisSets, Functionals, and GenKeyvals sections define the settings for the generated Jaguar jobs. For BasisSets and Functionals, you can use any basis set or functional defined in Jaguar, and you can specify multiple basis sets and functionals. Each combination of a basis set and a functional is used with each structure, resulting in multiple subjobs. The GenKeyvals section takes any legal Jaguar gen section keyword: value pair. The values of the charge and multiplicity (molchg and multip) specified in this section can be overridden by values set in the Maestro input file. The Properties section defines the properties to extract from the output of the generated Jaguar jobs. The allowed values of the property keywords are listed in the tables below, and they do not take any arguments. The example below shows one property keyword, all_descriptor_numeric, which you could replace with a different keyword from any of the tables. You can add as many properties as you want, in the same format, one per line.
OutputFormats:
  - sdf
BasisSets:
  - LACVP*
Functionals:
  - b3lyp-d3
GenKeyvals:
  mulken: 2
  ldips: 5
  ipolar: -2
  nmr: 1
  fukui: 1
  esp_analysis: 1
  epn: 1
  nbo: 1
  ifreq: 1
  icfit: 1
Properties:
  - all_descriptor_numeric
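
To illustrate how the config sections fit together, the sketch below holds the default settings in a Python dictionary, writes them out in the config layout shown above, and enumerates the basis set and functional combinations per structure. It is an illustration under assumptions: the helpers to_config_text and enumerate_subjobs are hypothetical, the subjob count reflects one reading of "each combination is used with each structure", and the hand-written emitter handles only this flat layout (a YAML library would be the usual choice).

```python
from itertools import product

# Default settings taken from the example config above.
config = {
    "OutputFormats": ["sdf"],
    "BasisSets": ["LACVP*"],
    "Functionals": ["b3lyp-d3"],
    "GenKeyvals": {
        "mulken": 2, "ldips": 5, "ipolar": -2, "nmr": 1, "fukui": 1,
        "esp_analysis": 1, "epn": 1, "nbo": 1, "ifreq": 1, "icfit": 1,
    },
    "Properties": ["all_descriptor_numeric"],
}


def to_config_text(cfg):
    """Emit the flat sections as YAML text (lists and one-level mappings only)."""
    lines = []
    for section, value in cfg.items():
        lines.append(f"{section}:")
        if isinstance(value, dict):
            lines += [f"  {key}: {val}" for key, val in value.items()]
        else:
            lines += [f"  - {item}" for item in value]
    return "\n".join(lines) + "\n"


def enumerate_subjobs(cfg, n_structures):
    """List (structure index, basis set, functional) for every combination."""
    combos = list(product(cfg["BasisSets"], cfg["Functionals"]))
    return [(i, basis, func) for i in range(n_structures) for basis, func in combos]


# Adding a second basis set and functional (illustrative choices) multiplies the work.
bigger = dict(config, BasisSets=["LACVP*", "6-31G**"], Functionals=["b3lyp-d3", "m06-2x"])
print(to_config_text(config))
print(len(enumerate_subjobs(bigger, n_structures=3)))
```

With three structures, two basis sets, and two functionals, the enumeration yields twelve entries, which is a quick way to estimate the cost of a config before submitting it.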

| Keyword | Description |
|---|---|
| atom_name | Atom Name |
| charge_esp | Atomic Charges from ESP |
| charge_lowdin | Lowdin Atomic Charges |
| charge_mulliken | Mulliken Atomic Charges |
| charge_nbo | Atomic Charges from NBO |
| charge_stockholder | Stockholder Atomic Charges |
| epn | Electrostatic Potential at Atomic Nuclei |
| forces | Atomic Forces |
| homo_nn | Atomic Fukui Indices, f_NN HOMO |
| homo_ns | Atomic Fukui Indices, f_NS HOMO |
| homo_sn | Atomic Fukui Indices, f_SN HOMO |
| homo_ss | Atomic Fukui Indices, f_SS HOMO |
| lumo_nn | Atomic Fukui Indices, f_NN LUMO |
| lumo_ns | Atomic Fukui Indices, f_NS LUMO |
| lumo_sn | Atomic Fukui Indices, f_SN LUMO |
| lumo_ss | Atomic Fukui Indices, f_SS LUMO |
| maxat_alie | Max Atomic ALIE Values |
| maxat_esp | Max Atomic ESP Values |
| minat_alie | Min Atomic ALIE Values |
| minat_esp | Min Atomic ESP Values |
| nmr_2d_avg_shift | NMR 2D-Averaged Relative Shifts |
| nmr_abs_shift | NMR Atomic Absolute Shifts |
| nmr_h_avg_shift | NMR H-Averaged Relative Shifts |
| nmr_rel_shift | NMR Atomic Relative Shifts |
| nmr_shielding | NMR Isotropic Shielding per Atom |
| spin_lowdin | Lowdin Spin Densities |
| spin_mulliken | Mulliken Spin Densities |

| Keyword | Description |
|---|---|
| GTotal | Total Gibbs Free Energy (HTotal - T*S), in Hartrees |
| HTotal | Total Enthalpy (UTotal + pV), in Hartrees |
| S_min_eval | Minimum value of S (overlap matrix) |
| UTotal | Total Internal Energy (SCFE + ZPE + U), in Hartrees |
| ani_energy | neural network potential energy, in Hartree |
| ani_stddev | standard deviation in prediction of neural network energy, in Hartree |
| balance_alie | ALIE balance on isodensity surface |
| balance_esp | ESP balance on isodensity surface |
| bond_midpoint_charge | Bond-Midpoint Charges Calculated in ESP Fitting |
| canonical_orbitals | Number of canonical orbitals |
| dipole_strength | Dipole Strengths of Normal Modes |
| dipolecomp_esp | Dipole Moment Components Calc'd from Electrostatic Potential Charges, in Debye |
| dipolecomp_mulliken | Dipole Moment Components Calc'd from Mulliken Charges, in Debye |
| dipolecomp_qm | Dipole Moment Components Calc'd from Wavefunction, in Debye |
| dipolemag_esp | Dipole Moment Magnitude Calc'd from Electrostatic Potential Charges, in Debye |
| dipolemag_mulliken | Dipole Moment Magnitude Calc'd from Mulliken Charges, in Debye |
| dipolemag_qm | Dipole Moment Magnitude Calc'd from Wavefunction, in Debye |
| doubted_geom | Indicates a geometry step was not expected to be good |
| energy_aposteri | a posteriori correction to the total energy (component (N0) in SCF summary), in Hartree |
| energy_aposteri0 | Uncorrected energy in the case of a posteriori-corrected calculations (energy - energy_aposteri), in Hartree |
| energy_electronic | Total electronic energy (component (L) in SCF summary), in Hartree |
| energy_one_electron | Total one-electron energy (component (E) in SCF summary), in Hartree |
| energy_two_electron | Total two-electron energy (component (I) in SCF summary), in Hartree |
| enthalpy | Total Calculated Enthalpy |
| enthalpy_elec | Electronic Contribution to Enthalpy |
| enthalpy_rot | Rotational Contribution to Enthalpy |
| enthalpy_trans | Translational Contribution to Enthalpy |
| enthalpy_vib | Vibrational Contribution to Enthalpy |
| entropy | Total Calculated Entropy |
| entropy_elec | Electronic Contribution to Entropy |
| entropy_rot | Rotational Contribution to Entropy |
| entropy_trans | Translational Contribution to Entropy |
| entropy_vib | Vibrational Contribution to Entropy |
| et_H_if | Hamiltonian of initial to final state in e- transfer |
| et_H_ii | Hamiltonian of initial state in e- transfer |
| et_S_if | Overlap of initial and final state wfns in e- transfer |
| et_T_if | e- transfer transition energy |
| excitation_energies | Excitation energies, in eV |
| external_program_energy | Energy produced by external program, in Hartree |
| force_constant | Force Constants of Normal Modes |
| frequency | Frequencies of Normal Modes, in cm^-1 |
| gas_phase_energy | Gas Phase Energy, in Hartree |
| gibbs_free_energy | Total Calculated Gibbs Free Energy |
| gibbs_free_energy_elec | Electronic Contribution to Gibbs Free Energy |
| gibbs_free_energy_rot | Rotational Contribution to Gibbs Free Energy |
| gibbs_free_energy_trans | Translational Contribution to Gibbs Free Energy |
| gibbs_free_energy_vib | Vibrational Contribution to Gibbs Free Energy |
| heat_capacity | Total Calculated Heat Capacity |
| heat_capacity_elec | Electronic Contribution to Heat Capacity |
| heat_capacity_rot | Rotational Contribution to Heat Capacity |
| heat_capacity_trans | Translational Contribution to Heat Capacity |
| heat_capacity_vib | Vibrational Contribution to Heat Capacity |
| homo | HOMO energy (set to None for open-shell calcs), in Hartree |
| homo_alpha | Alpha HOMO energy (set to None for closed-shell calcs), in Hartree |
| homo_beta | Beta HOMO energy (set to None for closed-shell calcs), in Hartree |
| homo_lumo_gap | HOMO-LUMO Gap energy. Calculated as lower of same-spin orbital differences in unrestricted calcs, in Hartree |
| internal_energy | Total Calculated Internal Energy |
| internal_energy_elec | Electronic Contribution to Internal Energy |
| internal_energy_rot | Rotational Contribution to Internal Energy |
| internal_energy_trans | Translational Contribution to Internal Energy |
| internal_energy_vib | Vibrational Contribution to Internal Energy |
| ir_intensity | IR Intensities of Normal Modes |
| lambdamax_ev | Excitation energy of state with highest oscillator strength, in eV |
| lambdamax_nm | Excitation wavelength of state with highest oscillator strength, in nm |
| lmp2_energy | LMP2 Energy, in Hartree |
| lnq | Total Calculated lnQ |
| lnq_elec | Electronic Contribution to lnQ |
| lnq_rot | Rotational Contribution to lnQ |
| lnq_trans | Translational Contribution to lnQ |
| lnq_vib | Vibrational Contribution to lnQ |
| local_pol_alie | Avg deviation from mean ALIE on isodensity surface |
| local_pol_esp | Local polarity on isodensity surface |
| lumo | LUMO energy (set to None for open-shell calcs), in Hartree |
| lumo_alpha | Alpha LUMO energy (set to None for closed-shell calcs), in Hartree |
| lumo_beta | Beta LUMO energy (set to None for closed-shell calcs), in Hartree |
| max_alie | Maximum ALIE value on isodensity surface |
| max_esp | Maximum ESP value on isodensity surface |
| mean_alie | Mean ALIE value on isodensity surface |
| mean_esp | Mean ESP value on isodensity surface |
| mean_neg_alie | Mean negative ALIE value on isodensity surface |
| mean_neg_esp | Mean negative ESP value on isodensity surface |
| mean_pos_alie | Mean positive ALIE value on isodensity surface |
| mean_pos_esp | Mean positive ESP value on isodensity surface |
| min_alie | Minimum ALIE value on isodensity surface |
| min_esp | Minimum ESP value on isodensity surface |
| nops_on | Indicates a NOPS calculation |
| nuclear_repulsion | Nuclear Repulsion Energy, in Hartree |
| opt_excited_state_energy_1 | Energy of first excited state geometry optimization |
| orb_ener_alpha | Alpha Orbital Energies for UHF calculations, in Hartrees |
| orb_ener_beta | Beta Orbital Energies for UHF calculations, in Hartrees |
| orb_ener_rhf | Orbital Energies for RHF calculations, in Hartrees |
| orb_symm_alpha | Alpha Orbital Symmetries for UHF calculations |
| orb_symm_beta | Beta Orbital Symmetries for UHF calculations |
| orb_symm_rhf | Orbital Symmetries for RHF calculations |
| oscillator_strengths | Excited state oscillator strengths |
| polar_alpha | Polarizability |
| polar_beta | First-Order Hyperpolarizability |
| polar_gamma | Second-Order Hyperpolarizability |
| raman_activity | Raman Activities of Normal Modes |
| raman_intensity | Raman Intensities of Normal Modes |
| reaction_coord | Reaction coordinate Number |
| reduced_mass | Reduced Masses of Normal Modes |
| rotational_constants | Rotational constants of molecule |
| rotational_strength | Rotational Strengths of Normal Modes |
| s2 | Spin: <S**2> |
| scf_energy | SCF Energy, in Hartree |
| sig_neg_alie | Variance of negative ALIE on isodensity surface |
| sig_neg_esp | Variance of negative ESP on isodensity surface |
| sig_pos_alie | Variance of positive ALIE on isodensity surface |
| sig_pos_esp | Variance of positive ESP on isodensity surface |
| sig_tot_alie | Total ALIE variance on isodensity surface |
| sig_tot_esp | Total ESP variance on isodensity surface |
| singlet_excitation_energies | Restricted Singlet Electronic excitation energies, in eV |
| singlet_oscillator_strengths | Singlet excited state oscillator strengths |
| sm_iter | Iteration number of string method |
| sm_point | Number of points along string method string |
| solution_phase_energy | Solution Phase Energy, in Hartree |
| solvation_energy | Solvation Energy, in Hartree |
| spin_splitting_score | Ligand field spin-splitting score for DBLOC calculations |
| symmetry | Symmetries of Normal Modes |
| symmetry_number | symmetry number for molecule |
| sz2 | Spin: Sz*<Sz+1> |
| total_lo_correction | Total localized orbital energy correction |
| transition_state_components | Transition State Components |
| triplet_excitation_energies | Restricted triplet electronic excitation energies, in eV |
| triplet_oscillator_strengths | Triplet excited state oscillator strengths |
| zero_point_energy | Zero Point Energy, in Hartree |
| zvar | a mapping of scan variable names to values |
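
Several descriptions in the table above state how quantities relate: UTotal is SCFE + ZPE + U, HTotal is UTotal + pV, and GTotal is HTotal - T*S. The sketch below restates those relations and adds two standard unit conversions that are often needed when post-processing these values. The function names and the numeric inputs are hypothetical and are not results from any calculation; all arguments must already be in mutually consistent units.

```python
# Standard conversion factors (not taken from the script).
HARTREE_TO_KCAL_PER_MOL = 627.5095
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


def total_energies(scf_energy, zero_point_energy, thermal_u, pv, temperature, entropy):
    """Combine components as described for UTotal, HTotal, and GTotal.

    All energies must share one unit, and entropy must be in that
    energy unit per kelvin, so that T*S is an energy.
    """
    u_total = scf_energy + zero_point_energy + thermal_u
    h_total = u_total + pv
    g_total = h_total - temperature * entropy
    return u_total, h_total, g_total


def hartree_to_kcal_per_mol(energy_hartree):
    """Convert an energy (usually a difference) from Hartree to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL


def ev_to_nm(energy_ev):
    """Convert an excitation energy in eV to a wavelength in nm."""
    return HC_EV_NM / energy_ev


# Illustrative numbers only.
u, h, g = total_energies(-76.4, 0.02, 0.003, 0.001, 298.15, 1.0e-4)
print(u, h, g)
print(ev_to_nm(3.1))
```

The last conversion is the relation one would expect between a lambdamax_ev value and the corresponding lambdamax_nm value.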

| Keyword | Description |
|---|---|
| _sm_n_points | number of string method points |
| basis | Basis Set |
| charge | Molecular charge of Input Structure |
| coords_frozen | Number of frozen coordinates |
| coords_harmonic | number of harmonic constraints |
| coords_ind | Number of independent coordinates |
| coords_nred | Number of non-redundant coordinates |
| coords_opt | Number of optimization coordinates |
| fatal_error | Error message in the event the job failed |
| fatal_errorno | Error number in the event the job failed |
| functional | DFT Functional |
| geopt_stuck | Whether the geopt or tsopt got stuck |
| glibc | Reported glibc version |
| host | Job Host |
| job_id | Job ID |
| lastexe | Last Jaguar Executable Used |
| mae_in | Maestro input file |
| mae_out | Maestro output file |
| method | Calculation Type |
| mol_weight | Molecular weight of input geometry, in amu |
| multiplicity | Spin Multiplicity of Input Structure |
| nbasis | Number of Basis Functions |
| nelectron | Number of Electrons |
| point_group | Molecular point group of the input molecule |
| point_group_used | Point group used in the calculation |
| qm_atoms | Number of QM Atoms |
| status | Job status - set to 0, 1, or 2 corresponding to UNKNOWN, OK, or SPLAT respectively |
| stoichiometry | Stoichiometry of input geometry |
| symmetrized | Whether the geometry has been symmetrized or not |
| ts_component_descriptions | Descriptions of the transition state vector components |

| Keyword | Description |
|---|---|
| all | Returns all properties found in output file. |
| all_descriptor_numeric | Returns all numeric Molecular/Atomic properties found in output file. |
| all_numeric | Returns all numeric properties found in output file. |
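
Because property keywords must match the table entries exactly, it can help to check a requested list before launching a job. The following is a small sketch of such a check, assuming an abbreviated keyword set copied from the tables above; the helper check_props is hypothetical and does not reflect how the script itself validates its input.

```python
import difflib

# Abbreviated subset of the keywords listed in the tables above.
KNOWN_KEYWORDS = {
    "charge_esp", "charge_lowdin", "charge_mulliken", "charge_nbo",
    "homo", "lumo", "homo_lumo_gap", "scf_energy", "gas_phase_energy",
    "zero_point_energy", "dipolemag_qm", "GTotal", "HTotal", "UTotal",
    "all", "all_descriptor_numeric", "all_numeric",
}


def check_props(requested, known=KNOWN_KEYWORDS):
    """Map each unrecognized keyword to a list of close known spellings."""
    unknown = {}
    for name in requested:
        if name not in known:
            unknown[name] = difflib.get_close_matches(name, sorted(known), n=3)
    return unknown


problems = check_props(["homo", "charge_mullikan", "scf_energy"])
for name, suggestions in problems.items():
    print(f"unknown keyword {name!r}; did you mean {suggestions}?")
```

An empty result means every requested keyword is in the known set. Extending KNOWN_KEYWORDS to the full tables is a matter of copying the remaining entries.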