Phase Database Structure

The database is stored in a directory, dbName.phdb, which contains all the files associated with the database. The files and directories that are standard parts of the database are described in Table 1. You should not generally need to modify these files, but you may find it useful to examine them. In addition to these files, the database can contain subset files, named subset_phase.inp.

Do not create or store other databases inside the database directory, because doing so is likely to cause jobs to fail.
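As a minimal sketch of how you might check for this situation, the following Python function looks for directories ending in .phdb anywhere below a given database directory. The function name find_nested_databases is hypothetical and is not part of Phase. The sketch assumes only that databases are directories named with the .phdb suffix, as described above.

```python
from pathlib import Path


def find_nested_databases(db_dir):
    """Return the .phdb directories found below db_dir (db_dir itself is excluded)."""
    root = Path(db_dir)
    # rglob searches all levels below root; keep directories only, since a
    # regular file could also happen to end in .phdb.
    return sorted(p for p in root.rglob("*.phdb") if p.is_dir())
```

Calling find_nested_databases("dbName.phdb") returns an empty list when no other database is stored inside the directory.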

Table 1. Database files and directories

File or Directory

Description

database.sqlite

An SQLite database with tables that hold global information about the database:

dprop_name_table—Holds the name of the property used to detect and eliminate duplicates, s_phase_Unique_SMILES.

dprop_values_table—Holds mol_id and s_phase_Unique_SMILES for each database record.

props_table—Holds mol_id, title, num_confs, has_sites, plus all imported and computed properties for each database record. This table does not exist unless an "extract" job has been run, and it must be regenerated when changes to the database are made.

props_map_table—Mappings of the column names in props_table to the CT property names in the block database files stored in database_ligands/. This table always exists, but is empty if properties have not been extracted.

summary_table—Holds mol_id, title, num_confs, and has_sites for each database record. has_sites is 1 if sites are stored for the record, and 0 otherwise.

database_dbversion

Version file. Holds the Phase version number, the method of creation (always "CL"), and the storage format ("SQLite").

database_feature.ini

Feature definitions.

database_history.log

History of changes to the database, containing the date and the command issued.

database_info.log

Detailed information about changes to the database.

database_master_phase.inp

Master subset file.

database_props.csv

Copy of the data in props_table in CSV format. This file is only present if an extract job has been run.

database_props_map.csv

Copy of the mappings in props_map_table in CSV format. This file is only present if an extract job has been run with -map.

database_props_stats.csv

Statistics for the properties in props_table: count, min, max, and avg. count is the number of records for which the property exists. For string properties, avg is empty. This file is only present if an extract job has been run with -stats.

database_summary.csv

Copy of the data in summary_table in CSV format.

database_ligands

All block data is held under this directory. There is one SQLite file for each block of 5000 records, e.g., database_ligands/block_1/block_struct_1.sqlite. Structures, sites, CT-level properties, and 2D/3D indexes are stored in this file. When an extract job is run, the properties in these block files are extracted and written to the top-level file database.sqlite.

database_restart

This directory contains files that are required to restart a phase_database job. The files in this directory are updated as a job progresses. When the job and all of its subjobs finish normally, this directory is removed automatically.
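If you want to examine a database without modifying it, the following Python sketch shows one possible approach. It relies only on the file, directory, and table names listed in Table 1. The function name inspect_phase_database and the keys of the returned dictionary are hypothetical, and the sketch makes no assumptions about table columns beyond counting the rows of summary_table. The SQLite file is opened read-only so that the inspection cannot change it.

```python
import sqlite3
from pathlib import Path


def inspect_phase_database(db_dir):
    """Collect basic facts about a database directory without modifying it."""
    root = Path(db_dir)
    report = {}

    # Open database.sqlite in read-only mode through an SQLite URI.
    uri = (root / "database.sqlite").resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
        tables = {row[0] for row in rows}
        report["tables"] = sorted(tables)
        # props_table exists only after an extract job has been run.
        report["props_extracted"] = "props_table" in tables
        # summary_table holds one row per database record.
        if "summary_table" in tables:
            report["num_records"] = conn.execute(
                "SELECT COUNT(*) FROM summary_table"
            ).fetchone()[0]
    finally:
        conn.close()

    # Block files follow the pattern shown in Table 1, for example
    # database_ligands/block_1/block_struct_1.sqlite.
    blocks = root.glob("database_ligands/block_*/block_struct_*.sqlite")
    report["num_block_files"] = sum(1 for _ in blocks)

    # database_restart is removed when a job and its subjobs finish normally,
    # so its presence suggests a job that is running or did not finish.
    report["restart_dir_present"] = (root / "database_restart").is_dir()
    return report
```

For example, inspect_phase_database("dbName.phdb") returns a dictionary in which props_extracted is False for a database on which no extract job has been run.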