shape_screen Command Help
Command: $SCHRODINGER/shape_screen
usage: shape_screen [-h] -shape <shape> -screen <screen>
[Similarity Algorithm Options]
[Shape Treatment Options]
[Alignment Options]
[File Screening Options]
[Database Screening Options]
[Conformer Treatment Options]
[Filtering Options]
[Reporting Options]
[Job Control Options]
Performs a CPU shape screen of one or more Maestro files, SD files, or a
Phase database. To run on the GPU, use $SCHRODINGER/shape_screen_gpu.
Each conformer of a given molecule is aligned to the provided shape query
in numerous ways, and a similarity is computed based on approximate or exact
hard-sphere overlap, with exact overlap being used when running in "classic"
mode. The conformer and alignment of a given molecule that yields the highest
shape similarity to the query is written to the file <jobname>_align.maegz,
with shape similarity stored in the property r_phase_Shape_Sim.
Copyright Schrodinger LLC, All Rights Reserved.
options:
-h, --help Show this message and exit.
Required Arguments:
-shape <shape> Shape query. May be a Maestro file, SD file, Phase
included volumes file (.ivol) with 3 or more spheres,
or a Phase pharmacophore hypothesis (.phypo) with 3 or
more features. Multiple shape queries may be provided
in a Maestro or SD file.
-screen <screen> Structures to screen. May be a Maestro file, SD file,
list file (.list) with the names of one or more
Maestro or SD files (one name per line, all files of
the same type and compression state), or a Phase
database (.phdb) specified using an absolute path. If
screening existing conformers in one or more files,
consecutive structures with identical titles and
connectivities will be treated as conformers of a
single molecule, unless -distinct, -connect, or
-stereo is specified. Use -flex to generate conformers
on-the-fly for each structure.
Similarity Algorithm Options:
[-classic] [-map <m>] [-alt <n>] [-norm {1,2,3,4}]
-classic Run in classic mode, which utilizes exact hard-sphere
overlap. Classic mode is automatically triggered when
any of the following options are used: -report <m>
with <m> greater than 1, -table, -force <atoms_file>,
-xvol <file>. Approximate
hard-sphere overlap is the default and preferred mode
since screens can be twice as fast.
-map <m> Maximum number of primary screen->query sphere
mappings. The number of potential alignments scales as
<m>!/(3!(<m>-3)!). Must lie between 8 and 12 (default:
8).
-alt <n> Maximum number of alternative mappings to each primary
mapping. The number of potential alignments scales as
(<n>+1)**3. Must be 1 or 2 (default: 1).
-norm {1,2,3,4} Similarity normalization scheme. For shape query A and
screening structure B, similarity is computed as
O(A,B)/norm(A,B), where O(A,B) is the overlap between
A and B, and norm(A,B) is a function of the self-
overlaps O(A,A) and O(B,B): 1->max{O(A,A), O(B,B)},
2->min{O(A,A), O(B,B)}, 3->O(A,A), 4->O(B,B) (default:
1).
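The scaling expressions for -map and -alt and the four -norm schemes can be checked with a few lines of arithmetic. The following is a minimal Python sketch of the formulas quoted above, not the program's implementation; the function names and the sample overlap values are hypothetical, and the help text does not state how the -map and -alt factors combine, so they are reported separately.

```python
from math import comb


def primary_alignment_scale(m: int) -> int:
    """Scaling term for -map: m!/(3!(m-3)!), i.e. m choose 3."""
    if not 8 <= m <= 12:
        raise ValueError("-map must lie between 8 and 12")
    return comb(m, 3)


def alternative_mapping_scale(n: int) -> int:
    """Scaling term for -alt: (n+1)**3."""
    if n not in (1, 2):
        raise ValueError("-alt must be 1 or 2")
    return (n + 1) ** 3


def shape_similarity(o_ab: float, o_aa: float, o_bb: float, norm: int = 1) -> float:
    """O(A,B)/norm(A,B) for the four -norm schemes."""
    denominators = {
        1: max(o_aa, o_bb),  # default
        2: min(o_aa, o_bb),
        3: o_aa,             # self-overlap of the shape query A
        4: o_bb,             # self-overlap of the screening structure B
    }
    if norm not in denominators:
        raise ValueError("-norm must be 1, 2, 3, or 4")
    return o_ab / denominators[norm]


if __name__ == "__main__":
    for m in range(8, 13):
        print(f"-map {m}: scale {primary_alignment_scale(m)}")
    for n in (1, 2):
        print(f"-alt {n}: scale {alternative_mapping_scale(n)}")
    # Hypothetical overlaps: O(A,B)=50, O(A,A)=100, O(B,B)=80
    for scheme in (1, 2, 3, 4):
        print(f"-norm {scheme}: {shape_similarity(50.0, 100.0, 80.0, scheme)}")
```

With these sample overlaps, schemes 1 and 3 agree because the query has the larger self-overlap; scheme 2 or 4 would favor small screening structures that fit inside the query.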
Shape Treatment Options:
[-atomtypes {mmod,element,qsar} [-dual]] [-atomweights <prop>] [-hydrogens] |
[-pharm [-fd <fdfile>] [-rad <radfile>] [-proj]]
Note that there are two mutually exclusive groups of options: one for
atom-based treatment and one for pharmacophore-based treatment. Options
from the two groups may not be mixed.
-atomtypes {mmod,element,qsar}
Compute overlap only between atoms of the same type.
Supported typing schemes are: mmod - MacroModel atom
types, element - elemental types, qsar - Phase QSAR
atom types. Not valid when <shape> contains included
volumes or a pharmacophore hypothesis.
-dual Report similarities computed with and without atom
typing, based on the alignment obtained from using
atom typing. Each output structure will contain the
additional property r_phase_Shape_Sim_Pure for the
similarity computed without atom typing.
-atomweights <prop> Use a real atom-level property in <shape> to weight
overlap with the shape query atoms. Valid only when
<shape> is a Maestro file.
-hydrogens Consider hydrogens attached to non-carbon atoms. By
default, all hydrogens are ignored. Not valid when
<shape> contains included volumes or a pharmacophore
hypothesis.
-pharm Treat each structure as a set of pharmacophore
features, computing overlap only between features of
the same type. Not valid when <shape> contains
included volumes. Switched on automatically when
<shape> is a pharmacophore hypothesis.
-fd <fdfile> Pharmacophore feature definitions file. If omitted,
default definitions in the Schrodinger installation
are used if screening a Maestro or SD file, while
database definitions are used if screening a Phase
database. If <shape> is a pharmacophore hypothesis,
this option is ignored, and the hypothesis feature
definitions are used.
-rad <radfile> Pharmacophore feature radii file. Each line in
<radfile> should contain a feature type
(A,D,H,N,P,R,Q,X,Y,Z) followed by a radius between 1.0
and 4.0, with one or more spaces separating the two
fields. The default radius is 1.0 for feature type Q
and 2.0 for all other types. If <shape> is a
pharmacophore hypothesis that contains feature radii,
this option is ignored, and the hypothesis radii are
used.
-proj Differentiate projected features Q according to
whether they are associated with an acceptor, donor,
or aromatic ring. This treatment is consistent with
shape_screen_gpu and can significantly speed up
screening because fewer Q-Q overlaps must be computed.
The default is to treat all projected features as
equivalent. Ignored when running in classic mode.
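The <radfile> format described under -rad is simple enough to validate before launching a job. The sketch below is a hypothetical Python helper, not part of the Schrodinger distribution. It assumes that blank lines are skipped and that feature types missing from the file keep the documented defaults; neither point is stated in the help text.

```python
# Feature types and radius limits as listed in the -rad description.
FEATURE_TYPES = frozenset({"A", "D", "H", "N", "P", "R", "Q", "X", "Y", "Z"})
MIN_RADIUS, MAX_RADIUS = 1.0, 4.0


def default_radii() -> dict:
    """Documented defaults: 1.0 for Q, 2.0 for every other type."""
    return {t: (1.0 if t == "Q" else 2.0) for t in sorted(FEATURE_TYPES)}


def parse_radii(text: str) -> dict:
    """Parse '<type> <radius>' lines separated by one or more spaces."""
    radii = default_radii()
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected '<type> <radius>'")
        ftype, radius = fields[0], float(fields[1])
        if ftype not in FEATURE_TYPES:
            raise ValueError(f"line {lineno}: unknown feature type {ftype!r}")
        if not MIN_RADIUS <= radius <= MAX_RADIUS:
            raise ValueError(f"line {lineno}: radius {radius} outside 1.0-4.0")
        radii[ftype] = radius
    return radii


if __name__ == "__main__":
    example = "A 1.5\nR   3.0\n"
    print(parse_radii(example))
```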
Alignment Options:
[-align <smarts>] | [-force <atoms_file>] | [-inplace]
-align <smarts> Align screening structures to a substructure of the
shape query. <smarts> may be a single SMARTS pattern,
or it may be the name of a list file (.list) that
contains a SMARTS for each shape query, with one
SMARTS per line. A given SMARTS is matched in every
possible way to the shape query and screening
structure, a least-squares alignment is performed for
each match, and the alignment yielding the highest
similarity is retained. Not valid when <shape>
contains included volumes or a pharmacophore
hypothesis.
-force <atoms_file> Attempt to force the alignment of one or two atoms in
each screening structure by adding them to the best
mapping found using the standard approach.
<atoms_file> can be a text file with comma-separated
lists of atoms that should be superimposed, or a list
file (.list) that holds the names of one or more files
containing comma-separated lists, with one file name
per line. Must provide a .list file if <shape>
contains multiple queries. Each comma-separated file
should contain a list of shape query atoms on the
first line, followed by a line of corresponding atom
numbers for each molecule in <screen>, which must be a
Maestro or SD file. Use of this option automatically
triggers classic mode.
-inplace Compute similarities without aligning.
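As an illustration of the <atoms_file> layout described under -force, the following Python sketch writes the shape query atoms on the first line and one line of corresponding atom numbers per screening molecule. The helper name and the atom numbers are hypothetical, and details the help text leaves open (whitespace around commas, trailing newline) are assumptions here.

```python
def format_force_atoms(query_atoms, screen_atoms_per_molecule):
    """Return the text of a comma-separated atoms file for -force.

    query_atoms: one or two atom numbers in the shape query.
    screen_atoms_per_molecule: one list per molecule in <screen>, in file
    order, each holding the atoms to superimpose on query_atoms.
    """
    n = len(query_atoms)
    if n not in (1, 2):
        raise ValueError("-force aligns one or two atoms")
    lines = [",".join(str(a) for a in query_atoms)]
    for index, atoms in enumerate(screen_atoms_per_molecule, start=1):
        if len(atoms) != n:
            raise ValueError(f"molecule {index}: expected {n} atom number(s)")
        lines.append(",".join(str(a) for a in atoms))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical atom numbers: query atoms 3 and 7, two screening molecules.
    print(format_force_atoms([3, 7], [[5, 9], [4, 8]]), end="")
```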
File Screening Options:
[-title <prop>] [[-distinct] | [-connect] | [-stereo]]
By default, consecutive structures with the same title and connectivity are
treated as conformers of the same molecule.
-title <prop> Use an alternate property as the source of titles.
-distinct Treat each structure as a distinct molecule, making no
attempt to perceive conformers.
-connect Consider connectivities only (not titles) when
perceiving conformers. Ignored when running in classic
mode.
-stereo Consider stereochemistry when perceiving conformers.
Consecutive structures with the same connectivity will
be treated as conformers only if they have the same
stereochemistry. Titles are not considered. Ignored
when running in classic mode.
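The conformer perception rules above amount to grouping consecutive structures by a mode-dependent key. The Python sketch below illustrates that logic only; it is not how shape_screen is implemented. The dictionary fields and the opaque "connectivity" and "stereo" tokens are hypothetical stand-ins for whatever comparison the program performs.

```python
from itertools import groupby


def conformer_key(structure, mode):
    """Key that must match for consecutive structures to be grouped."""
    if mode == "default":   # same title and connectivity
        return (structure["title"], structure["connectivity"])
    if mode == "connect":   # -connect: connectivity only
        return (structure["connectivity"],)
    if mode == "stereo":    # -stereo: connectivity and stereochemistry, no titles
        return (structure["connectivity"], structure["stereo"])
    raise ValueError(f"unknown mode {mode!r}")


def group_conformers(structures, mode="default"):
    """Group consecutive structures into molecules (lists of conformers)."""
    if mode == "distinct":  # -distinct: every structure is its own molecule
        return [[s] for s in structures]
    # groupby only merges adjacent items, matching the 'consecutive' rule.
    return [list(g) for _, g in groupby(structures, key=lambda s: conformer_key(s, mode))]


if __name__ == "__main__":
    structures = [
        {"title": "lig1", "connectivity": "graph1", "stereo": "R"},
        {"title": "lig1", "connectivity": "graph1", "stereo": "R"},
        {"title": "lig1b", "connectivity": "graph1", "stereo": "S"},
        {"title": "lig2", "connectivity": "graph2", "stereo": ""},
        {"title": "lig1", "connectivity": "graph1", "stereo": "R"},
    ]
    for mode in ("default", "distinct", "connect", "stereo"):
        sizes = [len(g) for g in group_conformers(structures, mode)]
        print(mode, sizes)
```

Note that the last structure is never merged with the first two, even though it matches them, because another molecule sits between them in the file.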
Database Screening Options:
[-isub <in>] [-osub <out>]
-isub <in> Screen a subset of a Phase database. The file
<in>_phase.inp must contain the applicable LIGAND_NAME
records.
-osub <out> Create a subset file <out>_phase.inp with LIGAND_NAME
records for the structures in <jobname>_align.maegz.
Conformer Treatment Options:
[-flex [-sample {rapid,thorough,rdkit}] [-max <n>] [-ewin <deltaE>] [-append]]
[-limit <m>]
-flex Generate conformers on-the-fly. Existing conformers
are screened by default.
-sample {rapid,thorough,rdkit}
Conformational sampling method (default: rapid).
-max <n> Maximum number of conformers to generate (default:
100).
-ewin <deltaE> Conformational energy window in kJ/mol (default:
16.0).
-append Append generated conformers to existing conformer(s).
The default is to discard existing conformers.
-limit <m> Screen no more than the first <m> conformers provided
or generated.
Filtering Options:
[-filter <sim> [-advance]] [-xvol <file>]
-filter <sim> Discard any screening structure with a similarity
below <sim>.
-advance When multiple shape queries are provided, advance to
the next screening structure as soon as the filter is
satisfied for any query.
-xvol <file> Apply excluded volumes to each generated alignment and
discard if clashes are found. Any reported alignment
will be the one that yields the highest similarity
while avoiding clashes. The provided file may be any
one of the following: (1) standard excluded volumes
file (.xvol); (2) hydrogen-sensitive excluded volumes
file (.ev) created when -hydrogens is used with any of
the create_xvol* utilities; (3) Phase pharmacophore
hypothesis (.phypo) that contains excluded volumes;
(4) a list file (.list) with the names of one or more
files of the previous three types. A list file would
normally be used only if multiple shape queries are
provided and each query has its own excluded volumes.
If <shape> is a pharmacophore hypothesis, this option
MUST be supplied with <file> equal to <shape> in order
to apply excluded volumes in <shape>. This prevents
inadvertent application of excluded volumes that may
have been added to the hypothesis as a routine step in
pharmacophore model development. Use of this option
automatically triggers classic mode.
Reporting Options:
[-v, -verbose] [-osd] [-sort [-keep <n>]] [-best] [-report <m> [-redun <tol>]]
[-table [-only]] [-write_report [-limit_pdf <p>] [-only_pdf]]
-v, -verbose Verbose output.
-osd Output aligned structures to <jobname>_align.sdfgz.
-sort Sort output structures by decreasing similarity to the
query. Structures are output in the order supplied by
default.
-keep <n> Cap the number of sorted structures output. No cap is
enforced by default.
-best When multiple shape queries are supplied, output only
the alignment to the shape query yielding the highest
similarity.
-report <m> Output up to <m> alignments per input structure. If
multiple shape queries are provided, up to <m>
alignments are output for each query. Classic mode is
automatically triggered if <m> is greater than 1.
-redun <tol> Reject an alignment of a given conformer if all of its
atoms are within <tol> angstroms of another alignment
of that same conformer. Different conformers are not
compared for redundancy (default: 0.5).
-table Create a table of comma-separated similarities
<jobname>_sim.csv for the output structures, where
shape queries span the columns. Use of this option
automatically triggers classic mode.
-only Create only the table of similarities, not the file of
alignments <jobname>_align.maegz.
-write_report Create a shape screen report that consists of a
searchable 3D database of hits <jobname>_report.vsdb
and a document of annotated 2D hits
<jobname>_report.pdf. A variety of tasks may be
performed with the database using the
shape_screen_reporter utility or by loading it into
the Hit Analysis Panel of Maestro. Legal only when the
shape query consists of a single structure or a
pharmacophore hypothesis.
-limit_pdf <p> Limit the number of hits written to the PDF file. By
default, only the top 500 hits are output, in order of
decreasing pharmacophore shape similarity to the query
structure or hypothesis reference ligand. Take care
when increasing this limit since the resulting PDF
file can be quite large.
-only_pdf Save only the PDF file, not the database. This option
may be appropriate for screens that produce a very
large number of hits since <jobname>_report.vsdb is
typically about three times the size of
<jobname>_align.maegz.
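The redundancy test described under -redun can be pictured as a per-atom distance check between two alignments of the same conformer. The following Python sketch is an interpretation under stated assumptions, not the program's code: coordinates are plain (x, y, z) tuples in matching atom order, "within" is read as less than or equal to <tol>, and alignments are examined in the order given.

```python
import math


def is_redundant(coords_a, coords_b, tol=0.5):
    """True if every atom of one alignment lies within tol angstroms of the
    same atom in the other alignment of that conformer."""
    if len(coords_a) != len(coords_b):
        raise ValueError("alignments of one conformer must have equal atom counts")
    return all(math.dist(p, q) <= tol for p, q in zip(coords_a, coords_b))


def unique_alignments(alignments, tol=0.5):
    """Keep each alignment unless it is redundant with one already kept."""
    kept = []
    for candidate in alignments:
        if not any(is_redundant(candidate, other, tol) for other in kept):
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    # Hypothetical two-atom alignments of a single conformer.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.1, 0.0, 0.0), (1.2, 0.0, 0.0)]  # every atom within 0.5 of a
    c = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]  # second atom is 1.0 away from a
    print(len(unique_alignments([a, b, c])))
```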
Job Control Options:
[-HOST <host>[:<n>]] [-NSUB <m>] [-TMPDIR <dir>] [-JOB <jobname>] [-NOJOBID]
-HOST <host>[:<n>] Run job remotely on the indicated host entry. Include
:<n> to split the job across <n> CPUs.
-NSUB <m> Subdivide the work assigned to each CPU into <m>
subjobs. The total number of subjobs is <n>*<m>. Not
valid when screening existing conformers in Maestro/SD
files.
-TMPDIR <dir> Store temporary job files in <dir>.
-JOB <jobname> Override default job name, which is derived from
<shape>.
-NOJOBID Do not run under Schrodinger job control. -HOST is not
permitted and -TMPDIR is ignored.
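To make the job control arithmetic concrete, the Python sketch below assembles an argument list from the flags documented above and computes the total number of subjobs, <n>*<m>. It only builds and prints the command; it does not run it. The helper names, host name, file paths, and job name are hypothetical, and a Phase database is used as <screen> because -NSUB is not valid when screening existing conformers in Maestro/SD files.

```python
def build_shape_screen_args(shape, screen, host=None, ncpus=None,
                            nsub=None, jobname=None, nojobid=False):
    """Assemble a shape_screen argument list from documented flags only."""
    if nojobid and host is not None:
        raise ValueError("-HOST is not permitted with -NOJOBID")
    args = ["$SCHRODINGER/shape_screen", "-shape", shape, "-screen", screen]
    if host is not None:
        # -HOST <host>[:<n>] splits the job across <n> CPUs
        args += ["-HOST", f"{host}:{ncpus}" if ncpus else host]
    if nsub is not None:
        args += ["-NSUB", str(nsub)]
    if jobname is not None:
        args += ["-JOB", jobname]
    if nojobid:
        args.append("-NOJOBID")
    return args


def total_subjobs(ncpus, nsub):
    """Total number of subjobs: <n> CPUs times <m> subjobs per CPU."""
    return ncpus * nsub


if __name__ == "__main__":
    # Hypothetical inputs; the database is given by absolute path as required.
    args = build_shape_screen_args("query.mae", "/data/library.phdb",
                                   host="cluster", ncpus=4, nsub=3,
                                   jobname="demo")
    print(" ".join(args))
    print("subjobs:", total_subjobs(4, 3))
```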