canvasMCS Command Help

Command: $SCHRODINGER/utilities/canvasMCS

canvasMCS - Finds maximum common substructure(s) among a given set of molecules.

Usage: canvasMCS [<job control options>] <program options>

  Job Control Options: [-JOB <jobName> [-HOST <host>]
                                       [-LOCAL]
                                       [-TMPDIR <dir>]
                                       [-WAIT]
                                       [-INTERVAL <n>]
                                       [-NICE]]

    -JOB <jobName> - Job name.  If omitted, no other job control options are
                     permitted.
    -HOST <host>   - Run job on <host>.  Only single-CPU jobs are supported.
    -LOCAL         - Store temporary job files in current directory.
    -TMPDIR <dir>  - Store temporary job files in <dir>.
    -WAIT          - Do not return prompt until job completes.
    -INTERVAL <n>  - Update log file every <n> seconds.
    -NICE          - Run job at reduced priority.

  Program Options: -ismi <smiFile>
                  |-isd  <sdFile>  [-fieldAsName <field>]
                  |-imae <maeFile> [-fieldAsName <field>]
                  |-iproj <projFile>
                  |-icsv <csvFile> [-noHeader]
                                    [-d <delimiter>]
                                    [-smi <SMILESCol>]
                                    [-name <nameCol>]
                  -ocsv <csvOutput>
                  |-opw  <pwOutput>
                  |-osd <sdOutput> [-nodetail]
                  |-omae <maeOutput> [-nodetail]
                  [-min <minMatch>]
                  [-max <maxMatch>]
                  [-stop <size>]
                  [-n <structRange> [-file]]
                  [-rs [<numMols>]]
                  [-limit <numMols>]
                  [-showall [<int>] | -exclusive]
                  [-nodetail]
                  [-sortname]
                  [-addring]
                  [-addh]
                  [-nox]
                  [-u]
                  [-v3]
                  [-ordinal]
                  [-atomtype <type>]
                  [-timeout <seconds>]
                  [-nobreakring] | [-nobreakaring]
                  [-allH]
                  [-disconnect]
                  [-prochiral]

 -ismi <smiFile>   - Input SMILES file, with one SMILES string per line.
                     An optional structure name may follow the SMILES,
                     with a space or tab separator.
 -isd  <sdFile>    - Input SD file, standard or compressed.
 -imae <maeFile>   - Input Maestro file, standard or compressed.
 -iproj <projFile> - Canvas project file, including absolute path.
 -icsv <csvFile>   - Input CSV file.  By default, the file is expected to
                     contain a column header line and SMILES in the first
                     column.
 -noHeader         - Input CSV file has no header line.
 -d <delimiter>    - Input CSV file delimiter.  The default is ','.  Use
                     -d ' ' for space and -d '	' for tab.  Consecutive
                     space delimiters are treated as one.
 -smi <SMILESCol>  - Input CSV SMILES column, either by name or by index,
                     starting at 1.  By default, SMILES is the first column.
 -name <nameCol>   - Input CSV molecule name column, either by name or by
                     index.  By default, it is the second column.
 -min <minMatch>   - Minimum number of molecules that must match the MCS.
                     The default is all.  If <minMatch> exceeds the number
                     of inputs, it is treated as requiring all to match.
                     Note that the largest substructure common to at least
                     <minMatch> molecules may actually match additional
                     molecules, so the number of matches reported may be
                     larger than <minMatch>.
 -max <maxMatch>   - Target upper bound on the minimum number of molecules
                     that must match the MCS.  The default is all.  When
                     <maxMatch> is larger than <minMatch>, a series of
                     solutions is produced, where the first solution matches
                     at least <minMatch> molecules, and subsequent solutions
                     match larger numbers of molecules.  Ideally, there will
                     be a unique MCS for each match count between <minMatch>
                     and <maxMatch>, but this rarely happens in practice.
                     Furthermore, the last reported solution may match more
                     than <maxMatch> molecules, for the reasons given above.
 -stop <size>      - Stop processing when MCS Atom + Bond Count falls below
                     this threshold.
 -ocsv <csvOutput> - Output CSV file for MCS results.
 -opw <pwOutput>   - Output MCS for all pairs in columnar format.
 -omae <maeOutput> - Output Maestro file for MCS results.
 -osd <sdOutput>   - Output SD file for MCS results.
 -nodetail         - Omit MCS atom and bond lists in output file.  These are
                     never reported in CSV output.
 -n <structRange>  - The set of input structures to process:
                     1,4       - structures 1 and 4
                     1:10,14   - structures 1 through 10 and 14
                     2:        - structures 2 through the end of file
                     :5,13:18  - structures 1 through 5 and 13 through 18
                     All structures are processed by default.
 -file             - If specified, <structRange> above is taken as the name
                     of a file that contains the range selection.  If used
                     together with the -iproj option, this file must be the
                     binary set file written from the Canvas GUI.  For other
                     input formats, each line of this file should contain a
                     valid structure range string as specified above.
 -rs [<numMols>]   - Process only a random subset of the input molecules.
                     If <numMols> is omitted, it will be set to sqrt(total).
 -limit <numMols>  - Maximum number of molecules to process (default=2000).
                     Processing more than 2000 may exceed available memory.
 -showall [<int>]  - Output all equivalents for each MCS.  Not compatible
                     with -exclusive or -opw.  Values of <int>: 0 is 'off',
                     1 reports all patterns, 2 (default) reports only unique
                     patterns, and 3 reports these same unique patterns on
                     a single line.
 -exclusive        - If an input molecule matches more than one MCS, report
                     only the match to the largest MCS. Not compatible with
                     -showall.
 -sortname         - Sort output on molecule name.  By default, input order
                     is preserved.
 -addring          - In output MCS SMARTS patterns, mark each atom as cyclic
                     (R) or acyclic (R0).  Off by default.
 -addh             - In output MCS SMARTS patterns, include hydrogen counts for
                     each atom. Off by default.
 -nox              - In output MCS SMARTS patterns, suppress addition of a
                     connectivity qualification [nX3] for pyrrolic nitrogens.
                     The qualification is added by default.
 -v3               - Output MDL version 3 SD Format.
 -ordinal          - Use the ordinal position of each structure in the source
                     file as its identifier (e.g. '1' for first, '2' for second).
 -atomtype <type>  - Atom typing scheme. Must be an integer value between
                     1 and 13 or C (details below).  The default is 11.
 -timeout <seconds> - Abort further calculations when CPU time exceeds this
                     integer.
 -nobreakring      - Do not consider partial rings as part of MCS.
 -nobreakaring     - Do not consider partial aromatic rings as part of MCS.
 -allH             - Consider hydrogens as explicit atoms. Implies -addh.
 -disconnect       - Allow the MCS to have one or more disconnections.
 -u                - Use unique SMILES for all SMILES output.
 -prochiral        - Use 3D geometry to distinguish prochirality. Requires
                     3D all-atom inputs to work properly.
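
The <structRange> syntax accepted by -n can be illustrated with a short Python
sketch. This is not the parser used by canvasMCS: the function name
expand_struct_range and its total argument are hypothetical, and the handling
of out-of-range indices (clipping) is an assumption made for the example.

```python
def expand_struct_range(spec, total):
    """Expand a -n style range string into sorted 1-based structure indices.

    Hypothetical helper for illustration only; not part of canvasMCS.
    `total` is the number of structures in the input file.
    """
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            lo_text, hi_text = part.split(":", 1)
            # An open start means "from structure 1"; an open end means
            # "through the end of file".
            lo = int(lo_text) if lo_text else 1
            hi = int(hi_text) if hi_text else total
        else:
            lo = hi = int(part)
        # Assumption: indices outside 1..total are clipped, not rejected.
        selected.update(range(max(lo, 1), min(hi, total) + 1))
    return sorted(selected)


print(expand_struct_range("1:10,14", 20))
print(expand_struct_range(":5,13:18", 20))
```

With 20 input structures, the second call selects structures 1 through 5 and
13 through 18, matching the ':5,13:18' example given for -n.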

 Atom Typing Schemes
 -------------------
 1 - All atoms equivalent; all bonds equivalent.
 2 - Atoms distinguished by HB acceptor/donor; all bonds equivalent.
 3 - Atoms distinguished by hybridization state; all bonds equivalent.
 4 - Atoms distinguished by functional type: {H}, {C}, {F,Cl}, {Br,I}, {N,O},
     {S}, {other}; bonds by hybridization.
 5 - Mol2 atom types; all bonds equivalent.
 6 - Atoms distinguished by whether terminal, halogen, HB acceptor/donor;
     bonds distinguished by bond order.
 7 - Atomic number and bond order.
 8 - Atoms distinguished by ring size, aromaticity, HB acceptor/donor,
     ionization potential, whether terminal, whether halogen; bonds
     distinguished by bond order.
 9 - Carhart atom types (atom-pairs approach); all bonds equivalent.
10 - Daylight invariant atom types; bonds distinguished by bond order.
11 - Same as 7, but distinguishing aromatic from non-aromatic.
12 - Same as 11, but distinguishing aliphatic atoms by ring/acyclic.
13 - Same as 12, but distinguishing rings by size.
 C - Custom. Must be followed by location of a type definitions file.
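
As an illustration of how the options above combine, the following Python
sketch assembles one possible canvasMCS command line and prints it without
running it. The file names ligands.sdf and mcs.csv and the fallback
installation path are hypothetical; only flags listed in this help text are
used, and the values chosen are examples rather than recommendations.

```python
import os
import shlex

# Assumption: SCHRODINGER points at the installation directory; the
# fallback path below is a placeholder, not a documented default.
schrodinger = os.environ.get("SCHRODINGER", "/opt/schrodinger")
program = os.path.join(schrodinger, "utilities", "canvasMCS")

# Hypothetical file names; the flags are those listed in this help text.
command = [
    program,
    "-isd", "ligands.sdf",   # input SD file
    "-ocsv", "mcs.csv",      # CSV output for MCS results
    "-min", "5",             # MCS must match at least 5 molecules
    "-atomtype", "11",       # the documented default typing scheme
    "-timeout", "600",       # abort when CPU time exceeds 600 seconds
]

# Print a shell-quoted command line without executing it.
print(shlex.join(command))
```

To actually launch the job, the same list could be passed to
subprocess.run(command, check=True) on a machine where canvasMCS is installed.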