canvasSearch Command Help

Command: $SCHRODINGER/utilities/canvasSearch

canvasSearch - Search a list of target molecules against a set
               of queries, composed of either molecules or partial
               structures.  By default, only targets that match all
               the queries are returned.  However, with the -require <n>
               option, a user can choose to return targets that match
               at least <n> of the queries.  For example, setting <n>
               to 1 returns targets that match any of the queries.

               This program can also be used to filter a
               target list based on either standard REOS rules or a
               user-defined file containing SMILES queries and the minimum
               and maximum number of times each query should be matched.
               Filtering and search can be performed separately or
               sequentially (filtering first).
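
The default and -require <n> behaviour described above can be sketched in
Python. This is only an illustration of the selection rule, not the program's
actual code; the function name passes and its arguments are hypothetical.

```python
def passes(match_flags, require=None):
    """Return True if a target is kept, given one boolean per query.

    match_flags: list of booleans, one per query (True = query matched).
    require:     None mimics the default (all queries must match);
                 an integer n mimics "-require n".
    """
    needed = len(match_flags) if require is None else require
    return sum(match_flags) >= needed
```

With require=1 a target is kept as soon as any single query matches, which
corresponds to the "match any of the queries" case.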

 Usage: canvasSearch [<job control options>] <program options>

  Job Control Options: [-JOB <jobName> [-HOST <host>[:<n>]]
                                       [-MINREC <nrec>]
                                       [-LOCAL]
                                       [-TMPDIR <dir>]
                                       [-WAIT]
                                       [-INTERVAL <n>]
                                       [-NICE]]

    -JOB <jobName>     - Job name.  If omitted, no other job control options are
                         permitted.
    -HOST <host>[:<n>] - Run job on <host>.  Include ":<n>" to split across
                         <n> CPUs.
    -MINREC <nrec>     - Minimum number of records per CPU.  Prevents
                         submission of a large number of subjobs that each
                         contain only a small number of records.  The default
                         is 100.
    -LOCAL             - Store temporary job files in current directory.
    -TMPDIR <dir>      - Store temporary job files in <dir>.
    -WAIT              - Do not return prompt until job completes.
    -INTERVAL <n>      - Update log file every <n> seconds.
    -NICE              - Run job at reduced priority.

  Program Options:   -i<fmt> <inputFile>
                     [-n <selection>]
                     [-fieldAsName <field>]
                     [-index <indexFile> | -newIndex <indexFile>]
                     [-noIndex]
                     [-filter [-reos]
                              [-file <ruleFile> [-d <delimiter>] ]
                              [-maxVio <n>] ]
                     [-helpREOS]
                     [-q<fmt> <queryFile> [-require <n>] ]
                     [-exact]
                     [-o<fmt> <outputFile>]
                     [-o<fmt>2 <outputFile>]
                     [-osmi <smiFile> [-u]  | -osd <sdFile>  | -omae <maeFile>]
                     [-osmi2 <smiFile> | -osd2 <sdFile> | -omae2 <maeFile>]
                     [-no2DCoord]
                     [-v3]
                     [-useXforAromN]
                     [-strict]
                     [-allowRelative]
                     [-matchCount [<csvFile>] [-qmap <queryMapFile>]
                                  -comment -prefix <queryPrefix> [-showAll] ]

   -i<fmt> <inputFile>   - Input file containing a list of target molecules.
                           -ismi = Each line must start with a SMILES
                           string, optionally followed by the molecule ID or
                           name, separated by a tab or other whitespace
                           character.
                           -isd  = SD file as input.
                           -imae = Maestro file as input.
   -n <selection>        - Selected molecules in <inputFile> to search. The
                           following are valid <selection> specifications:
                           1:10,14,15 - 1 through 10, 14, and 15
                           1,3,10:    - 1, 3, and 10 through end of file
                           :5,20:30   - 1 through 5, and 20 through 30
                           By default, all molecules are included.
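
The <selection> syntax above can be illustrated with a small Python sketch
that expands a specification into 1-based record numbers. The function
parse_selection and its total argument (the number of records in the input
file) are hypothetical; clipping to the available records is an assumption of
this sketch, not documented behaviour.

```python
def parse_selection(spec, total):
    """Expand a selection such as '1:10,14,15' into sorted 1-based indices."""
    chosen = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            start_text, end_text = part.split(":", 1)
            # An empty start means "from record 1"; an empty end means
            # "through the end of the file".
            start = int(start_text) if start_text else 1
            end = int(end_text) if end_text else total
        else:
            start = end = int(part)
        # Assumption: out-of-range values are clipped to existing records.
        chosen.update(range(max(start, 1), min(end, total) + 1))
    return sorted(chosen)
```
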
   -fieldAsName <field>  - Field in a SD file (-isd), a Maestro file (-imae)
                           or a Canvas project to be used as name of a target
                           molecule.
   -index <indexFile>    - Use a previously generated index file of all
                           the molecules in <inputFile> for the search. A
                           matching fingerprint will be generated for each
                           query.
   -newIndex <indexFile> - Generate an index file of both the target molecules
                           and the queries before the search. The saved
                           <indexFile> of the target molecules can be used
                           later with the -index option.
   -noIndex              - Do not use any index for search, even if present.
                           Overrides the above two options.
   -filter               - Filter the target file based on maximum and/or
                           minimum number of counts of a given set of patterns.
   -reos                 - Rapid Elimination of Swill, a set of rules to
                           identify lead-like molecules.
   -file <ruleFile>      - User-supplied rule file to use. The file must have
                           the following format:
                           Each line contains one SMARTS/SMILES string, followed
                           by the minimum and maximum number of allowed counts
                           and an optional comment surrounded by double quotes.
   -d <delimiter>        - Delimiter used to separate each field in <ruleFile>.
                           The default is tab '\t'. Use -d ' ' or -d " "
                           for space.  If space-delimited, consecutive spaces
                           will be ignored. Note that the use of ',' as a
                           delimiter is NOT supported.
   -maxVio <n>           - Maximum number of violations allowed for the rules.
                           By default <n> is set to 0.
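
As an illustration of the <ruleFile> layout and the -maxVio threshold, the
following Python sketch parses one rule line and counts violations. It does no
substructure matching: the per-pattern counts are supplied by the caller. The
names parse_rule, count_violations and passes_filter are hypothetical, and the
example patterns are illustrative only.

```python
import re

def parse_rule(line, delimiter="\t"):
    """Split one rule line into (pattern, minimum, maximum, comment)."""
    line = line.strip()
    if delimiter == " ":
        # Space-delimited: runs of consecutive spaces count as one delimiter.
        fields = re.split(r" +", line, maxsplit=3)
    else:
        fields = line.split(delimiter, 3)
    comment = fields[3].strip().strip('"') if len(fields) > 3 else ""
    return fields[0], int(fields[1]), int(fields[2]), comment

def count_violations(counts, rules):
    """Count rules whose match count falls outside [minimum, maximum]."""
    return sum(
        1 for count, (_, low, high, _) in zip(counts, rules)
        if count < low or count > high
    )

def passes_filter(counts, rules, max_vio=0):
    """Mimic -maxVio: keep a target with at most max_vio violations."""
    return count_violations(counts, rules) <= max_vio
```
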
   -helpREOS             - Print out REOS patterns to stdout, each followed by
                           the minimum and maximum number of counts.  Tab is
                           used as delimiter in each line.
   -q<fmt> <queryFile>   - File containing a list of queries:
                           -qsmi = SMILES/SMARTS
                           -qmae = Maestro file
                           -qsd  = SD file
                           -qmol = MDL mol file
                           By default, all queries must be matched.
   -require <n>          - Minimum number of queries that must be matched.
                           Not valid with -qmol.
   -exact                - Require exact match for each query. Default is to
                           match by substructure.
   -o<fmt> <outputFile>  - <outputFile> contains target molecules that passed
                           filter (if -filter is used) and matched all or the
                           required number of queries.
                           -osmi = If target molecules are supplied as SMILES
                           (-ismi), the original SMILES will be used. Otherwise
                           SMILES strings are generated by Canvas. Overridden by
                           -u which will use unique SMILES.
                           -osd  = SD format. If <outputFile> ends with .sdf.gz
                           or .sd.gz, writes in compressed format.
                           -omae = Maestro format. If <outputFile> ends with
                           .maegz or .gz, writes in compressed format.
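
The file-name rules for compressed output can be summarized in a short Python
sketch. It only restates the suffixes listed above (which the -o<fmt>2 options
share); the helper name is_compressed is hypothetical.

```python
def is_compressed(output_flag, path):
    """Return True if <outputFile> would be written in compressed format."""
    suffixes = {
        "-osd": (".sdf.gz", ".sd.gz"),
        "-omae": (".maegz", ".gz"),
    }
    # "-osd2" and "-omae2" follow the same suffix rules as "-osd" and "-omae".
    flag = output_flag.rstrip("2")
    return path.endswith(suffixes.get(flag, ()))
```
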
   -o<fmt>2 <outputFile> - <outputFile> contains target molecules that do not
                           satisfy the required matches. Not valid when
                           searching with index.
                           -osmi2 = If target molecules are supplied as SMILES
                           (-ismi), the original SMILES will be used. Otherwise
                           SMILES strings are generated by Canvas.
                           -osd2  = SD format. If <outputFile> ends with .sdf.gz
                           or .sd.gz, writes in compressed format.
                           -omae2 = Maestro format. If <outputFile> ends with
                           .maegz or .gz, writes in compressed format.
   -no2DCoord            - Do not generate coordinates in the output <sdFile>
                           or <maeFile> if -ismi <smiFile> is used as input.
                           By default, 2D coordinates are generated in the
                           above case.
   -v3                   - Output MDL version 3 SD Format.
   -matchCount <csvFile> - Calculates the number of matches to each query or
                           filtering pattern, 0 for no match.  Counts are saved
                           in <csvFile>.  If -filter and -q<smi/mae/sd> are
                           both used, only queries in the latter are listed.  By
                           default, only targets that passed the filter
                           (if -filter is used) and matched all or the required
                           number of queries are printed to <csvFile>.
                           If <csvFile> is omitted and -osd or -omae is given,
                           the counts are written to that output file instead.
   -comment              - Set query names to the contents of the comment field
                           from the filter file, or to the query molecule title
                           for other input types. Use an automatically generated
                           name (query1, query2, etc.) if this is blank.
   -prefix <queryPrefix> - Each query in the <csvFile> will be represented by
                           the following format: <queryPrefix>::query<n>.
                           <queryPrefix> can be a search name, such as
                           "my_search1".
   -qmap <queryMapFile>  - This file provides the mapping between the
                           <queryPrefix>::query<n> names described above and
                           the actual SMARTS/SMILES query patterns.
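
The <queryPrefix>::query<n> naming scheme can be illustrated in Python as a
mapping from generated names to patterns. This sketch does not reproduce the
layout of <queryMapFile>, which is not specified here; the function
query_names is hypothetical, and numbering from 1 is an assumption based on
the "query1, query2" example above.

```python
def query_names(patterns, prefix):
    """Map '<prefix>::query<n>' names to the given query patterns."""
    # Assumption: queries are numbered from 1 in the order they are given.
    return {
        f"{prefix}::query{n}": pattern
        for n, pattern in enumerate(patterns, start=1)
    }
```
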
   -showAll              - If this option is used with -matchCount, counts for
                           all targets are printed out to <csvFile>.
   -strict               - Perform additional validation of each target input
                           structure prior to matching. This reduces
                           performance.
   -allowRelative        - Allow relative stereochemistry matches.
   -useXforAromN         - Add explicit connectivities for all aromatic
                           nitrogens appearing in -qsmi, -qsd, and -qmae.
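
As a usage illustration, the following Python sketch assembles an argument
list for a simple SMILES substructure search using only options documented
above. The function build_command and all file names are hypothetical; the
list could be handed to subprocess.run on a machine where the program is
installed, which this sketch does not attempt.

```python
import os

def build_command(targets, queries, output, require=None, schrodinger=None):
    """Assemble a canvasSearch argument list for SMILES input and output."""
    # Fall back to the SCHRODINGER environment variable if no root is given.
    root = schrodinger or os.environ.get("SCHRODINGER", "$SCHRODINGER")
    cmd = [f"{root}/utilities/canvasSearch", "-ismi", targets, "-qsmi", queries]
    if require is not None:
        # Keep targets that match at least <require> of the queries.
        cmd += ["-require", str(require)]
    cmd += ["-osmi", output]
    return cmd
```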