canvasFPMatrix Command Help
Command: $SCHRODINGER/utilities/canvasFPMatrix
canvasFPMatrix - Generates a pairwise similarity or distance matrix
using binary or scaled fingerprints from one or two
sets of molecules.
Usage: canvasFPMatrix -ifp <fpFile>
-o <binaryFile>
-ocsv <csvFile>
-filter <cutoff> [-all]
-sort avg|best [-caprow <N>]
[-capcol <N>]
[-metric <metricName> | <metric ordinal>]
[-helpMetric]
[-range <range>]
[-ifp2 <fpFile> [-range2 <range>]]
[-block <blockSize>]
[-forceBinary]
[-alpha <alpha>]
[-beta <beta>]
[-flatten <coeff>]
[-limitOffBits]
[-forceFilter]
[-JOB <jobName> [-HOST <host>]
[-LOCAL]
[-TMPDIR <dir>]
[-WAIT]]
-ifp <fpFile> - Binary fingerprints generated from canvasFPGen.
If only one input file is used, matrix values are
calculated based on pairs within the file.
-o <binaryFile> - Output file of the pairwise matrix, binary format.
-ocsv <csvFile> - Output file of the pairwise matrix, in comma-
separated csv format.
Both formats can be used in subsequent steps.
-filter <cutoff> - Only report rows having at least one value
better than <cutoff>.
Comparisons involving items having the same
identifier are not considered for purposes of this
filtering.
-all - Used with -filter. Report only rows where all values
are better than <cutoff>.
-sort - Sort output rows by value - 'avg' (default) uses a
row-average for ranking, while 'best' uses the
nearest item in each row. Calculation is done after
any filtering. Again, self-comparisons are not
considered.
-caprow <N> - After sorting, report only the top <N> rows.
-capcol <N> - For each reported row, show only the top <N> columns.
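
The -filter, -all, -sort, -caprow and -capcol options can be read as a small
post-processing pipeline over the matrix rows. The Python sketch below is one
possible reading of that description, not the tool's actual implementation.
It assumes a similarity metric (higher is better, so "better than <cutoff>"
means greater than), and the function name select_rows and the dict-of-dicts
matrix layout are hypothetical.

```python
def select_rows(matrix, cutoff=None, require_all=False,
                sort="avg", caprow=None, capcol=None):
    """Filter, rank and truncate rows of a similarity matrix.

    matrix: dict mapping row id -> dict mapping column id -> similarity.
    Assumption: higher values are better (a similarity metric).
    """
    rows = []
    for row_id, cols in matrix.items():
        # Comparisons between items with the same identifier are ignored.
        vals = {cid: v for cid, v in cols.items() if cid != row_id}
        if not vals:
            continue
        if cutoff is not None:
            hits = [v > cutoff for v in vals.values()]
            # -all requires every value to pass; plain -filter needs one.
            if not (all(hits) if require_all else any(hits)):
                continue
        rows.append((row_id, vals))

    if sort == "avg":
        rank = lambda r: sum(r[1].values()) / len(r[1])  # row average
    else:
        rank = lambda r: max(r[1].values())              # nearest item
    rows.sort(key=rank, reverse=True)

    if caprow is not None:
        rows = rows[:caprow]          # keep only the top N rows

    result = []
    for row_id, vals in rows:
        ranked = sorted(vals.items(), key=lambda kv: kv[1], reverse=True)
        if capcol is not None:
            ranked = ranked[:capcol]  # keep only the top N columns
        result.append((row_id, ranked))
    return result
```

For a distance metric the comparisons and sort direction would presumably be
reversed; the sketch does not cover that case.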
-metric <metricName> - Metric type may be one of the following:
"buser", "cosine", "dice", "dixon", "euclidean",
"hamann", "hamming", "kulczynski", "matching",
"mcConnaughey", "minmax", "modifiedTanimoto",
"patternDifference", "pearson", "petke",
"rogersTanimoto", "shape", "simpson", "size",
"soergel", "tanimoto", "tversky", "variance"
and "yule".
Metrics may also be referenced by ordinal (1-24).
Default is "tanimoto" (21).
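
The name-to-ordinal mapping can be sketched in Python as follows. This assumes
the ordinals follow the order in which the names are listed above, which is
consistent with "tanimoto" being 21; the helper names metric_ordinal and
metric_name are hypothetical. Running -helpMetric is the reliable way to
confirm the numbering.

```python
# Metric names in the order the help text lists them.
# Assumption: ordinals are 1-based positions in this list.
METRICS = [
    "buser", "cosine", "dice", "dixon", "euclidean",
    "hamann", "hamming", "kulczynski", "matching",
    "mcConnaughey", "minmax", "modifiedTanimoto",
    "patternDifference", "pearson", "petke",
    "rogersTanimoto", "shape", "simpson", "size",
    "soergel", "tanimoto", "tversky", "variance", "yule",
]


def metric_ordinal(name):
    """Return the assumed 1-based ordinal for a metric name."""
    return METRICS.index(name) + 1


def metric_name(ordinal):
    """Return the metric name for an assumed 1-based ordinal."""
    if not 1 <= ordinal <= len(METRICS):
        raise ValueError("ordinal must be between 1 and %d" % len(METRICS))
    return METRICS[ordinal - 1]
```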
-helpMetric - Prints the definition of each metric and exits.
-range <range> - Start and end positions, e.g. 5:10
-ifp2 <fpFile> - The 2nd input file. If set, only pairs between
the 1st and the 2nd files are calculated.
-range2 <range> - Start and end positions in the 2nd file, e.g. 5:8
-block <blockSize> - Maximum number of fingerprints to load in memory
at a time. Default is 1000.
-forceBinary - Ignore any scaled fp values in file(s).
Use binary values in all cases.
-alpha <alpha> - Tversky alpha parameter. Default 0.5.
-beta <beta> - Tversky beta parameter. Default 0.5.
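
To show what alpha and beta control, here is a Python sketch of the textbook
Tversky index on binary fingerprints represented as sets of on-bit positions.
This is the standard definition, not necessarily the exact one the tool uses
(-helpMetric prints that); the function name tversky and the choice to return
0.0 for two empty fingerprints are assumptions.

```python
def tversky(a, b, alpha=0.5, beta=0.5):
    """Textbook Tversky index for two sets of on-bit positions."""
    common = len(a & b)   # bits on in both fingerprints
    only_a = len(a - b)   # bits on only in the first
    only_b = len(b - a)   # bits on only in the second
    denom = common + alpha * only_a + beta * only_b
    # Assumption: two empty fingerprints score 0.0 rather than raising.
    return common / denom if denom else 0.0
```

In this form, alpha = beta = 0.5 reduces to the Dice coefficient and
alpha = beta = 1 reduces to Tanimoto; unequal values make the measure
asymmetric between the two fingerprints.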
-flatten <coeff> - Gaussian parameter to make output matrix sparse:
sim --> exp[-coeff*(1-sim)^2]
Applicable only to similarity metrics (buser,
cosine, dice, hamann, kulczynski, matching,
mcConnaughey, modifiedTanimoto, pearson, petke,
rogersTanimoto, simpson, tanimoto, tversky,
and yule). Default is no flattening. A reasonable
value is 25.
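
The flattening formula above can be tried out with a few lines of Python. The
function name flatten is hypothetical; the expression is the one given in the
-flatten description.

```python
import math


def flatten(sim, coeff):
    """Apply sim --> exp[-coeff*(1-sim)^2] to a single similarity value."""
    return math.exp(-coeff * (1.0 - sim) ** 2)


# With coeff = 25, identical items stay at 1.0 while a similarity of 0.5
# is pushed close to zero, which is what makes the matrix effectively sparse.
high = flatten(1.0, 25)
mid = flatten(0.5, 25)
```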
-limitOffBits - Limit the set of possible off bits. By
default, the number of off bits is limited only
by the fingerprint address space size (2^32 or
2^64), which may yield undesirable behavior for
metrics that incorporate off bits. If this
flag is used, the set of possible off bits will
be limited to those bits that are set by at
least one compound in the fingerprints provided.
Applies only to buser, hamann, matching,
modifiedTanimoto, patternDifference, pearson,
rogersTanimoto, size, shape, variance, and yule.
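
The effect of -limitOffBits can be illustrated with a Python sketch that uses
the textbook simple matching coefficient, (both on + both off) / total bits,
as a stand-in for a metric that incorporates off bits. The tool's own metric
definitions may differ, and the function name and toy fingerprints below are
hypothetical.

```python
def simple_matching(a, b, n_bits):
    """Simple matching coefficient for two sets of on-bit positions."""
    both_on = len(a & b)
    both_off = n_bits - len(a | b)  # bits that neither fingerprint sets
    return (both_on + both_off) / n_bits


# Three toy fingerprints, given as sets of on-bit positions.
fps = [{1, 5, 9}, {1, 5, 42}, {7, 8, 1000}]

full_space = 2 ** 32                    # default: whole address space
limited_space = len(set().union(*fps))  # bits set by at least one compound

# Fingerprints 0 and 2 share no on bits at all.
unlimited = simple_matching(fps[0], fps[2], full_space)
limited = simple_matching(fps[0], fps[2], limited_space)
```

With the full address space, the shared off bits dominate and the two
unrelated fingerprints score almost 1; with the limited bit set they score
1/7.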
-forceFilter - Allow comparisons between two inputs that are
of a consistent type, but were created with
different filtering options. The default is to
abort the computation.
-JOB <jobName> - Run under Schrodinger job control using the supplied
job name. If omitted, no other job control options
are permitted.
-HOST <host> - Run job on <host>. Only single-CPU jobs are
supported.
-LOCAL - Store temporary job files in current directory.
-TMPDIR <dir> - Store temporary job files in <dir>.
-WAIT - Do not return prompt until job completes.
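
As a usage illustration, the Python sketch below assembles a command line from
the options documented above. It only builds the argument list and does not
run anything; the helper name build_command and the file names are
hypothetical, and only flags listed in this help text are used.

```python
import os


def build_command(ifp, ocsv, metric="tanimoto", ifp2=None, flatten=None):
    """Build an argument list for canvasFPMatrix (illustrative only)."""
    # Falls back to the literal "$SCHRODINGER" if the variable is not set.
    root = os.environ.get("SCHRODINGER", "$SCHRODINGER")
    exe = os.path.join(root, "utilities", "canvasFPMatrix")
    cmd = [exe, "-ifp", ifp, "-ocsv", ocsv, "-metric", metric]
    if ifp2 is not None:
        cmd += ["-ifp2", ifp2]              # compare 1st file against 2nd
    if flatten is not None:
        cmd += ["-flatten", str(flatten)]   # similarity metrics only
    return cmd


# Hypothetical file names for a library-versus-query comparison.
cmd = build_command("library.fp", "matrix.csv", ifp2="query.fp")
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a
machine where the Schrodinger utilities are installed.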