Prime Command Input File

The command input file contains a list of keywords that control the execution of the program, one per line. The keywords are listed in the tables below. Generally valid keywords are listed in Table 1, and protocol-specific keywords are listed in the following tables. The keywords and values are case-insensitive, except where the case is important, such as for file names.
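As an illustration of the one-keyword-per-line layout, the following minimal Python sketch reads such a file into a dictionary. It assumes that the keyword and its value are separated by whitespace, as in the syntax columns of the tables; the function name parse_prime_input and the file name MyProtein.mae are hypothetical, and this is not the parser that the program itself uses.

```python
def parse_prime_input(text):
    """Parse keyword/value lines into a dict keyed by upper-cased keyword."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first run of whitespace: keyword, then the rest as value.
        parts = line.split(None, 1)
        keyword = parts[0].upper()  # keywords are case-insensitive
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[keyword] = value   # keep the value's case (file names matter)
    return settings


# Hypothetical input for a minimization job, using keywords from the tables.
example = """\
JOB_TYPE REFINE
prime_type REAL_MIN
STRUCT_FILE MyProtein.mae
"""
settings = parse_prime_input(example)
print(settings)
```

Keywords are normalized to upper case so that lookups do not depend on how the file was typed, while values are left untouched because file names are case-sensitive.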

Table 1. General keywords

Keyword syntax

Description

JOB_TYPE job

Type of job to run. The available options are:

MODEL

Build a structure for a homology model.

REFINE

Refine a structure.

OPLS_VERSION {OPLS2005|OPLS4}

OPLS force field version to use.

REDUN_CUTOFF tol

RMSD cutoff for eliminating redundant conformers, where relevant.

REMOVE_REDUNDANT {true|false}

Where multiple structures can be generated, remove redundant structures, as determined by REDUN_CUTOFF.

Table 2. Keywords for homology model building jobs

Keyword syntax

Description

BUILD_DELETIONS {true|false}

Required. Turn on or off closure of the chain breaks near deletions in the alignment. If not closed, they will remain as chain breaks in the output structure.

BUILD_TRANSITIONS {true|false}

Required. Turn on or off closure of the junctions between templates in multi-template homology modeling jobs. If not closed, they will remain as chain breaks in the output structure.

COMPOSITE_ARRAY string

A string of integers indicating from which template (or, more specifically, from which alignment) coordinates for a given residue should be obtained. Thus, this string of integers must be equal in length to the query sequence being used in the alignment files. COMPOSITE_ARRAY is required for multiple template jobs. It is not required for single template jobs, but if it is present, it is ignored.

FAST_SIDE_OPT {true|false}

Turn on or off fast side chain optimization, in which a set of side chains is selected that has no steric clashes but no energy minimization is performed. Requires SIDE_OPT. Default: false.

KEEP_ROTAMERS {true|false}

Required. Specifies which side chains to optimize. If true, the rotamers for conserved side chains are preserved, and only non-conserved side chains are predicted. If false, all side chains are predicted. This keyword is ignored if SIDE_OPT is false.

KNOWLEDGE_BASED {true|false}

Specify the method used to build the model: knowledge-based (true), which is faster and does not do any minimization, or energy-based (false), which uses the full model-building capabilities. Default: false.

MAX_INSERTION_SIZE n

Specifies the longest insertion in the alignment that will be built. Insertions longer than this are omitted and do not appear in the output structure. Residue numbering still reflects the full query sequence, however. Default: 1000.

MINIMIZE {true|false}

Turn on or off minimization of regions that were not directly copied from the template structure. These regions primarily include portions of the structure involved in closing gaps or building insertions, but also include any side chains optimized due to the SIDE_OPT keyword. Default: true.

MIN_OVERLAP value

Fraction of van der Waals distance used to define a clash. Smaller values allow structures with worse potential atomic overlaps to be generated. Default: 0.70.

NUM_OUTPUT_STRUCT n

Number of output model structures to return. Only valid with KNOWLEDGE_BASED true.

SIDE_OPT {true|false}

Turn on or off a side chain optimization stage. Default: true.

template_ALIGN_FILE filename

Required. The name of the file containing the alignment between this template and the query. Details on the format are provided in the Examples section below.

template_HETERO_i ligand-spec

Required if a ligand is present. Specifies a ligand to include from this template. The format of the specification is
AAA C:###

where AAA is the ligand’s three-letter code, C is its chain ID, and ### is its residue number. Multiple ligands from the template are numbered using i as an index, starting from zero.

TEMPLATE_NAME template

Required. A label used to identify a given template in subsequent data fields. For example, if TEMPLATE_NAME is set to 1BPJ_A, other template-specific fields are given as 1BPJ_A_STRUCT_FILE, 1BPJ_A_ALIGN_FILE, and so on. The chain to be used is specified by an underscore and letter suffix: for example, for 1BPJ_A, chain A will be used. The chain suffix must be included in the label.

template_NUMBER n

Required. Used to identify a given template in the COMPOSITE_ARRAY (see above). Numbering goes from 1 to 9.

template_STRUCT_FILE filename

Required. The PDB file to be used for this particular template.

TEMPLATE_RESIDUE_NUMBERS {true|false}

Number the residues in the built structure the same as in the template, as far as possible. Does not apply to consensus modeling. When using multiple templates for a single chain, the residue numbers are taken from the first template. Sequential numbering from 1 is used if the attempt to use template numbering fails.
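The COMPOSITE_ARRAY and template-prefixed keywords in Table 2 can be illustrated with a short Python sketch. It assumes that COMPOSITE_ARRAY holds one template number (a single digit from 1 to 9) per query residue with no separators; the table does not spell this out, so treat it as an assumption. The segment tuples, the function names, and the residue ranges are hypothetical and serve only as an example.

```python
def build_composite_array(query_length, segments):
    """Return one template number per query residue as a digit string.

    segments: (first, last, template_number) tuples, 1-based and inclusive.
    This segment representation is hypothetical, chosen for illustration.
    """
    digits = [None] * query_length
    for first, last, number in segments:
        if not 1 <= number <= 9:
            raise ValueError("template numbers run from 1 to 9")
        if not 1 <= first <= last <= query_length:
            raise ValueError("segment lies outside the query sequence")
        for position in range(first - 1, last):
            digits[position] = str(number)
    if None in digits:
        raise ValueError("every query residue needs a template number")
    return "".join(digits)


def template_keyword(label, field):
    """Prefix a field name with a template label such as 1BPJ_A."""
    name, sep, chain = label.rpartition("_")
    if not (name and sep and len(chain) == 1 and chain.isalpha()):
        raise ValueError("label must end in an underscore and a chain letter")
    return f"{label}_{field}"


# A 10-residue query: residues 1-6 from template 1, residues 7-10 from template 2.
composite = build_composite_array(10, [(1, 6, 1), (7, 10, 2)])
print("COMPOSITE_ARRAY", composite)
print(template_keyword("1BPJ_A", "STRUCT_FILE"))
```

The length check mirrors the requirement that the string be equal in length to the query sequence, and the label check mirrors the requirement that the chain suffix be part of the template label.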

Table 3. Keywords for all refinement jobs

Keyword syntax

Description

ADD_MISSING_SIDE_CHAINS {yes|no}

Add missing side chains in a default extended conformation. No sampling or refinement of the added side chains is performed.

CONSTRAINT_i constraint

Define a constraint on atom positions, distances, angles or dihedral angles. See text for syntax details.

ECUTOFF value

Energy cutoff for return of conformations. Returns all conformations within value kcal/mol of the lowest-energy structure. It can be used in conjunction with NUM_OUTPUT_STRUCT, to impose a maximum on the number of such conformations to return. As with NUM_OUTPUT_STRUCT, it only applies to single loop refinements.

ENTRY list

Comma-separated list of entries in the input structure file to refine. Default: refine all structures.

EXT_DIEL value

Dielectric constant of the continuum solvation model. Default: 80.0. Used with SGB_MOD sgbnp.

INT_DIEL value

Dielectric constant used within the radius of an atom. Default: 1.0.

MIN_OVERLAP value

Fraction of van der Waals distance used to define a clash. Smaller values allow structures with worse potential atomic overlaps to be generated. Valid for LOOP_BLD and HELIX_BLD jobs. Default: 0.70.

NUM_OUTPUT_STRUCT n

Number of conformations to return for single loop refinements or cooperative side-chain refinements. This has no effect on helix refinements or jobs composed of more than one refinement (e.g. 2 loops, 1 helix + 1 loop, etc.). All structures are returned in a single structure file in Maestro format. Default: 1.

PAIR_CONSTRAINT_i atom1,atom2,dist[,strength]

Specify a constraint on a pair of atoms. The format of the atom specifications is given at the beginning of this chapter. The value of dist is the target distance between the atoms in angstroms. The value of strength is the coefficient of the harmonic constraint potential in kcal mol⁻¹ Å⁻², which has a default value of 350.

PLANARITY_RESTRAINT value

Specify the factor used to multiply the potential for improper torsions, to reduce the amount by which rings can deviate from planarity.

PRIME_TYPE type

Type of calculation to perform when running a refinement job (JOB_TYPE REFINE). The available options are:

SIDE_PRED

Predict side chains (prefix SC)

SIDE_COMBI

Predict side chains cooperatively and exhaustively

LOOP_BLD

Predict loops and helices

EXTENDED

Predict extended loops (6-11 residues)

LONG_LOOP_2

Predict ultra-extended loops (10 or more residues)

LOOP_PAIR

Cooperative refinement of two loops

HELIX_BLD

Rigid-body refinement of helices

REAL_MIN

Minimize energy (prefix MINI)

SITE_OPT

Active site minimization (including ligand)

ENERGY

Energy calculation

RESIDUE_i spec

Specify a residue to refine. The index i begins at 0 and increases by 1 for each additional residue. The format of the residue specification spec is given at the beginning of this chapter.

RETAIN_CORRECTIONS {yes|no}

Retain corrections made when duplicate residue numbers are found. The duplications detected are between non-protein residues or between protein and non-protein residues. The non-protein residue is assigned a new, unique chain name temporarily. If you set this keyword to yes, the new chain name is written to the output file. Default: no.

SEED n

The integer to use as the seed for the random number generator if USE_RANDOM_SEED is no. If it is yes, SEED is ignored. Default: -1, to generate a random seed rather than use the supplied seed.

SELECT string

Specify how the residues to refine are selected. Available options are:

all

Refine all residues.

pick

Refine the specific residues indicated by the included RESIDUE_i parameters (see above).

file

Read the list of residues to refine from a file. The file consists of a series of lines with a residue specification on each line.

SGB_MOD {sgbnp_ecorr|vsgb2.0|vsgb2.1|vac|chloroform}

Specify the implicit solvation model to use. The choices are:

sgbnp_ecorr: a modified version of the standard generalized Born model with several energy corrections developed by the Jacobson lab.

vsgb2.0: version 2 of the variable-dielectric generalized Born model, which incorporates a wide range of residue-dependent effects. Default for OPLS_2005.

vsgb2.1: version 2 of the variable-dielectric generalized Born model, reparameterized specifically for OPLS2.0 (and its successors). Default for OPLS4.

vac: vacuum (no solvent model). Default for macrocycle sampling.

chloroform: standard generalized Born model incorporating chloroform radius and dielectric.

STRUCT_FILE filename

Input structure file in Maestro format, compressed (.mae.gz, .maegz) or uncompressed (.mae). If the file contains multiple structures, each structure is refined, or the structures specified by ENTRY.

USE_CRYSTAL_SYMMETRY {yes|no}

Set to yes if the input structure contains unit cell information and crystal symmetric atoms are to be included in the calculation. Default: no.

USE_RANDOM_SEED {yes|no}

Indicates whether to use a random seed for the random number generator. This keyword has no effect on Minimization tasks. Default: no.

USE_MAE_CHARGES {yes|no}

Indicates whether to use atomic partial charges from the Maestro input file for untemplated residues (ligands, cofactors). Default: no.

USE_MEMBRANE {yes|no}

Indicates whether to use the implicit membrane model. The model must be set up from the Setup Membrane panel in Maestro. Default: no.
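The interplay of ECUTOFF and NUM_OUTPUT_STRUCT in Table 3 can be sketched in Python as follows. This is only an illustration of the selection rule as described (keep conformations within the cutoff of the lowest energy, then cap the count), not the program's implementation. The function name and the energies are made up, and treating the cutoff as inclusive is an assumption.

```python
def select_conformations(energies, ecutoff=None, num_output_struct=None):
    """Return indices of the conformations to keep, lowest energy first."""
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    if not order:
        return []
    if ecutoff is not None:
        lowest = energies[order[0]]
        # Keep conformations within ecutoff kcal/mol of the lowest energy
        # (inclusive comparison is an assumption).
        order = [i for i in order if energies[i] - lowest <= ecutoff]
    if num_output_struct is not None:
        # NUM_OUTPUT_STRUCT imposes a maximum on the number returned.
        order = order[:num_output_struct]
    return order


# Made-up energies in kcal/mol for four loop conformations.
energies = [-1502.3, -1498.0, -1501.1, -1490.7]
within_cutoff = select_conformations(energies, ecutoff=5.0)
kept = select_conformations(energies, ecutoff=5.0, num_output_struct=2)
print(within_cutoff, kept)
```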

Table 4. Keywords for loop refinement jobs

Keyword syntax

Description

BURIED_i spec

Constrain the specified residue to be buried (no more than 20% of surface area exposed to solvent).

CA_CONSTRAINT_i spec,[x,y,z,]value

Specify a constraint on the position of a specific alpha carbon atom. The format of the residue specification spec is given at the beginning of this chapter. The optional coordinates specify the target position; if omitted, the target position is the initial position. The value is the maximum distance that the C-alpha atom can move from the target position, in angstroms. Specifying a target position allows you to move a loop to a desired location.

EXPOSED_i spec

Constrain the specified residue to be exposed (more than 40% of surface area exposed to solvent).

LOOP_i_RES_j spec

Specify residue j for defining loop i. Loops are numbered sequentially, beginning at 0. Two occurrences of this keyword are required, with j=0 (the beginning of the loop) and j=1 (the end of the loop). spec is a residue specification as defined above.

HELIX_BUILD spec1/spec2

Build the specified sequence as an alpha helix. The format of the residue specifications spec1 and spec2 is given at the beginning of this chapter. You should include enough residues on either side of the helical region in the loop prediction to ensure that sufficient flexibility is available to build the loop.

MAX_CA_MOVEMENT value

Specify the maximum distance that any alpha carbon atom in the loop should be allowed to move during refinement. Omitting this parameter indicates that no restriction should be applied.

MEMBRANE {inside|outside}

Place the C-alpha atoms of the loop either inside or outside the membrane in the building stage. Refinement can move the residues across the membrane boundary.

RES_SPHERE value

All side chains that lie within this distance of the loop will also be optimized during refinement. This allows these nearby side chains to “react” to the loop being predicted. Default: 0.0.

LOOP_NCLUST number

Specify the number of clusters to score for loop prediction. More clusters should result in more accurate predictions albeit with greater computational cost.
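The LOOP_i_RES_j numbering in Table 4 can be illustrated with a small Python helper that writes the two required keywords for each loop. The residue specifications A:10 and A:20 are assumed to follow the chain:number form used in the SEG_LIST example in Table 5; the authoritative format is the one given at the beginning of this chapter. The function name is hypothetical.

```python
def loop_keywords(loops):
    """Build LOOP_i_RES_j lines from (start_spec, end_spec) pairs."""
    lines = []
    for i, (start, end) in enumerate(loops):     # loops are numbered from 0
        lines.append(f"LOOP_{i}_RES_0 {start}")  # j=0: beginning of the loop
        lines.append(f"LOOP_{i}_RES_1 {end}")    # j=1: end of the loop
    return lines


# One loop running from residue 10 to residue 20 of chain A (assumed spec form).
single = loop_keywords([("A:10", "A:20")])
print("\n".join(single))
```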

Table 5. Additional keywords for extended loop and multiple loop refinement jobs

Keyword syntax

Description

MAX_CA_REF1 value

Specify the maximum distance that any alpha carbon atom in the loop should be allowed to move during refinement stage 1 (Ref1). Omitting this parameter indicates that no restriction should be applied.

MAX_CA_REF2 value

Specify the maximum distance that any alpha carbon atom in the loop should be allowed to move during refinement stage 2 (Ref2). Omitting this parameter indicates that no restriction should be applied.

MAX_JOBS n

Maximum number of subjobs that can run at one time. This can be set to a value less than the value of THREADS if sufficient resources are not available. Not available with refinestruct.

THREADS n

Number of simultaneous jobs to be run during selected refinement stages. Determines the sampling level for some protocols. Note: the jobs specified by THREADS do not have to actually run in parallel—see MAX_JOBS.

NUM_FIXED_STAGE n

Specify the number of stages in which backbones of residues at the beginning or end of the loop are fixed. These stages are inserted between the CA refinement stages. Each stage fixes progressively more residues, starting with 1. The fixed residues are distributed in all combinations between the ends of the loop, so, for example, stage 4 has 5 combinations and therefore 5 refinement jobs. Default: 5.

OFAC_INIT_STAGE factors

Run several initial stages with the specified minimum overlap factors (see MIN_OVERLAP). The factors are specified as a list separated by slashes, with no spaces. One stage is run for each factor. Default: 0.65/0.70/0.75.

PRIME_TYPE n

Type of refinement to be carried out. Can be specified as either a number or text string. Different from refinestruct. Available choices are:

0 or default: Equivalent to running a single loop refinement using refinestruct. Not particularly useful except for completeness and testing purposes. Does not recognize THREADS keyword.

1 or extended: Used for Extended Sampling from the GUI. Does not recognize MIN_OVERLAP and MAX_CA_MOVEMENT keywords, as these are used internally by the protocol. Accepts multiple loops as input which are run sequentially.

2 or long_loop: No longer used.

3 or loop_pair: Used for Cooperative Loop Sampling from the GUI. Exactly two loops must be specified.

4 or long_loop_2: Used for Ultra-extended Sampling from the GUI. Does not recognize the MAX_CA_MOVEMENT keyword, as this is used internally by the protocol. Only one loop can be specified.

STRUCT_FILE filename

Required. The input structure file in Maestro format, compressed or uncompressed. Same as for refinestruct.

SEG_LIST filename

Name of a file containing the list of loops to be refined. For the example given for the keyword LOOP_i_RES_j, the file would have a single line:

loop A:10 A:20

Not available with refinestruct.
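The job counts implied by NUM_FIXED_STAGE and the slash-separated format of OFAC_INIT_STAGE in Table 5 can be sketched in Python. The sketch assumes that stage s fixes s residues in total, split in every possible way between the two ends of the loop, which reproduces the stated figure of 5 combinations for stage 4. The function names are hypothetical, and this is not the protocol's own code.

```python
def fixed_stage_combinations(stage):
    """(fixed at loop start, fixed at loop end) pairs for one fixed stage."""
    # Assumption: stage s fixes s residues in total across both loop ends.
    return [(k, stage - k) for k in range(stage + 1)]


def fixed_stage_job_counts(num_fixed_stage=5):
    """Number of refinement jobs in each fixed stage, starting with stage 1."""
    return [len(fixed_stage_combinations(stage))
            for stage in range(1, num_fixed_stage + 1)]


def parse_overlap_factors(value="0.65/0.70/0.75"):
    """Split an OFAC_INIT_STAGE value into one overlap factor per stage."""
    return [float(factor) for factor in value.split("/")]


stage4 = fixed_stage_combinations(4)
counts = fixed_stage_job_counts()
factors = parse_overlap_factors()
print(stage4, counts, factors)
```

With the default of 5 fixed stages, this reading gives 2, 3, 4, 5, and 6 jobs for stages 1 through 5, which is useful when choosing MAX_JOBS.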

Table 6. Keywords for active site optimization (SITE_OPT) jobs

Keyword syntax

Description

BACKBONE_LEN n

Number of residues to include in the loop defined for backbone sampling in side-chain prediction. Only used with SAMPLE_BACKBONE. Default: 3.

LIGAND string

Residue specification for the ligand.

MAXCONEANG value

Maximum angular displacement of the CA-CB vector from the initial position when sampling CB positions. Only used with SAMPLE_CBETA. Default: 30°.

NITER_SIDE n

Number of iterations of side-chain prediction to perform. Larger numbers generally give better results. If set to zero, only the initial selection of rotamers without clashes is performed. Default: 1.

NPASSES n

Number of passes through side chain optimization of the protein residues followed by minimization of the protein residues (including the backbone). These two steps are repeated the specified number of times before proceeding to optimization of the protein and the ligand. Default: 2. Files written from Maestro have NPASSES set to 1.

SAMPLE_BACKBONE {yes|no}

Sample the backbone by running a loop prediction on a set of residues centered on the residue for which the side chain is being refined. The number of residues in the loop is defined by the BACKBONE_LEN keyword. Default: no.

SAMPLE_CBETA {yes|no}

Sample CB positions in a conical region around the initial position. The maximum displacement of the CA-CB vector in this region is set by the MAXCONEANG keyword. Default: no.

Table 7. Keywords for helix rigid refinement (HELIX_BLD) jobs

Keyword syntax

Description

HELIX_0_RES_i spec

Specify helix and region containing helix. Values of i are:

0

start of region containing helix

1

start of helix

2

end of helix

3

end of region containing helix

A keyword with each value of i must be included in the input.

MAX_MOVEMENT value

Maximum amount of movement allowed in the C and the N terminus of the helix. Overrides MAX_C_MOVEMENT and MAX_N_MOVEMENT. Default: 5 Å.

MAX_C_MOVEMENT value

Maximum amount of movement allowed in the C terminus of the helix. Default: 5 Å.

MAX_N_MOVEMENT value

Maximum amount of movement allowed in the N terminus of the helix. Default: 5 Å.

ORIGIN_FOR_ROTATION value

Fraction of the distance from the start to the end of the helix that is used as the origin for rotation. Default: 0.5.

ROLL_HI value

Maximum amount of roll sampled, in degrees. Default: 0.

ROLL_LO value

Minimum amount of roll sampled, in degrees. Default: 0.

ROLL_STEP value

Increment in roll between ROLL_LO and ROLL_HI, in degrees.

SIDE_OPT_DISTANCE_CUTOFF value

Distance cutoff, in angstroms. Side chains within this distance of the helix region are optimized.
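The roll keywords in Table 7 describe a range and an increment. The Python sketch below lists the roll angles that such a range would contain, assuming that ROLL_LO is the lower bound, ROLL_HI is the upper bound, and both endpoints are sampled; the table does not state how the endpoints are treated, so this is an assumption. The function name is hypothetical.

```python
def roll_angles(roll_lo, roll_hi, roll_step):
    """List roll angles from roll_lo to roll_hi in increments of roll_step."""
    if roll_step <= 0:
        raise ValueError("ROLL_STEP must be positive")
    if roll_lo > roll_hi:
        raise ValueError("ROLL_LO must not exceed ROLL_HI")
    # Small tolerance so that floating-point steps still reach the upper bound.
    count = int((roll_hi - roll_lo) / roll_step + 1e-9) + 1
    return [roll_lo + i * roll_step for i in range(count)]


angles = roll_angles(-10, 10, 5)
no_roll = roll_angles(0, 0, 5)  # both bounds at their default of 0
print(angles, no_roll)
```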

Table 8. Keywords for side-chain prediction (SIDE_PRED) jobs

Keyword syntax

Description

NITER_SIDE n

Number of iterations of side-chain prediction to perform. Larger numbers generally give better results. If set to zero, only the initial selection of rotamers without clashes is performed. Default: 1.

Table 9. Keywords for minimization (REAL_MIN) jobs

Keyword syntax

Description

MINIM_NITER n

Maximum number of minimization cycles for the solvent treatment. Each cycle consists of a minimization to the specified RMS gradient (see MINIM_RMSG) with a particular set of solvent parameters, after which the solvent parameters are updated. The iterative procedure finishes if the energy changes by less than 1 kcal/mol or the maximum number of cycles is reached. Default: 2.

MINIM_RMSG value

RMS gradient convergence threshold. Default: 0.01 kcal mol⁻¹ Å⁻¹.
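The iterative procedure controlled by MINIM_NITER can be outlined in Python. This is a sketch of the control flow as described in Table 9, not the program's implementation: minimize and update_solvent are hypothetical callbacks that stand in for the real minimization and solvent update, and the energies are made up.

```python
def iterate_minimization(minimize, update_solvent, minim_niter=2, energy_tol=1.0):
    """Run up to minim_niter cycles; stop once the energy change is small.

    minimize() returns the energy after one minimization (hypothetical);
    update_solvent() refreshes the solvent parameters (hypothetical).
    """
    previous = None
    energy = None
    cycles = 0
    for cycles in range(1, minim_niter + 1):
        energy = minimize()   # minimize with the current solvent parameters
        update_solvent()      # then update the solvent parameters
        if previous is not None and abs(energy - previous) < energy_tol:
            break             # energy changed by less than 1 kcal/mol
        previous = energy
    return cycles, energy


# Made-up energies (kcal/mol) returned by successive minimizations.
trace = iter([-100.0, -103.0, -103.4, -103.5])
cycles, final_energy = iterate_minimization(
    lambda: next(trace), lambda: None, minim_niter=5
)
print(cycles, final_energy)
```

In this made-up run the third cycle changes the energy by less than 1 kcal/mol, so the loop stops before the limit of 5 cycles is reached.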