
Extended PROSITE Syntax for Residue Groups

Residue groups are defined with an extended form of the PROSITE syntax. In addition to the standard sequence patterns, the extension lets you restrict residues by secondary structure, solvent exposure, and flexibility:

  • Standard IUPAC one-letter (upper case) codes are used for all amino acids.

  • Lower case x is used for any amino acid.

  • Each element of a pattern is separated with a - symbol.

  • Residues that are permitted at a given position are listed between square brackets, e.g. [ACT] means that only Ala, Cys, or Thr can appear at this position.

  • Residues that are not permitted at a given position are listed between curly brackets, e.g. {GP} means that any residue except Gly or Pro can appear at this position.

  • Repetition is indicated using parentheses, e.g. A(3) means Ala-Ala-Ala, and G(2,4) means 2 to 4 consecutive Gly residues.

  • The following lower case characters can be used for residue types:

    • a—acidic residue: [DE]
    • b—basic residue: [KR]
    • o—hydrophobic residue: [ACFILPWVY]
    • p—aromatic residue: [WYF]

  • The following lower case characters can be used to restrict residue types by property:

    • s—solvent-exposed residue
    • h—residue in helical region
    • e—residue in extended (beta strand) region
    • f—flexible residue, defined as having a B-factor above the average over all residues

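    These four characters can be appended to a residue or residue type to restrict the match by property, e.g. Ah means Ala in a helical region. As the examples below show, characters appended side by side must all apply, square brackets mean that any one of the enclosed properties may apply, and curly brackets exclude the enclosed properties.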
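The sequence-only part of the syntax above maps closely onto regular expressions. The following is a minimal Python sketch, not the parser used by the software, that translates such patterns into a regex. The names prosite_to_regex, expand, ELEMENT, and RESIDUE_TYPES are hypothetical. The sketch assumes that the a, b, o, p shorthands may appear inside both square and curly brackets, and it does not handle the s, h, e, f property characters, which require structural data.

```python
import re

# Residue-type shorthands from the list above, expanded to residue sets.
RESIDUE_TYPES = {
    "a": "DE",          # acidic
    "b": "KR",          # basic
    "o": "ACFILPWVY",   # hydrophobic
    "p": "WYF",         # aromatic
}

# One pattern element: x, a single residue or shorthand, [allowed] or
# {excluded}, optionally followed by a repetition such as (3) or (2,4).
ELEMENT = re.compile(
    r"^(?:(?P<any>x)"
    r"|(?P<one>[A-Zabop])"
    r"|\[(?P<allowed>[A-Zabop]+)\]"
    r"|\{(?P<excluded>[A-Zabop]+)\})"
    r"(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?$"
)


def expand(chars):
    """Replace shorthand letters (a, b, o, p) with their residue sets."""
    return "".join(RESIDUE_TYPES.get(c, c) for c in chars)


def prosite_to_regex(pattern):
    """Translate a sequence-only pattern such as N-{P}-[ST] into a regex."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.match(element)
        if m is None:
            # Property suffixes and malformed elements end up here.
            raise ValueError(f"unsupported element: {element!r}")
        if m["any"]:
            core = "."
        elif m["one"]:
            core = "[" + expand(m["one"]) + "]"
        elif m["allowed"]:
            core = "[" + expand(m["allowed"]) + "]"
        else:
            core = "[^" + expand(m["excluded"]) + "]"
        if m["lo"]:
            upper = "," + m["hi"] if m["hi"] else ""
            core += "{" + m["lo"] + upper + "}"
        parts.append(core)
    return "".join(parts)
```

With this sketch, prosite_to_regex("N-{P}-[ST]") gives [N][^P][ST], which can be passed to re.search to scan a one-letter sequence string.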

Some examples of valid and invalid patterns are given below, with comments.

N-{P}-[ST]

Asn-X-Ser or Asn-X-Thr, X is not Pro

N[fs]-{P}[fs]-[ST][fs]

as above, but each residue must be flexible or solvent-exposed

Nfs-{P}fs-[ST]fs

as above, but each residue must be both flexible and solvent-exposed

Ns{f}

Asn, solvent-exposed and not flexible

N[s{f}]

Asn, solvent-exposed or not flexible

[ab]{K}f{s}

acidic OR basic, except for flexible and non-solvent-exposed Lys

Ahe

Ala, helical and extended - no match is possible

A[he]

Ala, helical or extended

A{he}

Ala, neither helical nor extended

[ST]

Ser or Thr

ST

Ser and Thr at the same position - no match is possible
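
One reading of the property examples above is that characters written side by side are combined with AND, square brackets give OR, and curly brackets give NOT. The Python sketch below evaluates only the property part of a single pattern element under that reading. It is an illustration, not the software's own parser: the names matches_properties, _terms, and _holds are hypothetical, nested square brackets are not supported, and mixed forms such as [ab]{K}f{s} are out of scope.

```python
PROPERTY_FLAGS = set("shef")  # solvent-exposed, helical, extended, flexible


def _terms(expr):
    """Split a property expression into (kind, body) terms.

    kind is "flag" for a bare letter, "not" for {...}, "any" for [...].
    """
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch in PROPERTY_FLAGS:
            yield "flag", ch
            i += 1
        elif ch == "{":
            end = expr.index("}", i)   # raises ValueError if unclosed
            body = expr[i + 1:end]
            if not set(body) <= PROPERTY_FLAGS:
                raise ValueError(f"unknown property in {{{body}}}")
            yield "not", body
            i = end + 1
        elif ch == "[":
            end = expr.index("]", i)   # raises ValueError if unclosed
            yield "any", expr[i + 1:end]
            i = end + 1
        else:
            raise ValueError(f"unexpected character: {ch!r}")


def _holds(kind, body, flags):
    """Check one term against the set of property letters of a residue."""
    if kind == "flag":
        return body in flags
    if kind == "not":
        # {he} means neither helical nor extended.
        return not any(c in flags for c in body)
    # [..]: at least one enclosed term must hold.
    return any(_holds(k, b, flags) for k, b in _terms(body))


def matches_properties(expr, flags):
    """Return True if a residue with these property flags satisfies expr.

    expr  -- property part of an element, e.g. "fs", "[fs]", "s{f}"
    flags -- set of letters (s, h, e, f) that apply to the residue
    """
    return all(_holds(k, b, flags) for k, b in _terms(expr))
```

For instance, matches_properties("s{f}", {"s"}) is True, while matches_properties("s{f}", {"s", "f"}) is False, in line with the comment on Ns{f}. The expression "he" can never be satisfied in practice, because a residue is not assigned to both a helical and an extended region.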