Help for Motif Search

The following databases are supported.

PROSITE

Sigrist CJA, de Castro E, Cerutti L, Cuche BA, Hulo N, Bridge A, Bougueleret L, Xenarios I.
New and continuing developments at PROSITE
Nucl. Acids Res. 41:D344-D347, 2013.

PubMed: 23161676

NCBI-CDD

Marchler-Bauer A, Zheng C, Chitsaz F, Derbyshire MK, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Lanczycki CJ, Lu F, Lu S, Marchler GH, Song JS, Thanki N, Yamashita RA, Zhang D, Bryant SH.
CDD: conserved domains and protein three-dimensional structure
Nucl. Acids Res. 41:D348-D352, 2013.

PubMed: 23197659

Pfam

Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer EL, Tate J, Punta M.
Pfam: the protein families database
Nucl. Acids Res. 42:D222-D230, 2014.

PubMed: 24288371

as well as

User-defined profile library
(may contain multiple profiles)

This server not only finds sequence motifs in your query sequence but also provides functional and genomic information on the motifs found, using DBGET and LinkDB for hyperlinked annotations. Results are also presented graphically when hits are found in the PROSITE database.

Click a database name in the Motif Library column to see a detailed explanation.

Motif Library	Search Engine	Developer	Search Algorithm
PROSITE PATTERN	Motiffind	ICR, Kyoto Univ.
PROSITE PROFILE	Profilefind	ICR, Kyoto Univ.	Dynamic programming method
NCBI-CDD	RPS-BLAST	National Center for Biotechnology Information	Reverse Position Specific BLAST
Pfam	Hmmscan (HMMER)	Howard Hughes Medical Institute	Profile Hidden Markov Model

Given a profile generated from a multiple sequence alignment, or retrieved from a motif library such as PROSITE or Pfam, you can align a protein sequence with the profile.
The procedure is similar to searching a motif library database, except that you provide the name of the file containing the profile matrix instead of a database name.

Profile Format	Search Engine	Search Algorithm
PROSITE	Profilefind	Dynamic programming method
Pfam-hmmer	Hmmscan	Profile Hidden Markov Model

Given a profile, the protein sequence databases on the GenomeNet service are searched to find protein families that share the same motif. The profile, in either PROSITE or Pfam format, can be calculated by this service from a multiple sequence alignment or retrieved from a motif library such as PROSITE or Pfam. The Pfsearch program is used for PROSITE-format profiles and Hmmsearch for Pfam-format profiles. Target sequence libraries are Swiss-Prot, RefSeq, PDBSTR, GENES, MGENES, and VGENES.

This allows you to search protein sequence libraries with your own patterns. Target sequence libraries are Swiss-Prot, RefSeq, PDBSTR, GENES, MGENES, and VGENES. The sequence pattern must be specified in the PROSITE pattern format, namely:

Residues must be separated by - (minus sign).
x represents any amino acid.
[DE] means either D or E.
{FWY} means any amino acid except F, W, and Y.
A(2,3) means that A appears 2 to 3 times consecutively.
The pattern string must be terminated with . (period).

For example, C-x-{C}-[DN]-x(2)-C-x(5)-C-C.
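
To get a feel for how these rules combine, the following Python sketch translates a pattern into an equivalent regular expression and scans a sequence with it. It is an illustration only, not the way Motiffind or the GenomeNet server works internally. The function names prosite_to_regex and find_matches and the test sequence are hypothetical, and only the six rules listed above are handled.

```python
import re

# One pattern element: x, a single residue, a [..] set, or a {..} exclusion,
# optionally followed by a repeat count such as (2) or (2,3).
ELEMENT = re.compile(r"^(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?$")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper: covers only the rules listed in this help page.
    """
    pattern = pattern.strip()
    if not pattern.endswith("."):
        raise ValueError("pattern must be terminated with a period")
    parts = []
    for element in pattern[:-1].split("-"):
        match = ELEMENT.match(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            regex = "."                        # any amino acid
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # any amino acid except those listed
        else:
            regex = core                       # single residue or [..] set
        if low is not None:
            # (n) becomes {n}; (n,m) becomes {n,m}
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


def find_matches(pattern: str, sequence: str):
    """Return (start, end, matched_text) tuples with 1-based, inclusive positions.

    Matches are non-overlapping, as produced by re.finditer.
    """
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    example = "C-x-{C}-[DN]-x(2)-C-x(5)-C-C."
    print(prosite_to_regex(example))
    # The sequence below is made up purely for demonstration.
    print(find_matches(example, "MKCAGDPQCAAAAACCK"))
```

Under these assumptions, the example pattern corresponds to the regular expression C.[^C][DN].{2}C.{5}CC. A real search tool may report overlapping matches and support further pattern syntax, which this sketch does not attempt.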

Two types of profile data, in either PROSITE or Pfam format, are calculated from the multiple sequence alignment using PFMake or HMMBuild, respectively.

You can then align your (new) sequence against the generated profile, or save the profile to your local computer and use it to search sequence databases. Additional explanations are shown.

Motif search help page