NSITE Program Description
NSITE-PL: Search for consensus patterns of plant regulatory sequences
NSITE can be used to analyze regulatory regions and the composition of their
functional motifs.
Method description:
The method is based on statistical estimation of the expected number
of occurrences of a nucleotide consensus pattern in a given sequence [1-3].
NSITE-PL searches for statistically significant functional motifs of plant
promoter/regulatory sequences. The plant functional motifs are selected from
the RegSite Database, developed by Softberry Inc. using published data on
transcription regulation of plant genes.
If a pattern is found whose expected number of occurrences is significantly
less than one, the match is unlikely to have arisen by chance, so the analyzed
sequence can be expected to possess the function associated with that pattern.
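The idea of an expected number of occurrences can be illustrated with a minimal
sketch. It is not NSITE's actual estimator, which is described in [1-3]. The
sketch assumes that nucleotides are independent, that each position of the
consensus matches with probability equal to the summed frequency of the bases
it allows, and that a single strand is scanned. The function names
match_probability and expected_occurrences are hypothetical.

```python
# Illustrative only: a simple independence model, NOT the NSITE estimator.
# IUPAC codes mapped to the bases each symbol allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def match_probability(consensus, freqs, max_mismatches=0):
    """Probability that a random window matches the consensus with at most
    max_mismatches mismatches, assuming independent positions."""
    # dist[j] holds the probability of exactly j mismatches so far.
    dist = [1.0]
    for symbol in consensus.upper():
        p_match = sum(freqs[base] for base in IUPAC[symbol])
        new_dist = [0.0] * (len(dist) + 1)
        for j, prob in enumerate(dist):
            new_dist[j] += prob * p_match              # this position matches
            new_dist[j + 1] += prob * (1.0 - p_match)  # this position mismatches
        dist = new_dist
    return sum(dist[: max_mismatches + 1])


def expected_occurrences(consensus, seq_length, freqs, max_mismatches=0):
    """Expected number of matching windows on one strand of a sequence."""
    windows = seq_length - len(consensus) + 1
    if windows <= 0:
        return 0.0
    return windows * match_probability(consensus, freqs, max_mismatches)


# Nucleotide frequencies and query length as reported in the output example.
query_freqs = {"A": 0.30, "G": 0.20, "T": 0.26, "C": 0.23}
estimate = expected_occurrences("TTACGTAGAT", 2975, query_freqs)
significant = estimate < 0.01  # compare with the "Expected Mean Number" cutoff
```

Under this simplified model a long, specific consensus yields an expected
number far below one, which is what makes an observed match noteworthy. The
values it produces need not coincide with those printed by NSITE.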
The NSITE output lists each pattern found, its position in the sequence,
and the accession number, ID, description of the motif and binding factor
name from the original database, if available.
Output example:
nsitep Thu Jun 27 20:25:01 EDT 2002
Program N S I T E (Softberry Inc.)
Search for motifs of 432 Regulatory Elements from
RegSite - The Transcription Regulatory Sites Database (Plants)
(http://www.softberry.com)
Number of QUERY Sequences: 1
File of QUERY Sequences: /httpd/tmp/loadrun/pssp.seq.176588
Search PARAMETERS:
Expected Mean Number : 0.0100000
Statistical Significance Level : 0.9500000
Print Query Sequence : No
Special numbering of Query Sequence : No
Variation of Distance between RE Blocks: No
NOTE: RE - Regulatory Element/Consensus
AC - Accession No of RE in RegSite
OS - Organism/Species
BF - Binding Factor or One of them
Mism. - Mismatches
Mean. Exp. Number - Mean Expected Number
Up.Conf.Int. - Upper Confidence Interval
==================================================
QUERY: >Softberry SERVER PAST Sequence
Length of Query Sequence: 2975
Nucleotide Frequencies: A - 0.30 G - 0.20 T - 0.26 C - 0.23
..................................................
RE: 21. AC: RSP00021 /OS: Catharantus roseus /GENE: TDC /RE: GT-1#Box5 /BF: GT-1
Motifs on "-" Strand: Mean Exp. Number 0.00915 Up.Conf.Int. 1 Found 1
389 AAAAAGTAAAgA 378 (Mism.= 1)
..................................................
RE: 34. AC: RSP00034 /OS: Zea mays /GENE: gamma-27kDa zein /RE: P-box (s) /BF: PB
Motifs on "+" Strand: Mean Exp. Number 0.00002 Up.Conf.Int. 1 Found 1
1353 GACGTGTAAAGTAAATTTACAAC 1375 (Mism.= 0)
..................................................
RE: 183. AC: RSP00366 /OS: Nicotiana tabacum /GENE: CHN50 /RE: ERE /BF: TDBA12
Motifs on "-" Strand: Mean Exp. Number 0.00849 Up.Conf.Int. 1 Found 1
2826 TGACTTTCTGAt 2815 (Mism.= 1)
..................................................
RE: 199. AC: RSP00395 /OS: Zea mays /GENE: gamma-27kDa zein /RE: O2-like-box /BF:
Motifs on "+" Strand: Mean Exp. Number 0.00365 Up.Conf.Int. 1 Found 1
1414 TTACGTAGAT 1423 (Mism.= 0)
..................................................
RE: 234. AC: RSP00430 /OS: barley /GENE: Hor2 gene /RE: GSN; hor1-box; /BF: BLZ1;
Motifs on "+" Strand: Mean Exp. Number 0.00918 Up.Conf.Int. 1 Found 1
1221 GTGAGTCAT 1229 (Mism.= 0)
..................................................
RE: 264. AC: RSP00459 /OS: coix /GENE: alpha-coixin /RE: O2u /BF: O2
Motifs on "-" Strand: Mean Exp. Number 0.00384 Up.Conf.Int. 1 Found 1
992 TTGACTAGGA 983 (Mism.= 0)
..................................................
RE: 295. AC: RSP00491 /OS: Zea mays /GENE: Zc2 /RE: Zc2 A/T-1 /BF: nuclear factor
Motifs on "+" Strand: Mean Exp. Number 0.00000 Up.Conf.Int. 1 Found 1
771 CATATGTTTTATTAAAacAAAaTTTATC 798 (Mism.= 3)
..................................................
RE: 296. AC: RSP00492 /OS: Zea mays /GENE: Zc2 /RE: Zc2 A/T-2 /BF: nuclear factor
Motifs on "+" Strand: Mean Exp. Number 0.00000 Up.Conf.Int. 1 Found 10
789 AaAatTtatcATATATATATATATATATATATATATATATAT 830 (Mism.= 7)
791 AatTtatcATATATATATATATATATATATATATATATATAT 832 (Mism.= 6)
793 tTtatcATATATATATATATATATATATATATATATATATAT 834 (Mism.= 5)
795 tatcATATATATATATATATATATATATATATATATATATAT 836 (Mism.= 4)
797 tcATATATATATATATATATATATATATATATATATATATAT 838 (Mism.= 2)
799 ATATATATATATATATATATATATATATATATATATATATAT 840 (Mism.= 0)
801 ATATATATATATATATATATATATATATATATATATATATAa 842 (Mism.= 1)
803 ATATATATATATATATATATATATATATATATATATATAata 844 (Mism.= 3)
805 ATATATATATATATATATATATATATATATATATATAatata 846 (Mism.= 5)
807 ATATATATATATATATATATATATATATATATATAatataAa 848 (Mism.= 6)
Motifs on "-" Strand: Mean Exp. Number 0.00000 Up.Conf.Int. 1 Found 10
848 tTtatatTATATATATATATATATATATATATATATATATAT 807 (Mism.= 6)
846 tatatTATATATATATATATATATATATATATATATATATAT 805 (Mism.= 5)
844 tatTATATATATATATATATATATATATATATATATATATAT 803 (Mism.= 3)
842 tTATATATATATATATATATATATATATATATATATATATAT 801 (Mism.= 1)
840 ATATATATATATATATATATATATATATATATATATATATAT 799 (Mism.= 0)
838 ATATATATATATATATATATATATATATATATATATATATga 797 (Mism.= 2)
836 ATATATATATATATATATATATATATATATATATATATgata 795 (Mism.= 4)
834 ATATATATATATATATATATATATATATATATATATgataAa 793 (Mism.= 5)
832 ATATATATATATATATATATATATATATATATATgataAatT 791 (Mism.= 6)
830 ATATATATATATATATATATATATATATATATgataAatTtT 789 (Mism.= 7)
..................................................
Totally 27 motifs of 8 different REs have been found
=========================================================================================
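For downstream processing, the report above can be read into records with a
short script. The following is a minimal sketch whose line patterns are
inferred only from this one output example, so other NSITE versions or options
may need different patterns. The function name parse_nsite_output and the
record field names are hypothetical. In the example, lowercase letters in a
motif appear to mark the mismatched positions; the parser keeps the motif
string unchanged.

```python
import re

# Patterns inferred from the sample report; they are assumptions, not a spec.
RE_HEADER = re.compile(
    r"^RE:\s*(\d+)\.\s*AC:\s*(\S+)\s*/OS:\s*(.*?)\s*/GENE:\s*(.*?)"
    r"\s*/RE:\s*(.*?)\s*/BF:\s*(.*)$"
)
STRAND_LINE = re.compile(
    r'^Motifs on "([+-])" Strand:\s*Mean Exp\. Number\s+([\d.]+)'
    r"\s+Up\.Conf\.Int\.\s+(\d+)\s+Found\s+(\d+)"
)
HIT_LINE = re.compile(r"^(\d+)\s+([A-Za-z]+)\s+(\d+)\s+\(Mism\.=\s*(\d+)\)")


def parse_nsite_output(text):
    """Return one dict per motif hit found in an NSITE text report."""
    hits = []
    element = None  # fields of the current regulatory element (RE)
    strand = None   # fields of the current strand block
    for raw_line in text.splitlines():
        line = raw_line.strip()
        m = RE_HEADER.match(line)
        if m:
            element = {
                "index": int(m.group(1)),
                "accession": m.group(2),
                "organism": m.group(3),
                "gene": m.group(4),
                "name": m.group(5),
                "binding_factor": m.group(6).strip(),
            }
            strand = None
            continue
        m = STRAND_LINE.match(line)
        if m:
            strand = {"strand": m.group(1), "mean_expected": float(m.group(2))}
            continue
        m = HIT_LINE.match(line)
        if m and element is not None and strand is not None:
            hits.append({
                **element,
                **strand,
                "start": int(m.group(1)),
                "motif": m.group(2),
                "end": int(m.group(3)),
                "mismatches": int(m.group(4)),
            })
    return hits
```

On the "-" strand the first coordinate is larger than the second, as in the
report, so the sketch stores both numbers as given instead of reordering them.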
References:
1. Shahmuradov K.A., Kolchanov N.A., Solovyev V.V., Ratner V.A. (1986)
Enhancer-like structures in middle repetitive sequences of the
eukaryotic genomes. Genetics (Russ), 22, 357-368.
2. Solovyev V.V., Kolchanov N.A. (1994) Search for functional sites using
consensus. In Computer analysis of Genetic macromolecules
(eds. Kolchanov N.A., Lim H.A.), World Scientific, p. 16-21.
3. Solovyev V.V. (2002) Structure, Properties and Computer Identification
of Eukaryotic genes. In Bioinformatics from Genomes to Drugs. V.1. Basic
Technologies (ed. Lengauer T.), p. 59-111.