OLIGS

The program computes statistics on oligonucleotides (4-nucleotides) and reports those whose counts differ significantly from the expected mean.

Input data

The input file should be in FASTA format and may contain several sequences.

Alphabet. The allowed symbols are "ACGTUacgtu" and "NnyYrRBbDdHhKkWwSsMmVv". The symbols to be skipped are "0123456789; \n\r\t\0-". All other symbols are not allowed.

Algorithm
For each specified oligo length L, an array of oligo counts is built. The sequential number of an oligo is used as its index in this array, and the array element at that index holds the total number of occurrences of that oligo.

Output data

Example of program output:

Oligs 1.6 Copyright (c) 2005-2006 Softberry
Num seqs=32 Nucleotides=46705 Average seq length=1459.5 A=25.1% C=24.7% G=24.8% T=25.4% N=0.000000% Other=0.000000%
Output least frequent oligs, direction=direct, seq_shift=0, seq_step=1 deviation multiplier=3.000000
#olig,total olig counter,expected number,deviation,deviation multiplier,unique sequences counter,norm deviate
Length 2 oligs=46673
TA 2174 2976.6 52.8 -15.2 32 0.046547
CG 2461 2858.0 51.8 -7.7 32 0.052692
GT 2609 2939.8 52.5 -6.3 32 0.055861
AC 2579 2893.8 52.1 -6.0 32 0.055219
GG 2662 2868.7 51.9 -4.0 32 0.056996
Length 3 oligs=46641
TAG 412 737.4 26.9 -12.1 32 0.008821
CTA 446 734.7 26.9 -10.7 32 0.009549
GTA 511 737.4 26.9 -8.4 32 0.010941
TAC 509 734.7 26.9 -8.4 31 0.010898
CGT 519 725.6 26.7 -7.7 32 0.011112
GGG 508 710.7 26.5 -7.7 32 0.010877
GTC 539 725.6 26.7 -7.0 32 0.011541
ACG 549 716.9 26.6 -6.3 32 0.011755
GAC 551 716.9 26.6 -6.2 32 0.011797
CCC 545 702.8 26.3 -6.0 32 0.011669
CGG 550 708.1 26.4 -6.0 32 0.011776
TTA 608 755.7 27.3 -5.4 32 0.013018
ATA 607 746.7 27.1 -5.2 31 0.012996
TAT 626 755.7 27.3 -4.8 32 0.013403
ACC 595 714.3 26.5 -4.5 32 0.012740
TAA 627 746.7 27.1 -4.4 32 0.013425
GGT 619 728.3 26.8 -4.1 32 0.013253
TCA 631 734.7 26.9 -3.9 32 0.013510
AGT 640 737.4 26.9 -3.6 32 0.013703
CCG 611 705.4 26.4 -3.6 32 0.013082
ACT 651 734.7 26.9 -3.1 32 0.013939
Length 4 oligs=46609
CTAG 73 182.0 13.5 -8.1 26 0.001563
GGGG 71 176.1 13.2 -7.9 24 0.001520
TAGG 83 182.7 13.5 -7.4 24 0.001777
CCTA 85 181.3 13.4 -7.2 26 0.001820
CGTA 92 182.0 13.5 -6.7 26 0.001970
TAGT 104 187.2 13.7 -6.1 26 0.002227
TTAG 105 187.2 13.7 -6.0 25 0.002248
ACGT 101 182.0 13.5 -6.0 29 0.002163
TACG 104 182.0 13.5 -5.8 22 0.002227
TAGA 108 185.0 13.6 -5.7 27 0.002312
TCTA 111 186.5 13.6 -5.5 27 0.002377
GGTA 110 182.7 13.5 -5.4 24 0.002355
ACTA 112 184.3 13.5 -5.3 29 0.002398
ACCC 106 176.3 13.3 -5.3 26 0.002270
GTCA 111 182.0 13.5 -5.3 26 0.002377
TAAC 113 184.3 13.5 -5.3 29 0.002419
CTAT 115 186.5 13.6 -5.2 29 0.002462
ATAG 115 185.0 13.6 -5.2 26 0.002462
CGGT 111 179.8 13.4 -5.1 30 0.002377
CGTC 111 179.1 13.4 -5.1 29 0.002377
CGGG 109 175.4 13.2 -5.0 29 0.002334
GATA 118 185.0 13.6 -4.9 27 0.002526
TATC 120 186.5 13.6 -4.9 30 0.002569
TACC 116 181.3 13.4 -4.9 26 0.002484
TAGC 117 182.0 13.5 -4.8 27 0.002505
TTAC 121 186.5 13.6 -4.8 28 0.002591
GTAG 119 182.7 13.5 -4.7 28 0.002548
ATAC 123 184.3 13.5 -4.5 26 0.002634
GGGT 121 180.4 13.4 -4.4 26 0.002591
CCCT 120 178.4 13.3 -4.4 29 0.002569
CGCG 117 174.8 13.2 -4.4 26 0.002505
GGTC 122 179.8 13.4 -4.3 29 0.002612
CTAA 126 184.3 13.5 -4.3 31 0.002698
GACC 120 177.0 13.3 -4.3 27 0.002569
TAAG 127 185.0 13.6 -4.3 30 0.002719
GTCT 127 184.2 13.5 -4.2 30 0.002719
CTTA 129 186.5 13.6 -4.2 31 0.002762
GTAA 128 185.0 13.6 -4.2 28 0.002741
ACGG 122 177.6 13.3 -4.2 30 0.002612
GACT 126 182.0 13.5 -4.2 31 0.002698
TCAT 130 186.5 13.6 -4.1 29 0.002783
AGAC 125 179.8 13.4 -4.1 28 0.002676
GTAT 132 187.2 13.7 -4.0 25 0.002826
CCCG 121 174.1 13.2 -4.0 28 0.002591
TACT 132 186.5 13.6 -4.0 29 0.002826
TGAC 129 182.0 13.5 -3.9 30 0.002762
CCGG 123 174.8 13.2 -3.9 27 0.002634
ACCG 125 177.0 13.3 -3.9 29 0.002676
ATTA 136 189.6 13.7 -3.9 29 0.002912
CCCC 123 173.5 13.1 -3.8 25 0.002634
AGTC 132 182.0 13.5 -3.7 26 0.002826
GTAC 132 182.0 13.5 -3.7 26 0.002826
CTAC 132 181.3 13.4 -3.7 31 0.002826
TCAC 132 181.3 13.4 -3.7 30 0.002826
CATA 135 184.3 13.5 -3.6 27 0.002890
AGTA 137 185.0 13.6 -3.5 29 0.002933
GCGT 136 179.8 13.4 -3.3 29 0.002912
GCTA 138 182.0 13.5 -3.3 28 0.002955
TCGT 140 184.2 13.5 -3.3 31 0.002998
GTTA 143 187.2 13.7 -3.2 29 0.003062
GAGT 140 182.7 13.5 -3.2 29 0.002998
TCGG 138 179.8 13.4 -3.1 31 0.002955

Detailed description of output data:

The first line shows the program name and version:
Oligs 1.6 Copyright (c) 2005-2006 Softberry

Next comes information on the input file:
Num seqs=32 Nucleotides=46705 Average seq length=1459.5 A=25.1% C=24.7% G=24.8% T=25.4% N=0.000000% Other=0.000000%

Next come the input parameters that were set:
Output least frequent oligs, direction=direct, seq_shift=0, seq_step=1 deviation multiplier=3.000000

Next comes a hint naming each column of the oligo tables:
#olig,total olig counter,expected number,deviation,deviation multiplier,unique sequences counter,norm deviate

Next come the tables of oligos, one per oligo length, each introduced by a line such as:
Length 3 oligs=46641

Each table is sorted by the values of column 5 (the deviation multiplier), in descending order of magnitude:
TAG 412 737.4 26.9 -12.1 32 0.008821
CTA 446 734.7 26.9 -10.7 32 0.009549
GTA 511 737.4 26.9 -8.4 32 0.010941
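
The counting scheme from the Algorithm section and the numeric columns of the tables can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's actual source. The function names are hypothetical, U is treated as T, and windows containing ambiguous symbols are skipped. The formulas for the expected number, deviation, deviation multiplier and norm deviate are not given in this description; they are inferred from the example output, which they reproduce (for instance, the TA row). The unique sequences counter is omitted.

```python
import math

BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}

def oligo_index(oligo):
    """Sequential number of an oligo: the oligo read as a base-4 number."""
    index = 0
    for base in oligo:
        index = index * 4 + CODE[base]
    return index

def index_to_oligo(index, length):
    """Inverse of oligo_index for oligos of the given length."""
    bases = []
    for _ in range(length):
        index, digit = divmod(index, 4)
        bases.append(BASES[digit])
    return "".join(reversed(bases))

def count_oligos(sequences, length):
    """Return (counts, total); counts[i] is the number of occurrences of oligo i."""
    counts = [0] * (4 ** length)
    total = 0
    for seq in sequences:
        seq = seq.upper().replace("U", "T")  # assumption: U is treated as T
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if all(base in CODE for base in window):  # assumption: skip ambiguous windows
                counts[oligo_index(window)] += 1
                total += 1
    return counts, total

def deviation_stats(count, total, p):
    """Expected number, deviation and deviation multiplier (inferred binomial model)."""
    expected = total * p
    deviation = math.sqrt(total * p * (1.0 - p))
    multiplier = (count - expected) / deviation if deviation > 0 else 0.0
    return expected, deviation, multiplier

def oligo_stats(sequences, length):
    """Rows (oligo, count, expected, deviation, multiplier, norm), lowest multiplier first.

    Assumes the input contains at least one A, C, G, T or U symbol.
    """
    joined = "".join(sequences).upper().replace("U", "T")
    nucleotides = len(joined)
    freq = {base: joined.count(base) / nucleotides for base in BASES}
    counts, total = count_oligos(sequences, length)
    rows = []
    for index, count in enumerate(counts):
        oligo = index_to_oligo(index, length)
        p = math.prod(freq[base] for base in oligo)  # product of single-base frequencies
        expected, deviation, multiplier = deviation_stats(count, total, p)
        rows.append((oligo, count, expected, deviation, multiplier, count / nucleotides))
    return sorted(rows, key=lambda row: row[4])
```

To mimic the "least frequent oligs" report with deviation multiplier=3.000000, one would keep only the rows whose multiplier is at or below -3; whether the program applies exactly this cutoff is an assumption based on the example tables.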
© 2021 www.softberry.com