CYS_REC: A Program for Predicting SS-bonding States of Cysteines and Disulphide Bridges in Protein Sequences.

The program predicts the SS-bonding states of cysteines and locates disulphide bridges in proteins.

Methodology

Procedure: The sequence is processed in the following steps.

  1. Secondary structure is predicted for the query sequence.
  2. For each cysteine, the amino acid fragment and the corresponding secondary structure fragment within ±10 positions are compared with such fragments from the training sets using a prepared log-odds matrix, and the maximal score is determined for each set.
  3. For the given fragment, scores are calculated against profiles (weight matrices) constructed from positive (bonded) and negative examples.
  4. The value of a linear discriminant function is calculated from the 4 most significant amino acid properties.
  5. The resulting score, computed as a linear combination of the five scores listed above, is used to recognize the SS-bonding states of cysteines.
  6. A neural network calculates a score for each possible pair of cysteines, forming a 'Matrix of pair scores'.
  7. The pattern of bonded cysteine pairs is chosen to maximize the sum of the corresponding scores in the matrix.
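
The fragment used in step 2 can be illustrated with a minimal Python sketch. It only shows how a ±10-position window around each cysteine might be cut out of a sequence; the function name `cysteine_windows` and the padding of sequence ends with `-` are assumptions made for illustration, since the page does not say how CYS_REC treats cysteines near the termini.

```python
def cysteine_windows(seq, flank=10, pad="-"):
    """Return (1-based position, fragment) for every cysteine in seq.

    Each fragment spans `flank` residues on both sides of the cysteine.
    Padding the termini with `pad` is an assumption of this sketch,
    not documented CYS_REC behaviour.
    """
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "C":
            # Residue i of seq sits at index i + flank of the padded string,
            # so the window starts at index i and is 2 * flank + 1 long.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


if __name__ == "__main__":
    for pos, fragment in cysteine_windows("MKCAAGTCLL", flank=3):
        print(pos, fragment)
```

Every fragment has the cysteine at its centre, so fragments from different proteins can be compared position by position.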

Input Format

A FASTA-formatted sequence split into lines of at most 80 positions is accepted.

A specially prepared alignment with no gaps in the first sequence is also accepted.


Example of alignment:

T0129
    5  182

MLISHSDLNQQLKSAGIGFNATELHGFLSGLLCGGLKDQSWLPLLYQFSN
---SYSDFSQQLKTAGIALSAAELHGFLTGLICGGIHDQSWQPLLFQFTN
-LPTYPSLALALSQQAVALTPAEMHGLISGMLCGGSKDNGWQTLVHDLTN
----YDEMNRFLNQQGAGLTPAEMHGLISGMICGGNNDSSWQPLLHDLTN
----YNEMNQYLNQQGTGLTPAEMHGLISGMICGGNDDSSWLPLLHDLTN

DNHAYPTGLVQPVTELYEQISQTLSDVEGFTFELGLTEDENVFTQADSLS
ENHAYPTALLQEVTQIQQHISKKLADIDGFDFELWLPENEDVFTRADALS
EGVAFPQALSLPLQQLHEATQEALEN-EGFMFQLLIPEGEDVFDRADALS
EGLAFGHELAQALRKMHAATSDALED-DGFLFQLYLPEDVSVFDRADALA
EGMAFGHELAQALRKMHSATSDALQD-DGFLFQLYLPDDVSVFDRADALA

DWANQFLLGIGLAQPELAKEKGEIGEAVDDLQDICQLGYDEDDNEEELAE
EWTNHFLLGLGLAQPKLDKEKGDIGEAIDDLHDICQLGYDESDDKEELSE
GWVNHFLLGLGMLQPKLAQVKDEVGEAIDDLRNIAQLGYDEDEDQEELAQ
GWVNHFLLGLGVTQPKLDKVTGETGEAIDDLRNIAQLGYDESEDQEELEM
GWVNHFLLGLGVTQPKLDKVTGETGEAIDDLRNIAQLGYDEDEDQEELEM

ALEEIIEYVRTIAMLFYSHFNEGEIESKPVLH
ALEEIIEYVRTLACLLFTHFQPQLPEQKPVLH
SLEEVVEYVRVAAILCHIEFTQQKPTAKPTLH
SLEEIIEYVRVAALLCHDTFTRQQPTAKPTLH
SLEEIIEYVRVAALLCHDTFTHPQPTAKPTLH
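
As a rough aid for preparing such a file, the following Python sketch reads an alignment laid out like the example above and checks it. It assumes, based only on that example, that the first line is a name, that the second line gives the number of sequences and the alignment length, and that the sequences then follow in interleaved blocks separated by blank lines. The function name `parse_alignment` is hypothetical, and this is not the parser used by CYS_REC.

```python
def parse_alignment(text):
    """Parse an interleaved alignment into (name, list of aligned sequences).

    The layout (name line, 'count length' line, interleaved blocks) is
    inferred from the example on this page and is an assumption.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    name = lines[0]
    n_seqs, length = (int(x) for x in lines[1].split())
    rows = lines[2:]
    if len(rows) % n_seqs != 0:
        raise ValueError("number of rows is not a multiple of the sequence count")
    seqs = [""] * n_seqs
    for i, row in enumerate(rows):
        # Row i of every block belongs to sequence i % n_seqs.
        seqs[i % n_seqs] += row
    if any(len(s) != length for s in seqs):
        raise ValueError("sequence length does not match the header")
    if "-" in seqs[0]:
        raise ValueError("the first sequence must not contain gaps")
    return name, seqs


if __name__ == "__main__":
    demo = "T1\n    2  6\n\nMCA\n-CA\n\nGCK\nGC-\n"
    print(parse_alignment(demo))
```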

Output Format

The output contains the query sequence, the positions of the cysteines found, the matrix of pair scores, the results of the SS-bonding state predictions, and the most probable pattern of disulphide-bonded pairs.


Example of output:


CYS_REC Version 2. Recognition of SS-bounded cysteines
>1AC5_ length=483
LPSSEEYKVAYELLPGLSEVPDPSNIPQMHAGHIPLRSEDADEQDSSDLEYFFWKFTNNDSNGNVDRPLIIWLNGGPGCSS
MDGALVESGPFRVNSDGKLYLNEGSWISKGDLLFIDQPTGTGFSVEQNKDEGKIDKNKFDEDLEDVTKHFMDFLENYFKIF
PEDLTRKIILSGESYAGQYIPFFANAILNHNKFSKIDGDTYDLKALLIGNGWIDPNTQSLSYLPFAMEKKLIDESNPNFKH
LTNAHENCQNLINSASTDEAAHFSYQECENILNLLLSYTRESSQKGTADCLNMYNFNLKDSYPSCGMNWPKDISFVSKFFS
TPGVIDSLHLDSDKIDHWKECTNSVGTKLSNPISKPSIHLLPGLLESGIEIVLFNGDKDLICNNKGVLDTIDNLKWGGIKG
FSDDAVSFDWIHKSKSTDDSEEFSGYVKYDRNLTFVSVYNASHMVPFDKSLVSRGIVDIYSNDVMIIDNNGKNVMITT

7 cysteines are found in positions: 79 251 271 293 308 345 386

Matrix of pair scores
POS:     79   251   271   293   308   345
 79:   -999   -21    -4     8    18   143
251:    -21  -999   155     7    -3   -12
271:     -4   155  -999    13   -20   -15
293:      8     7    13  -999   133    -8
308:     18    -3   -20   133  -999    -7
345:    143   -12   -15    -8    -7  -999

CYS  79 is SS-bounded      Score=  56.7
CYS 251 is SS-bounded      Score=  53.2
CYS 271 is SS-bounded      Score=  47.0
CYS 293 is SS-bounded      Score=  68.1
CYS 308 is SS-bounded      Score=  63.9
CYS 345 is SS-bounded      Score=  60.7
CYS 386 is not SS-bounded  Score= -70.7

The most probable pattern of pairs: 79-345, 251-271, 293-308,
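
The relation between the matrix of pair scores and the reported pattern can be checked with a small Python sketch. It searches exhaustively for the set of non-overlapping pairs with the largest sum of scores, using the values from the example output. The exhaustive search, the option of leaving a cysteine unpaired, and the name `best_pairing` are assumptions of this sketch; the page states only that the pattern maximizes the sum of scores, not how CYS_REC finds it.

```python
def best_pairing(positions, scores):
    """Return (total score, list of paired positions) with maximal total.

    Exhaustive search over pairings; each cysteine is used at most once.
    Suitable only for small numbers of cysteines.
    """
    def search(remaining):
        if len(remaining) < 2:
            return 0, []
        first, rest = remaining[0], remaining[1:]
        # Option 1: leave the first cysteine unpaired.
        best_total, best_pairs = search(rest)
        # Option 2: pair it with each of the other remaining cysteines.
        for k, partner in enumerate(rest):
            sub_total, sub_pairs = search(rest[:k] + rest[k + 1:])
            total = scores[first][partner] + sub_total
            if total > best_total:
                best_total = total
                best_pairs = [(positions[first], positions[partner])] + sub_pairs
        return best_total, best_pairs

    return search(tuple(range(len(positions))))


# Values copied from the example output above.
POSITIONS = [79, 251, 271, 293, 308, 345]
PAIR_SCORES = [
    [-999, -21, -4, 8, 18, 143],
    [-21, -999, 155, 7, -3, -12],
    [-4, 155, -999, 13, -20, -15],
    [8, 7, 13, -999, 133, -8],
    [18, -3, -20, 133, -999, -7],
    [143, -12, -15, -8, -7, -999],
]

if __name__ == "__main__":
    total, pairs = best_pairing(POSITIONS, PAIR_SCORES)
    print(total, pairs)
```

For this matrix the search selects 79-345, 251-271 and 293-308 (143 + 155 + 133 = 431), which agrees with the pattern in the example output.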

Performance: 3000 positive and 3000 negative examples (i.e., ±10-position fragments surrounding bonded and non-bonded cysteines) were prepared from PDB sequences that were not used in training. The accuracy of SS-bonding state recognition by the combined function on this control set was ~90%.

© 2020 www.softberry.com