New and updated gene identification programs: Fgenesh+ and Prot_map in comparison with GeneWise

Softberry has significantly improved its programs for gene prediction with protein support. The new Prot_map program can be used to generate a set of genes in a new organism, which can then be used to learn parameters for the gene prediction programs Fgenesh and Fgenesh+. It is also very useful for finding pseudogenes, by selecting the corrupted genes produced when known proteins are mapped to the genome.

Speed of processing sequences
Prot_map mapping of the human protein set (55946 proteins) to chromosome 19 (~59 Mb) takes just 90 min when reporting the best hit for each protein, and 148 min when reporting all significant hits for each protein.

Accuracy comparison

The accuracy of ab initio gene prediction by Fgenesh was compared with that of prediction with protein support by Fgenesh+, GeneWise, and Prot_map (mapping proteins to human DNA). The comparison was done on a large set of human genes, using homologous mouse or Drosophila proteins. Fgenesh+ shows the best performance with mouse proteins. With Drosophila proteins, ab initio Fgenesh works better than GeneWise for all ranges of similarity, and Fgenesh+ is the best predictor if similarity is higher than 60%.

Sn ex, sensitivity at the exon level (exact exon predictions); Sno ex, sensitivity with exon overlap; Sp ex, specificity at the exon level; Sn nuc, sensitivity at the nucleotide level; Sp nuc, specificity at the nucleotide level; CC, correlation coefficient; %CG, percent of genes predicted completely correctly (no missing and no extra exons, and all exon boundaries predicted exactly).

Gene prediction with mouse protein support:
1. Similarity level > 90% - 921 sequences
2. 80% < similarity level < 90% - 1441 sequences
3. 60% < similarity level < 80% - 1425 sequences
4. 0% < similarity level < 60% - 259 sequences
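
The similarity ranges listed above can be treated as bins into which each test gene is sorted before accuracy is measured. The following is only a minimal sketch of such binning, not Softberry's procedure. The function names `bin_label` and `count_bins` are hypothetical, and because the source uses strict inequalities on both sides of each range, assigning a boundary value such as exactly 80% to the higher bin is an assumption.

```python
from bisect import bisect_right
from collections import Counter


def bin_label(similarity, edges=(60, 80, 90)):
    """Return a label for the similarity bin that contains `similarity` (percent).

    `edges` are the ascending bin boundaries. A value equal to a boundary
    goes to the higher bin (an assumption; the source does not say).
    """
    i = bisect_right(edges, similarity)
    lower = 0 if i == 0 else edges[i - 1]
    if i == len(edges):
        return f">={lower}%"
    return f"{lower}-{edges[i]}%"


def count_bins(similarities, edges=(60, 80, 90)):
    """Count how many similarity values fall into each bin."""
    return Counter(bin_label(s, edges) for s in similarities)


if __name__ == "__main__":
    # Illustrative values only, not data from the comparison.
    print(count_bins([95.0, 85.5, 80.0, 72.3, 41.0]))
```

Passing different `edges`, for example `(40, 60, 80)`, gives the ranges used for a different protein set.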
Gene prediction with Drosophila proteins with similarity ranging from 22% to 98% and coverage in both proteins > 75%:
1. Similarity level > 80% - 66 sequences
2. 60% < similarity level < 80% - 290 sequences
3. 40% < similarity level < 60% - 653 sequences
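
The accuracy measures defined earlier (Sn ex, Sno ex, Sp ex, Sn nuc, Sp nuc, CC, %CG) can be illustrated with a short calculation. The page gives no formulas, so this sketch assumes the definitions commonly used in gene-prediction benchmarks: sensitivity is TP / (TP + FN), specificity is TP / (TP + FP), and CC is the Pearson correlation between true and predicted coding status per nucleotide. Exons are represented as `(start, end)` pairs with inclusive coordinates, and all function names are hypothetical.

```python
import math


def exon_metrics(true_exons, pred_exons):
    """Return (Sn ex, Sno ex, Sp ex) for one sequence."""
    true_set, pred_set = set(true_exons), set(pred_exons)
    exact = len(true_set & pred_set)  # exons with both boundaries correct
    # True exons touched by at least one predicted exon.
    overlapped = sum(
        any(ps <= te and ts <= pe for ps, pe in pred_set)
        for ts, te in true_set
    )
    sn = exact / len(true_set) if true_set else 0.0
    sno = overlapped / len(true_set) if true_set else 0.0
    sp = exact / len(pred_set) if pred_set else 0.0
    return sn, sno, sp


def _positions(exons):
    """Set of all nucleotide positions covered by the exons."""
    return {p for start, end in exons for p in range(start, end + 1)}


def nucleotide_metrics(true_exons, pred_exons, seq_len):
    """Return (Sn nuc, Sp nuc, CC) for one sequence of length seq_len."""
    true_pos, pred_pos = _positions(true_exons), _positions(pred_exons)
    tp = len(true_pos & pred_pos)
    fp = len(pred_pos - true_pos)
    fn = len(true_pos - pred_pos)
    tn = seq_len - tp - fp - fn
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, cc


def gene_completely_correct(true_exons, pred_exons):
    """True if there are no missing or extra exons and all boundaries match."""
    return set(true_exons) == set(pred_exons)


if __name__ == "__main__":
    # Illustrative coordinates only, not data from the comparison.
    true_exons = [(10, 20), (30, 40), (50, 60)]
    pred_exons = [(10, 20), (32, 40), (70, 80)]
    print(exon_metrics(true_exons, pred_exons))
    print(nucleotide_metrics(true_exons, pred_exons, seq_len=100))
    print(gene_completely_correct(true_exons, pred_exons))
```

%CG for a test set would then be the percentage of genes for which `gene_completely_correct` returns true.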
© 2023 www.softberry.com