Recognition of Gene Acceptor Site via Ensemble Boosting
ABSTRACT:
The complete identification of human genes involves determining which subsequences code for proteins (exons) and which do not (introns). In RNA splicing, recognizing the gene acceptor site means recognizing the boundaries between these regions; the procedure currently employed by researchers is the GU-AG rule: exon/GU-intron-AG/exon. However, the GU and AG motifs occur so frequently that a typical intron contains several of each, so many false sites are recognized. This work investigates a boosting ensemble of neural network classifiers for identifying gene acceptor sites in the human genome. A published dataset of primate exon-intron boundaries was used as the training set for the members of the ensemble, which perform two recognition tasks: exon/intron boundaries (EI) and intron/exon boundaries (IE). The ensemble uses the boosting algorithm to pool each member’s output. The proposed boosting ensemble was applied to recognize gene acceptor junctions in the tumor suppressor genes p53 and BRCA1 and in an artificial mRNA generated by a random function. The ensemble recognized 5%, 6%, and 11% fewer false sites for p53, BRCA1, and the artificial mRNA, respectively, than recognition by the individual members.
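The false-site problem described in the abstract can be illustrated with a minimal sketch that applies the GU-AG rule naively, flagging every GU as a candidate donor and every AG as a candidate acceptor. The toy sequence, the function names `candidate_sites` and `random_mrna`, and the seed are illustrative assumptions; they are not taken from the paper or its dataset.

```python
import random


def candidate_sites(rna):
    """Return the start indices of every GU and every AG dinucleotide."""
    donors = [i for i in range(len(rna) - 1) if rna[i:i + 2] == "GU"]
    acceptors = [i for i in range(len(rna) - 1) if rna[i:i + 2] == "AG"]
    return donors, acceptors


def random_mrna(length, seed=0):
    """Build an artificial mRNA by drawing bases uniformly at random (assumed model)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGU") for _ in range(length))


# Hypothetical toy transcript: exon / GU-intron-AG / exon.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCAGUUAGCUAG", "GGCUAA"
toy = exon1 + intron + exon2

donors, acceptors = candidate_sites(toy)
print(donors)     # [6, 10, 14]     only index 6 is the real intron start
print(acceptors)  # [9, 13, 17, 21] only index 21 is the real intron end

# The same scan on a random sequence also yields many candidate sites.
artificial = random_mrna(1000)
rand_donors, rand_acceptors = candidate_sites(artificial)
print(len(rand_donors), len(rand_acceptors))
```

In this toy case the rule proposes four acceptor candidates for a single true intron/exon boundary, which is the kind of over-prediction a trained classifier ensemble is meant to reduce.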