An Improved Exon-Intron Recognition via a Committee of Machines

CONTRIBUTORS:
  Author Pabico, Jaderick P. (University of the Philippines Los Banos)
  Author Mojica, Elmer-Rico E. (University of the Philippines Los Banos)
  Author Micor, Jose Rene L.
JOURNAL:
  Transactions of the National Academy of Science and Technology, Philippines, 30(1), 117 - ??.
YEAR: 2008
PUB TYPE: Journal Article
SUBJECT(S): intron; exon; committee machines; machine recognition
DISCIPLINE: Information Systems/Technology
HTTP: http://www.ics.uplb.edu.ph/node/282
LANGUAGE: English
PUB ID: 103-444-121 (Last edited on 2008/10/20 06:11:50 GMT-6)
SPONSOR(S):
 
ABSTRACT:
The human genome consists of sequences of base pairs; within a gene, the subsequences that code for proteins are called exons. Exons are bounded by subsequences, called introns, that are spliced out prior to translation. In RNA splicing, researchers currently recognize these boundaries with the GU-AG heuristic, which has the following motif: exon/GU-intron-AG/exon. However, the GU and AG dinucleotides occur so frequently that a typical intron contains several of each, so many false boundaries are recognized. Other researchers have employed several methodologies to automate the recognition of these sites, such as support vector machines, hidden Markov models, and artificial neural networks (ANN); the maximum recognition accuracy reported on a production set is only 81%. A production set is a set of DNA sequences whose intron-exon boundaries are known but which were not used in the development of the model. A committee of machines is a computational methodology in which the outputs of multiple models are combined into a single output. The member models' outputs are combined using methodologies such as averaging, boosting, bagging, and simple majority voting. It has been shown, both theoretically and empirically, that the output of the committee machine is superior to those of its constituent member models. In this effort, we developed a committee of neural network classifiers trained to classify whether a given 60-bp DNA sequence is an intron-exon (IE) boundary (acceptor site), an exon-intron (EI) boundary (donor site), or neither (N). Using the same production set used by other researchers, our committee machine correctly recognized 84% of the DNA sequences, improving the recognition rate by 3 percentage points.
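The two ideas in the abstract, the over-eager GU-AG heuristic and the combining of member outputs, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy RNA string, the function names `candidate_sites`, `majority_vote`, and `average_vote`, and the class order (EI, IE, N) are hypothetical, and the sketch does not reproduce the authors' neural networks or their actual combining rule.

```python
from collections import Counter


def candidate_sites(rna, motif):
    """Return every index at which the dinucleotide motif occurs in rna."""
    return [i for i in range(len(rna) - 1) if rna[i:i + 2] == motif]


def majority_vote(labels):
    """Combine member labels by simple majority; ties go to the earliest label."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def average_vote(member_probs, classes=("EI", "IE", "N")):
    """Average the members' class-probability vectors, then take the argmax."""
    n = len(member_probs)
    means = [sum(p[k] for p in member_probs) / n for k in range(len(classes))]
    return classes[means.index(max(means))]


# Toy sequence, made up for illustration: exon / GU...intron...AG / exon.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUCCAGUUUCAG", "GCCAGUAA"
rna = exon_1 + intron + exon_2

# The heuristic flags every GU and AG, although only one of each
# (the GU at the start and the AG at the end of the intron) is a real boundary.
donors = candidate_sites(rna, "GU")
acceptors = candidate_sites(rna, "AG")
print("GU candidates:", donors)
print("AG candidates:", acceptors)

# Hypothetical outputs of three committee members for one 60-bp window.
member_labels = ["EI", "EI", "N"]
member_probs = [[0.7, 0.1, 0.2], [0.2, 0.1, 0.7], [0.6, 0.2, 0.2]]
print("majority vote:", majority_vote(member_labels))
print("averaged vote:", average_vote(member_probs))
```

Even in this short toy string, the heuristic proposes four GU and four AG positions for a single intron, which is the false-boundary problem the abstract describes. Boosting and bagging, the other two combining methods named above, also change how the members are trained, so they are left out of this sketch.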