RNA, Stochastic Context Free Grammars and Classifiers
RNA molecules have proven to play a surprisingly large role in the cellular functions of higher organisms, and they are present in much larger numbers than was anticipated only a few years ago. Annotating genomes for RNA structures and predicting their functions is therefore of great importance. Most RNA gene predictors classify a sequence into one of two classes: RNA gene or functionless background DNA. It is often also of interest to classify an RNA gene or structure into functional classes. Typically, this is done either by homology (if two sequences are similar, they probably have the same function) or by an additional classifier algorithm.
This project was written after discussing two projects on classifying microRNA that used classifiers based on a set of selected features. That approach was felt to be far from optimal for three reasons: i. It had no way of using evolutionary information, which has been the great advantage in most other annotation and classification of sequences. The present influx of data will mainly be homologous to existing sequences, making evolutionary models a necessity. ii. The classifiers used no structure prediction, which is the key component of most other RNA analysis. iii. (related to ii) The lack of structure led to no functional interpretation of the differences between types. Clearly, the proposed approach needs very different techniques from the classifier approaches mentioned above.