Welcome to MTD: Microbial Transcriptome Database

A hidden Markov support vector machine framework incorporating profile geometry learning
for identifying microbial RNA in tiling array data

Wen-Han Yu*, Hedda Høvik+, and Tsute Chen*

Bioinformatics 2010 26:1423-30
* Department of Molecular Genetics, The Forsyth Institute, Boston, MA, USA
+ Department of Oral Biology, Faculty of Dentistry, University of Oslo, Oslo, Norway
Motivation: RNA expression signals detected by high-density genomic tiling microarrays contain comprehensive transcriptomic information about the target organism. Current methods for determining RNA transcription units are still computationally intensive and lack discriminative power. This article describes an efficient and accurate methodology for revealing complicated transcriptional architecture, including small regulatory RNAs, in microbial transcriptome profiles.

Results: Normalized microarray data were first subjected to support vector regression to estimate the profile tendency by reducing noise. A hybrid supervised machine learning algorithm, hidden Markov support vector machines (HM-SVM), was then used to classify the underlying state of each probe as 'expression' or 'silence', with the assumption that the consecutive state sequence was a heterogeneous Markov chain. For model construction, we introduced a profile geometry learning method to construct the feature vectors, which considered both the intensity profiles and the changes of intensity over the probe spacing. In addition, a robust strategy was used to dynamically evaluate and select the training set based only on prior computational gene annotation. On simulated data, the algorithm was more accurate than other methods, especially for small expressed regions with a low (<1) signal-to-noise ratio (SNR), and was hence more sensitive for detecting small RNAs.
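
As a rough illustration of the two ideas in the Results paragraph (feature vectors built from intensities and their change over probe spacing, and per-probe state labelling along a Markov chain), the following minimal Python sketch may help. It is not the published HM-SVM implementation. The function names, the three-element feature layout, the weights, the bias, and the transition scores are all hypothetical and hand-picked for illustration. A trained model would learn them, and the paper's chain is heterogeneous, whereas this sketch uses one fixed switching penalty.

```python
# Illustrative sketch only. Feature layout, weights, bias, and transition
# scores are assumptions for this example, not values from the paper.

STATES = ("silence", "expression")


def geometry_features(positions, intensities):
    """Per-probe features: intensity plus left and right slopes over probe spacing."""
    n = len(positions)
    feats = []
    for i in range(n):
        left = 0.0   # slope from the previous probe (0.0 at the first probe)
        right = 0.0  # slope to the next probe (0.0 at the last probe)
        if i > 0:
            left = (intensities[i] - intensities[i - 1]) / (positions[i] - positions[i - 1])
        if i < n - 1:
            right = (intensities[i + 1] - intensities[i]) / (positions[i + 1] - positions[i])
        feats.append((intensities[i], left, right))
    return feats


def emission_score(state, feat, weights=(1.0, 0.0, 0.0), bias=-1.0):
    """Hypothetical linear score. Slope weights are left at zero here."""
    margin = sum(w * f for w, f in zip(weights, feat)) + bias
    return margin if state == "expression" else -margin


def viterbi(feats, stay=0.0, switch=-1.0):
    """Return the highest-scoring state path under additive scores."""
    scores = [{s: emission_score(s, feats[0]) for s in STATES}]
    back = []
    for feat in feats[1:]:
        row, ptr = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this probe.
            best_prev = max(
                STATES,
                key=lambda p: scores[-1][p] + (stay if p == s else switch),
            )
            trans = stay if best_prev == s else switch
            row[s] = scores[-1][best_prev] + trans + emission_score(s, feat)
            ptr[s] = best_prev
        scores.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Made-up probe positions (bp) and smoothed intensities with one noisy dip.
positions = [0, 10, 20, 30, 40, 50]
intensities = [0.1, 0.2, 2.5, 0.8, 2.1, 0.1]
feats = geometry_features(positions, intensities)
path = viterbi(feats)
print(path)
```

In this toy input the fourth probe (intensity 0.8) falls below the hypothetical threshold of 1.0, so labelling each probe independently would split the expressed region in two. The switching penalty keeps the region contiguous, which is the intuition behind treating the state sequence as a Markov chain.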

Availability and implementation: Detailed implementation steps of the algorithm and the complete results of the transcriptome analysis for the microbial genome Porphyromonas gingivalis W83 can be viewed at http://bioinformatics.forsyth.org/mtd. The HM-SVM computer algorithm is also available for download.

Contact: tchen@forsyth.org
Copyright 2007-2008 The Forsyth Institute