Academic Journals Database
Disseminating quality controlled scientific knowledge

Pattern statistics on Markov chains and sensitivity to parameter estimation

Author(s): Nuel Grégory

Journal: Algorithms for Molecular Biology
ISSN 1748-7188

Volume: 1
Issue: 1
Start page: 17
Date: 2006

Abstract

Background: To compute pattern statistics in computational biology, a Markov model is commonly used to take the sequence composition into account. Its parameters usually have to be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (for example, finding the most over-represented words in a genome or the most significant words common to a set of sequences).

Results: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations, we use the delta-method to give an explicit expression for σ, the standard deviation of a pattern statistic. This result is validated using simulations, and a simple pattern study is also considered.

Conclusion: We establish that the use of high-order Markov models can easily lead to major mistakes because of the high sensitivity of pattern statistics to parameter estimation.
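The sensitivity described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's delta-method derivation: it swaps in a parametric resampling of the estimated model to approximate the spread of a binomial pattern statistic. It assumes an order-1 Markov model on the alphabet "acgt", add-one pseudocounts, letter frequencies as a stand-in for the stationary distribution, and the score -log10 P(N >= n_obs). All function names (estimate_order1, binomial_score, score_spread, and so on) are hypothetical and chosen for illustration only.

```python
import math
import random
from statistics import pstdev

ALPHABET = "acgt"  # assumed alphabet for this illustration


def count_overlapping(seq, word):
    """Count occurrences of word in seq, allowing overlaps."""
    last = len(seq) - len(word) + 1
    return sum(1 for i in range(last) if seq.startswith(word, i))


def estimate_order1(seq, pseudo=1.0):
    """Estimate letter frequencies and order-1 transition probabilities.

    The pseudocount is an assumption that keeps every probability
    strictly between 0 and 1.
    """
    freq = {a: pseudo for a in ALPHABET}
    trans = {a: {b: pseudo for b in ALPHABET} for a in ALPHABET}
    for a in seq:
        freq[a] += 1
    for a, b in zip(seq, seq[1:]):
        trans[a][b] += 1
    total = sum(freq.values())
    freq = {a: c / total for a, c in freq.items()}
    for a in ALPHABET:
        row_total = sum(trans[a].values())
        trans[a] = {b: c / row_total for b, c in trans[a].items()}
    return freq, trans


def word_probability(word, freq, trans):
    """Probability that word occurs at a given position under the model."""
    p = freq[word[0]]
    for a, b in zip(word, word[1:]):
        p *= trans[a][b]
    return p


def binomial_score(n_obs, n_pos, p):
    """Return -log10 P(Bin(n_pos, p) >= n_obs), an over-representation score."""
    tail = 0.0
    for k in range(n_obs, n_pos + 1):
        log_pmf = (math.lgamma(n_pos + 1) - math.lgamma(k + 1)
                   - math.lgamma(n_pos - k + 1)
                   + k * math.log(p) + (n_pos - k) * math.log1p(-p))
        tail += math.exp(log_pmf)
    return -math.log10(max(tail, 1e-300))  # guard against underflow to 0


def simulate(n, freq, trans, rng):
    """Draw a sequence of length n from the order-1 model."""
    seq = rng.choices(ALPHABET, weights=[freq[a] for a in ALPHABET])
    while len(seq) < n:
        row = trans[seq[-1]]
        seq += rng.choices(ALPHABET, weights=[row[b] for b in ALPHABET])
    return "".join(seq)


def score_spread(seq, word, replicates=50, seed=0):
    """Return (score, empirical std dev of the score under re-estimation).

    The observed count is held fixed; only the model parameters are
    re-estimated from sequences simulated under the fitted model.
    """
    rng = random.Random(seed)
    n_obs = count_overlapping(seq, word)
    n_pos = len(seq) - len(word) + 1
    freq, trans = estimate_order1(seq)
    scores = []
    for _ in range(replicates):
        f, t = estimate_order1(simulate(len(seq), freq, trans, rng))
        scores.append(binomial_score(n_obs, n_pos, word_probability(word, f, t)))
    score = binomial_score(n_obs, n_pos, word_probability(word, freq, trans))
    return score, pstdev(scores)
```

Calling score_spread(seq, "acgt") on a sequence seq returns the score together with an empirical standard deviation. Comparing that spread with the gap between the scores of two candidate words gives a rough sense of whether their ranking is reliable. A higher-order model has more parameters to estimate from the same data, which is the situation the conclusion warns about.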