Recursive Learning for Sparse Markov Models

J Xiong; V Jääskinen; Jukka Corander

doi:10.1214/15-BA949

Research output: Contribution to journal › Article › Scientific › peer-review

12 Citations (Scopus)

Abstract

Markov chains of higher order are popular models for a wide variety of applications in natural language and DNA sequence processing. However, since the number of parameters grows exponentially with the order of a Markov chain, several alternative model classes have been proposed that allow for greater stability and a higher rate of data compression. The notion common to these models is that they cluster the possible sample paths used to predict the next state into invariance classes, with identical conditional distributions assigned to all paths in the same class. The models vary in particular with respect to the constraints imposed on legitimate partitions of the sample paths. Here we consider the class of sparse Markov chains, for which the partition is left unconstrained a priori. A recursive computation scheme based on Delaunay triangulation of the parameter space is introduced to enable fast approximation of the posterior mode partition. Comparisons with stochastic optimization, k-means and nearest neighbor algorithms show that our approach is both considerably faster and leads on average to a more accurate estimate of the underlying partition. We show additionally that the criterion used in the recursive steps for comparison of triangulation cell contents leads to consistent estimation of the local structure in the sparse Markov model.
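As a rough illustration of the parameter counts the abstract refers to, the following Python sketch compares a full order-2 Markov chain over a DNA alphabet with a sparse one in which contexts are grouped into invariance classes. The partition rule (grouping contexts by their number of G/C symbols), the probabilities, and all function names are hypothetical and chosen only for illustration; the paper's Delaunay-based recursive learning scheme is not implemented here.

```python
from itertools import product

ALPHABET = "ACGT"  # DNA alphabet, as in the sequence-analysis setting


def full_param_count(alphabet_size: int, order: int) -> int:
    """Free parameters of a full Markov chain: one distribution per context."""
    return alphabet_size ** order * (alphabet_size - 1)


def sparse_param_count(alphabet_size: int, num_classes: int) -> int:
    """Free parameters when contexts share one distribution per class."""
    return num_classes * (alphabet_size - 1)


# All order-2 contexts: AA, AC, ..., TT (16 in total).
contexts = ["".join(p) for p in product(ALPHABET, repeat=2)]

# Hypothetical partition: class label = number of G/C symbols in the context.
partition = {ctx: sum(ch in "GC" for ch in ctx) for ctx in contexts}

# One made-up conditional distribution over the next state per class.
class_distributions = {
    0: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    1: {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    2: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def next_state_distribution(context: str) -> dict:
    """Look up the distribution shared by every context in the same class."""
    return class_distributions[partition[context]]


num_classes = len(set(partition.values()))
print(full_param_count(len(ALPHABET), 2))              # 48
print(sparse_param_count(len(ALPHABET), num_classes))  # 9
```

In this toy setting the contexts "AT" and "TA" fall into the same class and therefore share one conditional distribution, which is what reduces the parameter count from 48 to 9. In a sparse Markov chain the partition itself is unknown and unconstrained, so it has to be learned from data rather than fixed by a rule as above.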

Original language	Undefined/Unknown
Pages (from-to)	247–263
Number of pages	17
Journal	Bayesian Analysis
Volume	11
Issue number	1
DOIs	https://doi.org/10.1214/15-BA949
Publication status	Published - 2016
MoE publication type	A1 Journal article-refereed

Keywords

clustering
Delaunay triangulation
recursive learning
sequence analysis
sparse Markov chains

Access to Document

10.1214/15-BA949

Cite this

@article{7de1bd79feb14bab86e866ffc69acf72,

title = "Recursive Learning for Sparse Markov Models",

abstract = "Markov chains of higher order are popular models for a wide variety of applications in natural language and DNA sequence processing. However, since the number of parameters grows exponentially with the order of a Markov chain, several alternative model classes have been proposed that allow for greater stability and a higher rate of data compression. The notion common to these models is that they cluster the possible sample paths used to predict the next state into invariance classes, with identical conditional distributions assigned to all paths in the same class. The models vary in particular with respect to the constraints imposed on legitimate partitions of the sample paths. Here we consider the class of sparse Markov chains, for which the partition is left unconstrained a priori. A recursive computation scheme based on Delaunay triangulation of the parameter space is introduced to enable fast approximation of the posterior mode partition. Comparisons with stochastic optimization, k-means and nearest neighbor algorithms show that our approach is both considerably faster and leads on average to a more accurate estimate of the underlying partition. We show additionally that the criterion used in the recursive steps for comparison of triangulation cell contents leads to consistent estimation of the local structure in the sparse Markov model.",

keywords = "clustering, Delaunay triangulation, recursive learning, sequence analysis, sparse Markov chains",

author = "J Xiong and V J{\"a}{\"a}skinen and Jukka Corander",

year = "2016",

doi = "10.1214/15-BA949",

language = "Undefined/Unknown",

volume = "11",

pages = "247--263",

journal = "Bayesian Analysis",

issn = "1936-0975",

publisher = "International Society for Bayesian Analysis (ISBA)",

number = "1",

}

TY - JOUR

T1 - Recursive Learning for Sparse Markov Models

AU - Xiong, J

AU - Jääskinen, V

AU - Corander, Jukka

PY - 2016

Y1 - 2016

N2 - Markov chains of higher order are popular models for a wide variety of applications in natural language and DNA sequence processing. However, since the number of parameters grows exponentially with the order of a Markov chain, several alternative model classes have been proposed that allow for greater stability and a higher rate of data compression. The notion common to these models is that they cluster the possible sample paths used to predict the next state into invariance classes, with identical conditional distributions assigned to all paths in the same class. The models vary in particular with respect to the constraints imposed on legitimate partitions of the sample paths. Here we consider the class of sparse Markov chains, for which the partition is left unconstrained a priori. A recursive computation scheme based on Delaunay triangulation of the parameter space is introduced to enable fast approximation of the posterior mode partition. Comparisons with stochastic optimization, k-means and nearest neighbor algorithms show that our approach is both considerably faster and leads on average to a more accurate estimate of the underlying partition. We show additionally that the criterion used in the recursive steps for comparison of triangulation cell contents leads to consistent estimation of the local structure in the sparse Markov model.

KW - clustering

KW - Delaunay triangulation

KW - recursive learning

KW - sequence analysis

KW - sparse Markov chains

U2 - 10.1214/15-BA949

DO - 10.1214/15-BA949

M3 - Article

SN - 1936-0975

VL - 11

SP - 247

EP - 263

JO - Bayesian Analysis

JF - Bayesian Analysis

IS - 1

ER -