# Labeled directed acyclic graphs: a generalization of context-specific independence in directed graphical models

A1 Original article in a scientific journal (peer-reviewed)

Internal authors/editors

Authors of the publication: Johan Pensar, Henrik Nyman, Timo Koski, Jukka Corander

Publication year: 2014

Journal: Data Mining and Knowledge Discovery

Volume: 29

First page: 503

Last page: 533

Abstract

We introduce a novel class of labeled directed acyclic graph (LDAG) models for finite sets of discrete variables. LDAGs generalize earlier proposals for allowing local structures in the conditional probability distribution of a node, such that unrestricted label sets determine which edges can be deleted from the underlying directed acyclic graph (DAG) for a given context. Several properties of these models are derived, including a generalization of the concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is enabled by introducing an LDAG-based factorization of the Dirichlet prior for the model parameters, such that the marginal likelihood can be calculated analytically. In addition, we develop a novel prior distribution for the model structures that can appropriately penalize a model for its labeling complexity. A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill climbing approach is used for illustrating the useful properties of LDAG models for both real and synthetic data sets.
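As a rough illustration of the ideas in the abstract (not the authors' implementation), the sketch below scores a single binary node Y with two binary parents A and B. It assumes a label on the edge A → Y that lists the values of B for which Y is taken to be independent of A. In those contexts the counts are pooled over A, and each resulting group is scored with the standard closed-form Dirichlet-multinomial marginal likelihood. The function names, the toy counts, and the uniform Dirichlet(1, 1) pseudo-counts are assumptions made for the example; the paper's LDAG-based prior factorization and its structure prior are not reproduced here.

```python
from collections import defaultdict
from math import lgamma


def log_dirichlet_multinomial(counts, alpha):
    """Closed-form log marginal likelihood of categorical counts
    under a Dirichlet(alpha) prior on the outcome probabilities."""
    total_n = sum(counts)
    total_a = sum(alpha)
    result = lgamma(total_a) - lgamma(total_a + total_n)
    for n_k, a_k in zip(counts, alpha):
        result += lgamma(a_k + n_k) - lgamma(a_k)
    return result


def pool_counts(counts_by_config, label_b_values):
    """Pool the counts of Y over A for every value of B in the label.

    counts_by_config maps a parent configuration (a, b) to [n(Y=0), n(Y=1)].
    Assumption: a label entry b means "Y does not depend on A when B = b",
    so the rows (0, b) and (1, b) share one conditional distribution.
    """
    pooled = defaultdict(lambda: [0, 0])
    for (a, b), counts in counts_by_config.items():
        key = ("*", b) if b in label_b_values else (a, b)
        pooled[key][0] += counts[0]
        pooled[key][1] += counts[1]
    return dict(pooled)


def node_log_score(counts_by_config, label_b_values, alpha=(1.0, 1.0)):
    """Sum the closed-form scores over the pooled parent contexts.

    The Dirichlet(1, 1) default is an arbitrary choice for illustration.
    """
    pooled = pool_counts(counts_by_config, label_b_values)
    return sum(log_dirichlet_multinomial(c, alpha) for c in pooled.values())


# Toy counts (made up): when B = 0, Y behaves similarly for both values of A.
toy_counts = {
    (0, 0): [8, 2],
    (1, 0): [9, 1],
    (0, 1): [2, 8],
    (1, 1): [7, 3],
}

score_plain_dag = node_log_score(toy_counts, label_b_values=set())
score_with_label = node_log_score(toy_counts, label_b_values={0})
print("log score without label:", score_plain_dag)
print("log score with label {B=0} on A -> Y:", score_with_label)
```

Pooling contexts reduces the number of distinct conditional distributions that must be estimated for the node, and comparing the two scores shows how a label changes the fit on the same data.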

Keywords

context-specific interaction model, directed acyclic graph, graphical model, Markov chain Monte Carlo