# Labeled directed acyclic graphs: a generalization of context-specific independence in directed graphical models

A1 Journal article (refereed)

Publication Details

List of Authors: Johan Pensar, Henrik Nyman, Timo Koski, Jukka Corander

Publication year: 2014

Journal: Data Mining and Knowledge Discovery

Volume number: 29

Start page: 503

End page: 533

Abstract

We introduce a novel class of labeled directed acyclic graph (LDAG) models for finite sets of discrete variables. LDAGs generalize earlier proposals for allowing local structures in the conditional probability distribution of a node, such that unrestricted label sets determine which edges can be deleted from the underlying directed acyclic graph (DAG) for a given context. Several properties of these models are derived, including a generalization of the concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is enabled by introducing an LDAG-based factorization of the Dirichlet prior for the model parameters, such that the marginal likelihood can be calculated analytically. In addition, we develop a novel prior distribution for the model structures that can appropriately penalize a model for its labeling complexity. A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill climbing approach is used for illustrating the useful properties of LDAG models for both real and synthetic data sets.

Keywords

context-specific interaction model, directed acyclic graph, graphical model, Markov chain Monte Carlo