This function computes the Markov blanket of a set of nodes given a DAG (Directed Acyclic Graph).
Arguments
- dag
a matrix or a formula statement (see details for format) defining the network structure, a directed acyclic graph (DAG).
- node
a character vector of the nodes for which the Markov blanket should be returned.
- data.dists
a named list giving the distribution for each node in the network, see details.
- data.df
a data frame containing the data for the nodes in the network. Only needed if
dag is a formula statement.
Details
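This function returns the Markov blanket of a set of nodes given a DAG.
The Markov blanket of a node consists of its parents, its children, and the other parents of its children.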
The dag can be provided as a matrix whose row and column names are the node names.
The matrix should be binary, where 1 indicates an edge from the column node (parent) to the row node (child).
The diagonal of the matrix should be 0 and the matrix should be acyclic.
The node names should be the same as the names of the distributions in data.dists.
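The matrix requirements above (binary entries, zero diagonal, no cycles) can be checked before calling the function. The following is a minimal base R sketch; `is_valid_dag` is a hypothetical helper written for illustration and is not part of the package, which may perform its own checks differently.

```r
# Hypothetical helper (not a package function): checks that a matrix is a
# valid DAG adjacency matrix with rows as children and columns as parents.
is_valid_dag <- function(dag) {
  if (!is.matrix(dag) || nrow(dag) != ncol(dag)) return(FALSE)
  if (!all(dag %in% c(0, 1))) return(FALSE)  # binary entries only
  if (any(diag(dag) != 0)) return(FALSE)     # no self-loops
  # Acyclicity: repeatedly remove nodes without parents (all-zero row).
  remaining <- seq_len(nrow(dag))
  while (length(remaining) > 0) {
    sub <- dag[remaining, remaining, drop = FALSE]
    roots <- remaining[rowSums(sub) == 0]
    # If every remaining node still has a parent, there must be a cycle.
    if (length(roots) == 0) return(FALSE)
    remaining <- setdiff(remaining, roots)
  }
  TRUE
}

ok  <- matrix(c(0, 1,
                0, 0), nrow = 2, byrow = TRUE)  # node 1 has parent node 2
cyc <- matrix(c(0, 1,
                1, 0), nrow = 2, byrow = TRUE)  # nodes 1 and 2 point at each other
is_valid_dag(ok)
is_valid_dag(cyc)
```

The acyclicity test removes parentless nodes one layer at a time; if at some point no such node exists, the remaining nodes contain a cycle.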
Alternatively, the dag can be provided using a formula statement (similar to glm).
This requires the data.dists and data.df arguments to be provided.
A typical formula is ~ node1|parent1:parent2 + node2:node3|parent3.
The formula statement has to start with ~.
In this example, node1 has two parents (parent1 and parent2).
node2 and node3 are children of the same parent (parent3).
The parent names have to exactly match those given in name.
The operators are:
- : separates multiple children or multiple parents,
- | separates children (left side) from parents (right side),
- + separates terms,
- . replaces all the variables in name.
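To make the formula notation concrete, here is a rough base R sketch that turns the example formula into the equivalent adjacency matrix (rows are children, columns are parents). `formula_to_adj` is a hypothetical helper, not the package's own parser; it takes the formula as a character string and a vector of node names (both assumptions made for illustration), and it does not handle the . shorthand.

```r
# Hypothetical parser (not a package function). Takes the formula as a
# string and returns a binary matrix with rows = children, columns = parents.
# The "." shorthand is deliberately not supported in this sketch.
formula_to_adj <- function(f, nodes) {
  f <- gsub("[[:space:]]", "", f)          # drop whitespace
  stopifnot(startsWith(f, "~"))            # the statement must start with ~
  adj <- matrix(0, length(nodes), length(nodes),
                dimnames = list(nodes, nodes))
  terms <- strsplit(sub("^~", "", f), "+", fixed = TRUE)[[1]]
  for (term in terms) {
    sides    <- strsplit(term, "|", fixed = TRUE)[[1]]
    children <- strsplit(sides[1], ":", fixed = TRUE)[[1]]
    parents  <- if (length(sides) > 1) {
      strsplit(sides[2], ":", fixed = TRUE)[[1]]
    } else {
      character(0)                         # term without parents
    }
    stopifnot(all(c(children, parents) %in% nodes))
    # Every child in the term receives every parent in the term.
    if (length(parents) > 0) adj[children, parents] <- 1
  }
  adj
}

nodes <- c("node1", "node2", "node3", "parent1", "parent2", "parent3")
adj <- formula_to_adj("~ node1|parent1:parent2 + node2:node3|parent3", nodes)
adj
```

In the resulting matrix, the row for node1 has a 1 in the parent1 and parent2 columns, and the rows for node2 and node3 each have a 1 in the parent3 column.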
Examples
## Defining distribution and dag
dist <- list(a="gaussian", b="gaussian", c="gaussian", d="gaussian",
e="binomial", f="binomial")
dag <- matrix(c(0,1,1,0,1,0,
0,0,1,1,0,1,
0,0,0,0,0,0,
0,0,0,0,0,0,
0,0,0,0,0,1,
0,0,0,0,0,0), nrow = 6L, ncol = 6L, byrow = TRUE)
colnames(dag) <- rownames(dag) <- names(dist)
mb(dag, node = "b", data.dists = dist)
#> [1] "a" "c" "d" "f" "e"
mb(dag, node = c("b","e"), data.dists = dist)
#> [1] "a" "c" "d" "f" "e" "b"
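The results above can be cross-checked by hand. The sketch below computes a Markov blanket directly from the adjacency matrix using the standard definition (parents, children, and the children's other parents). `markov_blanket` is a hypothetical helper, not the package implementation; for several nodes it takes the union of the per-node blankets, which is an assumption chosen because it reproduces the sets shown above. The order of the returned names may differ from the output of mb.

```r
# Hypothetical helper (not a package function): Markov blanket from an
# adjacency matrix where rows are children and columns are parents.
markov_blanket <- function(dag, node) {
  nodes <- colnames(dag)
  blanket <- character(0)
  for (n in node) {
    parents  <- nodes[dag[n, ] == 1]           # columns set in the node's row
    children <- rownames(dag)[dag[, n] == 1]   # rows set in the node's column
    # All parents of each child; the node itself is removed below.
    coparents <- unlist(lapply(children, function(ch) nodes[dag[ch, ] == 1]))
    # Assumption: for a set of nodes, take the union of per-node blankets.
    blanket <- union(blanket, setdiff(c(parents, children, coparents), n))
  }
  blanket
}

# Same DAG as in the example above.
dag <- matrix(c(0,1,1,0,1,0,
                0,0,1,1,0,1,
                0,0,0,0,0,0,
                0,0,0,0,0,0,
                0,0,0,0,0,1,
                0,0,0,0,0,0), nrow = 6L, ncol = 6L, byrow = TRUE)
colnames(dag) <- rownames(dag) <- c("a", "b", "c", "d", "e", "f")

mb_b  <- markov_blanket(dag, "b")
mb_be <- markov_blanket(dag, c("b", "e"))
```

For node b, the parents are c, d and f, the only child is a, and the other parents of a are c and e, which gives the same five nodes as the first call to mb.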
