With this vignette we aim to provide a basic introduction to the
structure learning of Bayesian networks with the abn
package.
Structure Learning of Bayesian Networks
The structure learning of Bayesian networks is the process of estimating the (in-)dependencies between the variables of the network that results in a directed acyclic graph (DAG) where the nodes represent the variables and the edges represent the dependencies between the variables. Structure learning of Bayesian networks is a challenging problem and there are several algorithms to solve it (see Koller and Friedman (2009) for a comprehensive review).
The abn
package currently offers four distinct
algorithms for Bayesian network structure learning:
mostProbable()
: This exact order-based structure learning algorithm identifies the most probable posterior DAG following the method of Koivisto and Sood (2004). For details see the help page ofmostProbable()
.searchHillClimber()
: The Hill-climber algorithm is a single move algorithm. At each step, an arc is attempted to be added. The new score is computed and compared to the previous network’s score. It deviates slightly from the original method proposed by Heckerman, Geiger, and Chickering (1995) by utilizing a full cache of all possible local combinations as provided bybuildScoreCache()
. The algorithm considers all potential single swaps in the DAG at each step, selecting the swap that maximizes the goodness of fit. While multiple arc changes are considered at each step, arc reversal requires two steps. This implementation (inC
), which is more intuitive with a pre-computed cache of local scores, is optimized for theabn
workflow. For details see the help page ofsearchHillClimber()
.-
searchHeuristic()
: This function is a flexible implementation of multiple greedy heuristic algorithms, particularly well adapted to theabn
framework. It targets multi-random restarts heuristic algorithms and allows the user to select the number of searches and the maximum number of steps within each search. The function implements three algorithms selected with the parameteralgo
:hc
,tabu
, orsa
.algo=hc
: This alternative implementation of the greedy hill-climbing approach that is fully written in R, unlikesearchHillClimber()
andmostProbable()
which are written inC
. It performs a local stepwise optimization, choosing a structural move and computing the score’s change. This function is closer to the MCMCMC algorithm from Madigan, York, and Allard (1995) and Paolo Giudici and Roberto Castello (2003) with a single edge move.algo=tabu
: The same algorithm ashc
is used, but a list of banned moves is computed. The parametertabu.memory
controls the length of the tabu list. This forces the algorithm to escape the local maximum by choosing some not already used moves.algo=sa
: This variant of the heuristic search algorithm is based on simulated annealing described by Metropolis et al. (1953). Some accepted moves could result in a decrease of the network score. The acceptance rate can be monitored with the parametertemperature
.
For more information, refer to the help page
searchHeuristic()
.