Returns a set of graph metrics for comparing two DAGs or essential graphs. Together, these metrics are known as a confusion matrix or error matrix.

## Arguments

- ref
a matrix or a formula statement (see details for format) defining the reference network structure, a directed acyclic graph (DAG). Note that row names must be set, or given in `node.names` if the DAG is given via a formula statement.

- test
a matrix or a formula statement (see details for format) defining the test network structure, a directed acyclic graph (DAG). Note that row names must be set, or given in `node.names` if the DAG is given via a formula statement.

## Value

`TP`

True Positive

`TN`

True Negative

`FP`

False Positive

`FN`

False Negative

`CP`

Condition Positive (ref)

`CN`

Condition Negative (ref)

`PCP`

Predicted Condition Positive (test)

`PCN`

Predicted Condition Negative (test)

`True Positive Rate`

$$\frac{\sum TP}{\sum CP}$$

`False Positive Rate`

$$\frac{\sum FP}{\sum CN}$$

`Accuracy`

$$\frac{\sum TP + \sum TN}{\text{Total population}}$$

`G-measure`

$$\sqrt {{\frac {TP}{TP+FP}}\cdot {\frac {TP}{TP+FN}}}$$

`F1-Score`

$$\frac{2 \sum TP}{2 \sum TP + \sum FN + \sum FP}$$

`Positive Predictive Value`

$$\frac{\sum TP}{\sum PCP}$$

`False Omission Rate`

$$\frac{\sum FN}{\sum PCN}$$

`Hamming-Distance`

Number of changes needed to match the matrices.
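
As an illustration of the formulas above, here is a minimal base-R sketch that counts the cells of two binary adjacency matrices and derives the rates from those counts. The helper `confusion_metrics()` is hypothetical and not part of the package. It assumes a plain cell-wise comparison, so its values may differ from those returned by `compareDag()`, which works on a weighted difference of the adjacency matrices.

```r
# Hypothetical helper (not part of the package): cell-wise comparison of two
# binary adjacency matrices with identical dimensions and node ordering.
confusion_metrics <- function(ref, test) {
  stopifnot(identical(dim(ref), dim(test)))

  # Counts of agreeing and disagreeing cells
  TP <- sum(ref == 1 & test == 1)
  TN <- sum(ref == 0 & test == 0)
  FP <- sum(ref == 0 & test == 1)
  FN <- sum(ref == 1 & test == 0)

  # Marginal totals: condition (ref) and predicted condition (test)
  CP  <- TP + FN
  CN  <- TN + FP
  PCP <- TP + FP
  PCN <- TN + FN

  list(
    TPR         = TP / CP,
    FPR         = FP / CN,
    Accuracy    = (TP + TN) / length(ref),
    `G-measure` = sqrt((TP / PCP) * (TP / CP)),
    `F1-score`  = 2 * TP / (2 * TP + FN + FP),
    PPV         = TP / PCP,
    FOR         = FN / PCN
  )
}

ref.m  <- matrix(c(0, 0, 0, 1, 0, 0, 1, 0, 0), nrow = 3)
test.m <- matrix(c(0, 1, 0, 0, 0, 0, 1, 0, 0), nrow = 3)
m <- confusion_metrics(ref.m, test.m)
```

With these two matrices there is one shared arc, one arc only in `test.m`, and one arc only in `ref.m`, so the sketch gives a true positive rate of 1/2 and an accuracy of 7/9.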

## Details

This R function returns standard Directed Acyclic Graph comparison metrics. In statistical classification, those metrics are known as a confusion matrix or error matrix.

These metrics help to visualize the differences between DAGs. When comparing the true structure to a learned structure, or two learned structures to each other, they allow the user to estimate the performance of the learning algorithm. To compute the metrics, a contingency table is built from a weighted difference of the adjacency matrices of the two graphs.

The `ref` or `test` can be provided using a formula statement (similar to GLM input). A typical formula is `~ node1|parent1:parent2 + node2:node3|parent3`. The formula statement has to start with `~`. In this example, node1 has two parents (parent1 and parent2). node2 and node3 have the same parent, parent3. The parent names have to exactly match those given in `node.names`. `:` is the separator between either children or parents, `|` separates children (left side) and parents (right side), `+` separates terms, and `.` replaces all the variables in `node.names`.
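
To make the formula notation concrete, the following base-R sketch builds by hand the adjacency matrix that the example formula describes. It assumes the convention that rows are children and columns are parents, and that `node.names` lists the six nodes in the order shown. Both are assumptions made for illustration, not statements about the package internals.

```r
# Node names taken from the example formula:
# ~ node1|parent1:parent2 + node2:node3|parent3
node.names <- c("node1", "node2", "node3", "parent1", "parent2", "parent3")

# Empty adjacency matrix with named rows and columns
dag <- matrix(0, nrow = 6, ncol = 6,
              dimnames = list(node.names, node.names))

# Assumed convention: rows are children, columns are parents
dag["node1", c("parent1", "parent2")] <- 1   # node1|parent1:parent2
dag[c("node2", "node3"), "parent3"]   <- 1   # node2:node3|parent3

dag
```

Each term of the formula sets one block of cells: every child on the left of `|` gets an arc from every parent on the right, so the matrix ends up with four arcs in total.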

To compare essential graphs (or graphs in general), the DAG check needs to be switched off with `checkDAG=FALSE`. The function `compareEG()` is a wrapper to `compareDag(, checkDAG=FALSE)`.

## References

Sammut, Claude, and Geoffrey I. Webb. (2017). Encyclopedia of machine learning and data mining. Springer.

## Examples

```
test.m <- matrix(data = c(0,1,0,
0,0,0,
1,0,0), nrow = 3, ncol = 3)
ref.m <- matrix(data = c(0,0,0,
1,0,0,
1,0,0), nrow = 3, ncol = 3)
colnames(test.m) <- rownames(test.m) <- colnames(ref.m) <- rownames(ref.m) <- c("a", "b", "c")
unlist(compareDag(ref = ref.m, test = test.m))
#> TPR FPR Accuracy FDR
#> 0.5000000 0.1428571 0.7777778 0.5000000
#> G-measure F1-score PPV FOR
#> 0.5000000 2.0000000 0.5000000 0.5000000
#> Hamming-distance
#> 1.0000000
```