Setting up the environment

In [ ]:

```
# you may need to install tikzmagic, try the following:
#
# https://github.com/robjstan/tikzmagic
# https://anaconda.org/conda-forge/tikzmagic
import numpy as np
import tikzmagic
```

Recall the meaning of the following terms:

- Joint probability distribution
- Marginal distribution
- Conditional distribution

Consider the following table defining the joint probability distribution of two variables $A$ and $B$.

$p(A,B)$ | A=$\square$ | A=$\bigcirc$ | A=$\clubsuit$ | A=$\heartsuit$ | A=$\triangle$ |
---|---|---|---|---|---|
B=$p$ | 0.01 | 0.01 | 0.12 | 0.01 | 0.14 |
B=$q$ | 0.03 | 0.15 | 0.01 | 0.01 | 0.01 |
B=$r$ | 0.13 | 0.11 | 0.07 | 0.18 | 0.01 |

Compute the following distributions:

- $p(B=p | A = \bigcirc)$
- $p(B | A = \bigcirc)$
- $p(B)$

You may do this calculation by hand or using python.
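If you take the python route, the pattern is just array slicing plus the sum rule. Here is a minimal sketch on a made-up $2\times 2$ joint table (deliberately not the table above, so the exercise is still yours to do):

```python
import numpy as np

# Hypothetical 2x2 joint distribution p(X, Y): rows index X, columns index Y.
joint = np.array([[0.1, 0.2],
                  [0.3, 0.4]])

# Marginal p(X): sum out Y (the sum rule).
p_x = joint.sum(axis=1)                    # -> [0.3, 0.7]

# Conditional p(Y | X=0): slice the row and renormalise (the product rule).
p_y_given_x0 = joint[0] / joint[0].sum()   # -> [1/3, 2/3]

print(p_x, p_y_given_x0)
```

The same slicing and summing works for the $3\times 5$ table above, with the axes swapped as needed.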

In [ ]:

```
# Solution goes here
```

Recall that there are only two rules of probability: the sum rule and the product rule. Using these two rules, prove Bayes' rule: $$p(Y|X) = \frac{p(X|Y)p(Y)}{\sum_Y p(X,Y)}$$ Observe that the right-hand side is expressed entirely in terms of $p(X|Y)$ and $p(Y)$, and that the denominator is just $p(X)$ written via the sum rule.
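Before proving the identity, it can help to watch it hold numerically. This sketch evaluates both sides on a made-up $2\times 2$ joint table (hypothetical numbers, not the $p(A,B)$ table):

```python
import numpy as np

# Hypothetical 2x2 joint p(X, Y): rows index X, columns index Y.
joint = np.array([[0.1, 0.2],
                  [0.3, 0.4]])

x = 0  # condition on X = x

# Left-hand side: p(Y | X=x) by conditioning the joint directly.
lhs = joint[x] / joint[x].sum()

# Right-hand side: p(X=x | Y) p(Y) / sum_Y p(X=x, Y).
p_y = joint.sum(axis=0)               # marginal p(Y)
p_x_given_y = joint[x] / p_y          # p(X=x | Y=y) for each y
rhs = p_x_given_y * p_y / joint[x].sum()

assert np.allclose(lhs, rhs)
print(lhs, rhs)
```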

Using the distribution $p(A,B)$ above, compute all the terms in Bayes' rule and verify the result.

In [ ]:

```
# Solution goes here
```

Consider the following problem with 5 random variables.

- **A**ches with states (False, True)
- **B**ronchitis with states (none, mild, severe)
- **C**ough with states (False, True)
- **D**isease with states (healthy, carrier, sick, recovering)
- **E**mergency with states (False, True)

How much memory would be needed to store the joint probability distribution if:

- all variables were dependent?
- all variables were independent?
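The counting argument boils down to a product versus a sum of state counts. A sketch with two hypothetical variables (3 and 4 states, deliberately not the five variables above):

```python
import math

# Hypothetical state counts for two variables: X with 3 states, Y with 4.
states = [3, 4]

# Fully dependent: one table entry per joint configuration.
dependent = math.prod(states)

# Fully independent: one (much smaller) table per variable.
independent = sum(states)

print(dependent, independent)  # -> 12 7
```

The same two one-liners answer the question for the five variables above.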

Consider the following graphical model.

In [ ]:

```
%%tikz
\tikzstyle{vertex}=[circle, draw=black, fill=white, line width=0.5mm, minimum size=10pt, inner sep=0pt]
\tikzstyle{edge} = [draw, line width=0.25mm, ->]
\node[vertex,label=left:{Aches}] (a) at (0,0) {};
\node[vertex,label=above:{Bronchitis}] (b) at (1,1) {};
\node[vertex,label=above:{Cough}] (c) at (3,1) {};
\node[vertex,label=left:{Disease}] (d) at (2,0) {};
\node[vertex,label=right:{Emergency}] (e) at (3,-1) {};
\foreach \source/ \dest in {b/a, b/d, c/d, d/e}
\path[edge] (\source) -- (\dest);
```

How much memory is needed to store the joint probability distribution? Identify the conditional independences in the graph. Consider both cases: when variables are observed and when they are unobserved.

*A random variable $X$ is independent of $Y$ given $Z$ (written $X\perp Y | Z$) if and only if
$$p(X|Y,Z) = p(X|Z).$$
Equivalently this can be seen as a generalisation of the factorisation property when you have independence,
\begin{align*}
p(X,Y|Z) & = p(X|Y,Z) p(Y|Z) \\
& = p(X|Z) p(Y|Z)
\end{align*}
The first equality above is just the product rule.*
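The definition above can also be checked numerically: build a joint that factorises as $p(z)\,p(x|z)\,p(y|z)$ and confirm that $p(X|Y,Z) = p(X|Z)$ for every setting of $(y,z)$. A sketch with made-up tables:

```python
import numpy as np

# Hypothetical joint p(X, Y, Z) constructed so that X is independent of Y
# given Z, i.e. p(x, y, z) = p(z) p(x|z) p(y|z).
p_z = np.array([0.4, 0.6])
p_x_given_z = np.array([[0.2, 0.8],    # rows: z, cols: x
                        [0.5, 0.5]])
p_y_given_z = np.array([[0.3, 0.7],    # rows: z, cols: y
                        [0.9, 0.1]])
joint = np.einsum('z,zx,zy->xyz', p_z, p_x_given_z, p_y_given_z)

# p(X | Y, Z): normalise over the X axis.
p_x_given_yz = joint / joint.sum(axis=0, keepdims=True)

# Verify p(X | Y=y, Z=z) == p(X | Z=z) for every (y, z).
for y in range(2):
    for z in range(2):
        assert np.allclose(p_x_given_yz[:, y, z], p_x_given_z[z])
print("conditional independence verified")
```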

By observing the structure of the graphical model and using the sum rule and product rule, prove the conditional independences that you have identified.

Consider the following tables.

$p(B)$ | B=n | B=m | B=s |
---|---|---|---|
marginal | 0.97 | 0.01 | 0.02 |

$p(C)$ | C=False | C=True |
---|---|---|
marginal | 0.7 | 0.3 |

$p(A | B)$ | B=n | B=m | B=s |
---|---|---|---|
A=False | 0.9 | 0.8 | 0.3 |
A=True | 0.1 | 0.2 | 0.7 |

$p(D | B,C)$ | B=n, C=F | B=m, C=F | B=s, C=F | B=n, C=T | B=m, C=T | B=s, C=T |
---|---|---|---|---|---|---|
D=healthy | 0.9 | 0.8 | 0.1 | 0.3 | 0.4 | 0.01 |
D=carrier | 0.08 | 0.17 | 0.01 | 0.05 | 0.05 | 0.01 |
D=sick | 0.01 | 0.01 | 0.87 | 0.05 | 0.15 | 0.97 |
D=recovering | 0.01 | 0.02 | 0.02 | 0.6 | 0.4 | 0.01 |

$p(E | D)$ | D=h | D=c | D=s | D=r |
---|---|---|---|---|
E=False | 0.99 | 0.99 | 0.4 | 0.9 |
E=True | 0.01 | 0.01 | 0.6 | 0.1 |

Compute the following:

- p(A,B,C,D,E)
- p(E)
- p(E|B=s)
- p(E|B=s, C=T)

Note that there are two ways of arriving at these distributions:

- by computing p(A,B,C,D,E) and then marginalising and conditioning appropriately;
- by computing only the required distributions directly, using the graphical model structure.

Check that both ways give the same answer.
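For a tiny chain $X \to Y$ with hypothetical numbers, the two routes look like this (the full exercise is the same pattern with more factors):

```python
import numpy as np

# Hypothetical chain X -> Y.
p_x = np.array([0.3, 0.7])
p_y_given_x = np.array([[0.9, 0.1],    # rows: x, cols: y
                        [0.2, 0.8]])

# Route 1: build the full joint p(x, y), then marginalise out X.
joint = p_x[:, None] * p_y_given_x
p_y_via_joint = joint.sum(axis=0)

# Route 2: use the factorisation directly: p(y) = sum_x p(x) p(y|x).
p_y_direct = p_x @ p_y_given_x

assert np.allclose(p_y_via_joint, p_y_direct)
print(p_y_direct)
```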

In [ ]:

```
# Solution goes here
```

- Code a class which represents a PMF, and has a `sample` method which draws a sample from the PMF.
- Use it to draw samples from $p(C)$ and empirically compute and print the probability that your sampler returns the value `False`.
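One possible shape for such a class, sketched with `numpy.random.default_rng` and demonstrated on a hypothetical biased coin (not $p(C)$ above, so the exercise remains):

```python
import numpy as np

class PMF:
    """A discrete PMF over arbitrary values (a minimal sketch)."""

    def __init__(self, values, probs, seed=None):
        self.values = list(values)
        self.probs = np.asarray(probs, dtype=float)
        assert np.isclose(self.probs.sum(), 1.0), "probabilities must sum to 1"
        self.rng = np.random.default_rng(seed)

    def sample(self):
        # Draw one value according to the stored probabilities.
        idx = self.rng.choice(len(self.values), p=self.probs)
        return self.values[idx]

# Empirical check on a hypothetical coin with p(False) = 0.6.
pmf = PMF([False, True], [0.6, 0.4], seed=0)
draws = [pmf.sample() for _ in range(10_000)]
print(sum(not d for d in draws) / len(draws))  # roughly 0.6
```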

In [ ]:

```
# Solution goes here
```

Convert the Bayesian Network above to a Markov Random Field. Draw the resulting network in tikz.

In [ ]:

```
# Solution goes here
```

Identify the maximal cliques in the graph and write down the corresponding clique potentials. Compare the conditional independence statements of the MRF with the BN.