Finally we come to a classic application of Gibbs sampling: the Ising model, a simple model from statistical mechanics.
Select $n$ and consider $\mathbb P(x_n = 1 | \text{all other } x_m )$. This is equal to
$$ \frac{\mathbb P(x_n=1 , (x_m))}{\mathbb P(x_n=1 , (x_m)) + \mathbb P(x_n=-1 , (x_m))}. $$Now, we find that
$$ \mathbb P(x_n , (x_m) ) = \exp\Big( \beta x_n \sum_m J_{n,m} x_m + \beta H x_n \Big) \exp\Big( \text{Other terms not involving } x_n \Big). $$The $1/2$ factor has disappeared because each pair $\{n,m\}$ appears twice in the double sum, and we have collected both contributions. Thus
$$ \mathbb P(x_n=1 | (x_m)) = \frac{e^{\beta (\sum_m J_{n,m} x_m + H)} }{e^{\beta (\sum_m J_{n,m} x_m + H)} + e^{-\beta (\sum_m J_{n,m} x_m + H)}} = \frac{ 1 }{1 + e^{-2\beta (\sum_m J_{n,m} x_m + H)}} = \frac{1}{1+e^{-2\beta b_n}}, $$say, where $b_n = \sum_m J_{n,m} x_m + H$.
So the Gibbs sampler is to loop over each $n$ and, with this probability, set $x_n=1$, otherwise setting it to $-1$. Actually, as in general there will be spatial correlations, it makes more sense to select $n$ at random each time.
An alternative to direct sampling is to perform a Metropolis--Hastings step: consider flipping $x_n$ from its current state. If this decreases the energy then we accept the flip; otherwise we accept it with probability $e^{-\beta \Delta E}$.
In practice there seems to be little difference between the two algorithms.
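A minimal sketch of such a Metropolis--Hastings move, assuming the state is a $\pm1$-valued numpy array with periodic boundaries (the function name and layout here are my own, not part of the class below):

```python
import numpy as np

def metropolis_flip(grid, beta, H=0.0):
    """One Metropolis--Hastings step: propose flipping one random spin."""
    height, width = grid.shape
    x = np.random.randint(height)
    y = np.random.randint(width)
    # Sum of the four neighbours, with periodic boundary conditions.
    nbrs = (grid[(x - 1) % height, y] + grid[(x + 1) % height, y]
            + grid[x, (y - 1) % width] + grid[x, (y + 1) % width])
    # Flipping x_n -> -x_n changes the energy by Delta E = 2 x_n (nbrs + H).
    dE = 2 * grid[x, y] * (nbrs + H)
    if dE <= 0 or np.random.random() < np.exp(-beta * dE):
        grid[x, y] = -grid[x, y]
    return grid
```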
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
We'll work with a 2D rectangular grid with periodic boundary conditions.
class IsingGrid:
    def __init__(self, height, width, extfield, invtemp):
        # Deliberately swapped so self.grid has shape (height, width),
        # which is what imshow expects; x indexes rows, y indexes columns.
        self.width, self.height, self.extfield, self.invtemp = height, width, extfield, invtemp
        self.grid = np.ones([self.width, self.height], dtype=np.int8)

    def plot(self):
        plt.imshow(self.grid, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)

    def make_random(self):
        self.grid = np.random.randint(0, 2, size=(self.width, self.height)).astype(np.int8) * 2 - 1

    def neighbours(self, x, y):
        n = []
        if x == 0:
            n.append((self.width - 1, y))
        else:
            n.append((x - 1, y))
        if x == self.width - 1:
            n.append((0, y))
        else:
            n.append((x + 1, y))
        if y == 0:
            n.append((x, self.height - 1))
        else:
            n.append((x, y - 1))
        if y == self.height - 1:
            n.append((x, 0))
        else:
            n.append((x, y + 1))
        return n

    def local_energy(self, x, y):
        return self.extfield + sum(self.grid[xx, yy] for (xx, yy) in self.neighbours(x, y))

    def total_energy(self):
        # Could maybe do some numpy games here, but periodic boundary conditions make this tricky.
        # This function is only ever useful for very small grids anyway.
        energy = -self.extfield * np.sum(self.grid)
        energy += -sum(self.grid[x, y] * sum(self.grid[xx, yy] for (xx, yy) in self.neighbours(x, y))
                       for x in range(self.width) for y in range(self.height)) / 2
        return energy

    def probability(self):
        return np.exp(-self.invtemp * self.total_energy())

    def gibbs_move(self):
        n = np.random.randint(0, self.width * self.height)
        y = n // self.width
        x = n % self.width
        p = 1 / (1 + np.exp(-2 * self.invtemp * self.local_energy(x, y)))
        self.grid[x, y] = 1 if np.random.random() <= p else -1

    def from_number(self, n):
        """Convert an integer 0 <= n < 2**(width*height) into a grid."""
        N = self.width * self.height
        binstring = bin(n)[2:].zfill(N)
        self.grid = np.array([int(c) * 2 - 1 for c in binstring], dtype=np.int8).reshape(self.width, self.height)

    def to_number(self):
        """Convert grid into an integer."""
        flat = [self.grid[x, y] for x in range(self.width) for y in range(self.height)]
        return sum(2**n * (int(x) + 1) // 2 for n, x in enumerate(reversed(flat)))
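As an aside, the if-chain in `neighbours` could be collapsed with modular arithmetic; a sketch (a hypothetical standalone helper, not used by the class above):

```python
def neighbours_mod(x, y, width, height):
    # Periodic left/right then down/up neighbours, in the same order
    # as the if-chain version.
    return [((x - 1) % width, y), ((x + 1) % width, y),
            (x, (y - 1) % height), (x, (y + 1) % height)]
```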
gg = IsingGrid(30, 50, 0, .3)
gg.make_random()
gg.plot()
for _ in range(100000):
    gg.gibbs_move()
gg.plot()
For a grid with $N$ points, we have $2^N$ states, and so for very small $N$, we can directly sample.
def make_comparison_plot(W, H, field, invtemp, samples=1000000):
    # Generate all possible states and compute each probability.
    gg = IsingGrid(W, H, field, invtemp)
    N = gg.width * gg.height
    prob = []
    for n in range(2**N):
        gg.from_number(n)
        prob.append(gg.probability())
    total = sum(prob)
    prob = [(n, x / total) for n, x in enumerate(prob)]
    prob.sort(key=lambda pair: pair[1])
    # Burn-in
    for _ in range(10000):
        gg.gibbs_move()
    # Sample a large number of states
    count = [0] * (2**N)
    for _ in range(samples):
        gg.gibbs_move()
        count[gg.to_number()] += 1
    total = sum(count)
    count = [x / total for x in count]
    # Plot the exact distribution (line) against the empirical frequencies (dots).
    fig, axes = plt.subplots(figsize=(12, 6))
    axes.set_xlim([0, 2**N])
    _ = axes.plot(range(2**N), [x for _, x in prob], lw=2)
    _ = axes.scatter(range(2**N), [count[n] for n, _ in prob])
make_comparison_plot(3, 3, 0, .1)
make_comparison_plot(3, 3, 1, .1)
Suppose we have a binary image $Z = Z_{x,y}$ (where for convenience, "binary" will mean $\pm 1$ valued) and that the image has had some "noise" added. We'll assume a "bit-flip" model, so the image we see is $Y = Y_{x,y}$ with
\begin{align*} \mathbb P(Y=y | Z=z) &= \prod_{x,y} \begin{cases} q &: y_{x,y} = z_{x,y}, \\ 1-q &: y_{x,y} \not= z_{x,y}, \end{cases} \\&= \prod_{x,y} \begin{cases} q &: y_{x,y}z_{x,y} = 1, \\ 1-q &: y_{x,y}z_{x,y} = -1, \end{cases} \end{align*}thanks to our convention that "binary" means $\pm1$ valued.
Here presumably $q$ is close to $1$. This seems like a hopeless problem unless we make some assumptions about the structure of $Z$. Let's assume that $Z$ is "clumpy", so there is high correlation between each value $Z_{x,y}$ and its neighbours. This allows us to apply the Ising model, as we'll see.
Indeed, with $i=(x,y)$ and again setting $J_{i,j} = 1$ only when $i,j$ are neighbours in a 2D grid, then let us assume that
$$ \mathbb P(Z=z) \propto \exp\Big( \sum_{i,j} J_{i,j} z_i z_j \Big). $$That is, an Ising model with zero external field. We can now apply Bayes rule,
$$ \mathbb P(Z=z|Y=y) \propto \mathbb P(Y=y | Z=z) \mathbb P(Z=z). $$Now, for $a,b\in\{\pm1\}$ we have that
$$ \begin{cases} q &: ab = 1 \\ 1-q &: ab=-1 \end{cases} = q^{(ab+1)/2} (1-q)^{(1-ab)/2} = \Big( \frac{q}{1-q} \Big)^{ab/2} q^{1/2} (1-q)^{1/2}. $$Thus, with $h = \frac12\log(q/(1-q)) = \frac12\operatorname{logit}(q)$,
$$ \mathbb P(Z=z|Y=y) \propto \exp\Big( \sum_{i,j} J_{i,j} z_i z_j + \sum_i h z_i y_i \Big). $$This is exactly an Ising model with a varying external field $(hy_i)$. To get the conventions correct, we set $\beta=2$ and halve $h$.
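As a quick sanity check on the case-by-case identity $q^{(ab+1)/2}(1-q)^{(1-ab)/2} = (q/(1-q))^{ab/2}\sqrt{q(1-q)}$ for $a,b\in\{\pm1\}$ (a throwaway snippet, not part of the main code):

```python
import math

def casewise(a, b, q):
    # The likelihood factor: q if the bits agree, 1 - q otherwise.
    return q if a * b == 1 else 1 - q

def factorised(a, b, q):
    # The closed form (q/(1-q))^{ab/2} * sqrt(q(1-q)).
    return (q / (1 - q)) ** (a * b / 2) * math.sqrt(q * (1 - q))

for q in (0.5, 0.9, 0.99):
    for a in (-1, 1):
        for b in (-1, 1):
            assert abs(casewise(a, b, q) - factorised(a, b, q)) < 1e-12
```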
Finding the MAP exactly means searching over all possible values of $z=(z_i)$, which is an exponentially sized search space. Instead, we could consider using the mean of the posterior, which we can estimate by MCMC methods: if $(x_i)$ are (possibly correlated) samples with the distribution of $X$, then $\mathbb E(X) \approx \frac1n \sum_{i=1}^n x_i$.
So our plan of attack is to re-write our Ising model to allow a varying external field... The following is a slightly quick'n'dirty subclass.
class IsingGridVaryingField(IsingGrid):
    def __init__(self, height, width, extfield, invtemp):
        super().__init__(height, width, 0, invtemp)
        self.vextfield = extfield

    def local_energy(self, x, y):
        return self.vextfield[x, y] + sum(self.grid[xx, yy] for (xx, yy) in self.neighbours(x, y))
I'll use skimage to load our PNG file, but I guess you could use pillow etc.
import skimage.io

image = skimage.io.imread("image_small.png")
image = (image[:, :, 0].astype(int) * 2) - 1  # Black and white, so just grab one RGB channel
fig, axes = plt.subplots(figsize=(10,6))
axes.imshow(image, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)
q = 0.9
# Flip each pixel independently with probability 1 - q.
noise = np.random.random(image.shape) > q
noisy = np.array(image)
noisy[noise] = -noisy[noise]
fig, axes = plt.subplots(figsize=(10,6))
axes.imshow(noisy, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)
def IsingDeNoise(noisy, q, burnin=50000, loops=500000):
    # Halved external field: h/2 where h = logit(q)/2, to match the beta = 2 convention.
    h = 0.25 * np.log(q / (1 - q))
    gg = IsingGridVaryingField(noisy.shape[0], noisy.shape[1], h * noisy, 2)
    gg.grid = np.array(noisy)
    # Burn-in
    for _ in range(burnin):
        gg.gibbs_move()
    # Sample, accumulating a running sum to estimate the posterior mean
    avg = np.zeros_like(noisy).astype(np.float64)
    for _ in range(loops):
        gg.gibbs_move()
        avg += gg.grid
    return avg / loops
avg = IsingDeNoise(noisy, 0.9)
# Threshold the posterior mean to get a binary image.
avg[avg >= 0] = 1
avg[avg < 0] = -1
avg = avg.astype(int)
fig, axes = plt.subplots(ncols=2, figsize=(11,6))
axes[0].imshow(avg, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)
axes[1].imshow(noisy, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)
Notice that this is dependent on a "guessed" value for $q$ (though for the sort of images this will work on, we can probably estimate $q$). Here's what happens for other values...
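One rough way to estimate $q$ (my suggestion, not from the text): since the noise flips each pixel independently with probability $1-q$, the agreement rate between the noisy input and any reasonable reconstruction estimates $q$. A hypothetical helper:

```python
import numpy as np

def estimate_q(noisy, candidate):
    """Estimate q as the fraction of pixels where the noisy image
    agrees with a candidate clean image."""
    return np.mean(noisy == candidate)
```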
avg = IsingDeNoise(noisy, 0.99)
avg[avg >= 0] = 1
avg[avg < 0] = -1
avg = avg.astype(int)
fig, axes = plt.subplots(ncols=2, figsize=(11,6))
axes[0].imshow(avg, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)
axes[1].imshow(noisy, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)
avg = IsingDeNoise(noisy, 0.5)
avg[avg >= 0] = 1
avg[avg < 0] = -1
avg = avg.astype(int)
fig, axes = plt.subplots(ncols=2, figsize=(11,6))
axes[0].imshow(avg, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)
axes[1].imshow(noisy, cmap=cm.gray, aspect="equal", interpolation="none", vmin=-1, vmax=1)