Recall the network from last lecture:
We identified edge (B,D) as a good candidate to remove to create two clusters based on betweenness.
Today we'll discuss an alternative clustering approach based on graph cuts.
What makes a good cut?
The cut-set {(H,C)} is smaller than the set {(B,D),(C,G)}.
Why might we not like cut {(H,C)}?
A good cut is:
- small: few edges cross between the two sides, and
- balanced: neither side of the cut is tiny.
How to quantify these?
Vol(S): the volume of a set S of nodes is the number of edges with at least one endpoint in S.
Vol({A,B,C})=6
Let Cut(S,T) be the number of edges in the cut-set of the cut C=(S,T) (the cut size).
The normalized cut value for (S,T) is:
$$NCV(S,T) = \frac{Cut(S,T)}{Vol(S)} + \frac{Cut(S,T)}{Vol(T)}$$

Example:
Consider the cut where S={H} and T={A,B,C,D,E,F,G}.
Example:
Consider the cut where S={A,B,C,H} and T={D,E,F,G}.
So, if one part of the cut has a small volume, the normalized cut value will be large.
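To make the definition concrete, here is a minimal sketch of computing normalized cut values with networkx. The helpers `volume`, `cut_size`, and `ncv` are names introduced here for illustration; they are not part of the lecture code.

import networkx as nx

def volume(graph, S):
    # Number of edges with at least one endpoint in S.
    return sum(1 for u, v in graph.edges() if u in S or v in S)

def cut_size(graph, S, T):
    # Number of edges with one endpoint in S and the other in T.
    return sum(1 for u, v in graph.edges()
               if (u in S and v in T) or (u in T and v in S))

def ncv(graph, S):
    # Normalized cut value of the cut (S, T), where T is every other node.
    T = set(graph.nodes()) - set(S)
    c = cut_size(graph, S, T)
    return c / volume(graph, S) + c / volume(graph, T)

# On the 7-node example graph used later in this notebook:
g = nx.Graph([('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'), ('D', 'E'),
              ('D', 'F'), ('D', 'G'), ('E', 'F'), ('G', 'F')])
print(ncv(g, {'A', 'B', 'C'}))  # 1/4 + 1/6 ≈ 0.417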
Problem: How do we identify the cut with the smallest normalized cut value?
Like most good things, it is NP-hard.
Linear algebra to the rescue!
Below, we describe a way to approximate the optimal cuts using eigenvalue decomposition.
**Representing Graphs with Matrices**
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
import matplotlib.pyplot as plt
import networkx as nx
graph = nx.Graph()
graph.add_edges_from([('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'), ('D', 'E'), ('D', 'F'), ('D', 'G'), ('E', 'F'), ('G', 'F')])
nx.draw(graph, with_labels=True)
# Print the adjacency matrix.
def adjacency_matrix(graph):
    # Order rows/columns by sorted node name: 'A' is row 0, 'B' is row 1, ...
    return nx.adjacency_matrix(graph, sorted(graph.nodes()))
adjacency = adjacency_matrix(graph)
print('Adjacency matrix:\n', adjacency.todense())
Adjacency matrix:
 [[0 1 1 0 0 0 0]
 [1 0 1 1 0 0 0]
 [1 1 0 0 0 0 0]
 [0 1 0 0 1 1 1]
 [0 0 0 1 0 1 0]
 [0 0 0 1 1 0 1]
 [0 0 0 1 0 1 0]]
**What data structure should we use to store the adjacency matrix?**
E.g., list of tuples [(0,1), (0,2), (1,2), ...]
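For a sparse graph most adjacency-matrix entries are 0, so an edge list or a compressed sparse representation takes O(m) space for m edges instead of O(n²) for a dense array. In fact, `nx.adjacency_matrix` already returns a SciPy sparse matrix; a quick check (a sketch, run after the cell above):

# Only the nonzero entries are stored -- essentially a compressed edge list.
print(type(adjacency))   # a scipy.sparse matrix/array type
print(adjacency.nnz)     # 18 stored entries = 2 * 9 edges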
# Print the degree matrix.
import numpy as np
def degree_matrix(graph):
    # Build a diagonal matrix of node degrees, ordered by sorted node
    # name so that rows match the adjacency matrix.
    degrees = dict(graph.degree()).items()
    degrees = sorted(degrees, key=lambda x: x[0])  # sort by node name
    degrees = [d[1] for d in degrees]              # keep just the degree values
    return np.diag(degrees)
degree = degree_matrix(graph)
print('Degree matrix:\n', degree)
Degree matrix:
 [[2 0 0 0 0 0 0]
 [0 3 0 0 0 0 0]
 [0 0 2 0 0 0 0]
 [0 0 0 4 0 0 0]
 [0 0 0 0 2 0 0]
 [0 0 0 0 0 3 0]
 [0 0 0 0 0 0 2]]
Laplacian matrix: $L = D - A$
def laplacian_matrix(graph):
    return degree_matrix(graph) - adjacency_matrix(graph)
laplacian = laplacian_matrix(graph)
print('Laplacian matrix:\n', laplacian)
Laplacian matrix:
 [[ 2 -1 -1  0  0  0  0]
 [-1  3 -1 -1  0  0  0]
 [-1 -1  2  0  0  0  0]
 [ 0 -1  0  4 -1 -1 -1]
 [ 0  0  0 -1  2 -1  0]
 [ 0  0  0 -1 -1  3 -1]
 [ 0  0  0 -1  0 -1  2]]
Properties of the Laplacian matrix:
- L is symmetric.
- Every row (and column) sums to 0.
- L is positive semi-definite: all eigenvalues are real and ≥ 0.
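A quick numerical sanity check of these properties on the Laplacian computed above (a sketch):

L = np.asarray(laplacian)
print((L == L.T).all())        # symmetric: True
print(L.sum(axis=1))           # every row sums to 0
print(np.linalg.eigvalsh(L))   # all eigenvalues are >= 0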
**Recall that a matrix is a way to represent a linear transformation.**
For an arbitrary vector v, what does Lv mean?
# For a smaller graph.
graph2 = nx.Graph()
graph2.add_edges_from([('A', 'B'), ('B', 'C')])
nx.draw(graph2, with_labels=True)
laplacian2 = laplacian_matrix(graph2)
print(laplacian2)
[[ 1 -1  0]
 [-1  2 -1]
 [ 0 -1  1]]
Let v be a map from nodes to real values.
E.g., $v = (f(A), f(B), f(C))$
Then, Lv is:
$$\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} f(A) \\ f(B) \\ f(C) \end{pmatrix} = \begin{pmatrix} f(A) - f(B) \\ -f(A) + 2f(B) - f(C) \\ -f(B) + f(C) \end{pmatrix}$$

More generally:
$$Lv[i] = \deg(i) \cdot \big( f(i) - \text{average of } f \text{ over the neighbors of } i \big)$$
print(laplacian2.dot([1,2,3]))
[[-1 0 1]]
print(laplacian2.dot([3,2,1]))
[[ 1 0 -1]]
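As a sanity check, the formula reproduces the middle entry of the first product above; the dict `f` is just an illustrative name for the map v:

# Lv[B] = deg(B) * (f(B) - average of f over B's neighbors)
f = {'A': 1, 'B': 2, 'C': 3}
nbr_avg = np.mean([f[n] for n in graph2.neighbors('B')])
print(graph2.degree('B') * (f['B'] - nbr_avg))  # 0.0, matching Lv[1] above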
**Review of Linear Algebra**
A vector v of dimension n is an eigenvector of a square ($n \times n$) matrix A if and only if $Av = \lambda v$, where $\lambda$ is a scalar, called an eigenvalue.
In other words, multiplying v by A simply scales v without changing its direction.
Assume A has n linearly independent eigenvectors, $\{v_1, \dots, v_n\}$.
Eigenvector matrix: let V be a square matrix whose column i is $v_i$.
Then A can be factorized as $A = V \Lambda V^{-1}$ where
eigenvalue matrix: $\Lambda$ is a diagonal matrix of eigenvalues, $\Lambda[i,i] = \lambda_i$.
(Cf. Principal Component Analysis.)
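As a quick numerical illustration of this factorization, here is a sketch on a small symmetric matrix (not part of the lecture code):

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, V = np.linalg.eig(A)   # eigenvalues, and eigenvectors as columns
Lambda = np.diag(vals)
print(np.allclose(A, V @ Lambda @ np.linalg.inv(V)))  # True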
What happens when we compute the eigenvalue decomposition of the Laplacian L?
print(laplacian2.dot([1,1,1]))
[[0 0 0]]
So the all-ones vector is an eigenvector of L with eigenvalue 0, since every row of L sums to 0. What about the second eigenvector?
# Library to compute eigenvectors:
from numpy.linalg import eigh
help(eigh)
Help on function eigh in module numpy.linalg:

eigh(a, UPLO='L')
    Return the eigenvalues and eigenvectors of a complex Hermitian
    (conjugate symmetric) or a real symmetric matrix.

    Returns two objects, a 1-D array containing the eigenvalues of `a`, and
    a 2-D square array or matrix (depending on the input type) of the
    corresponding eigenvectors (in columns).

    Parameters
    ----------
    a : (..., M, M) array
        Hermitian or real symmetric matrices whose eigenvalues and
        eigenvectors are to be computed.
    UPLO : {'L', 'U'}, optional
        Specifies whether the calculation is done with the lower triangular
        part of `a` ('L', default) or the upper triangular part ('U').

    Returns
    -------
    w : (..., M) ndarray
        The eigenvalues in ascending order, each repeated according to
        its multiplicity.
    v : {(..., M, M) ndarray, (..., M, M) matrix}
        The column ``v[:, i]`` is the normalized eigenvector corresponding
        to the eigenvalue ``w[i]``. Will return a matrix object if `a` is
        a matrix object.
    ...
eig_vals, eig_vectors = eigh(laplacian)
# round them for prettier printing
eig_vals = np.round(eig_vals, 2)
eig_vectors = np.round(eig_vectors, 2)
print('eigen values\n', eig_vals)
print('eigen vectors\n',np.round(eig_vectors, 2))
eigen values
 [-0.    0.4   2.    3.    3.34  4.    5.26]
eigen vectors
 [[ 0.38  0.49 -0.   -0.71 -0.32  0.   -0.11]
 [ 0.38  0.3   0.    0.    0.75 -0.    0.45]
 [ 0.38  0.49 -0.    0.71 -0.32  0.   -0.11]
 [ 0.38 -0.21  0.    0.    0.39 -0.   -0.81]
 [ 0.38 -0.36  0.71 -0.   -0.17 -0.41  0.19]
 [ 0.38 -0.36 -0.   -0.   -0.17  0.82  0.19]
 [ 0.38 -0.36 -0.71 -0.   -0.17 -0.41  0.19]]
print('first eigen value=%g' % eig_vals[0])
print('first (normalized) eigen vector=\n%s' % eig_vectors[:,0])
first eigen value=-0
first (normalized) eigen vector=
[0.38 0.38 0.38 0.38 0.38 0.38 0.38]
Eigenvectors have been normalized to unit length:

$$v = \frac{v'}{\|v'\|}$$

v = np.array([1, 1, 1, 1, 1, 1, 1])
v / (np.sqrt(np.sum(v**2)))
array([0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447])
print('second eigen value=%.2f' % eig_vals[1])
print('second eigen vector=\n%s' % eig_vectors[:,1])
nx.draw(graph, with_labels=True)
second eigen value=0.40
second eigen vector=
[ 0.49  0.3   0.49 -0.21 -0.36 -0.36 -0.36]
What do you notice about the second eigenvector?
(Note that the first vector component corresponds to node 'A', the second to 'B', etc.)
We can think of the eigenvector as assigning each node to a cluster.
E.g., nodes with negative values are in cluster 1, nodes with positive values are in cluster 2.
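A minimal sketch of this sign-based split, using the eigenvectors computed above (`fiedler` is a name introduced here; the second eigenvector is often called the Fiedler vector):

nodes = sorted(graph.nodes())
fiedler = eig_vectors[:, 1]                        # second eigenvector
cluster1 = [n for n, x in zip(nodes, fiedler) if x >= 0]
cluster2 = [n for n, x in zip(nodes, fiedler) if x < 0]
print(cluster1, cluster2)   # ['A', 'B', 'C'] ['D', 'E', 'F', 'G']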
From eigenvalues to clustering

Method 1: split on the sign of the second eigenvector: nodes with positive entries go in one cluster, nodes with negative entries in the other (as in the sketch above).

Method 2:
Partitioning into more than 2 components?
An interesting property of the second smallest eigenvalue: it is a measure of how densely connected the graph is.
The second smallest eigenvalue grows as the density of the graph increases.
See https://samidavies.wordpress.com/2016/09/20/whats-up-with-the-graph-laplacian/
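networkx can compute this quantity, known as the algebraic connectivity, directly (assuming SciPy is installed); a small sketch comparing our sparse example graph with the dense complete graph $K_7$:

print(nx.algebraic_connectivity(graph))                 # ~0.40, matching eig_vals[1]
print(nx.algebraic_connectivity(nx.complete_graph(7)))  # 7.0 for the dense K7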
Second eigenvector indicates which vertices are in the dense parts of the graph.
| A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|
| 0.49 | 0.3 | 0.49 | -0.21 | -0.36 | -0.36 | -0.36 |
nx.draw(graph, with_labels=True)
print('third eigen value=%.2f' % eig_vals[2])
print('third eigen vector=\n%s' % eig_vectors[:,2])
third eigen value=2.00
third eigen vector=
[-0.    0.   -0.    0.    0.71 -0.   -0.71]
$v_1$: {A,B,C}, {D,E,F,G}
$v_2$: {A,B,D,E,F}, {C,G}
print('fourth eigen value=%.2f' % eig_vals[3])
print('fourth eigen vector=\n%s' % eig_vectors[:,3])
fourth eigen value=3.00
fourth eigen vector=
[-0.71  0.    0.71  0.   -0.   -0.   -0.  ]
$v_1$: {A,B,C}, {D,E,F,G}
$v_2$: {A,B,D,E,F}, {C,G}
$v_3$: {A,E,F,G}, {B,C,D}
Method 3: treat row i of the eigenvector matrix as a low-dimensional vector representation of node i, then cluster nodes whose vectors are close together (illustrated below).
# get first row of eigenvector matrix.
# this is a vector to represent node A
eig_vectors[0,:]
array([ 0.38, 0.49, -0. , -0.71, -0.32, 0. , -0.11])
# get second row of eigenvector matrix.
# this is a vector to represent node B
eig_vectors[1,:]
array([ 0.38, 0.3 , 0. , 0. , 0.75, -0. , 0.45])
# get third row of eigenvector matrix.
# this is a vector to represent node C
eig_vectors[2,:]
array([ 0.38, 0.49, -0. , 0.71, -0.32, 0. , -0.11])
# plot each node using its 2nd and 3rd eigenvector values
def plot_by_eigenvectors(eig_vectors):
    plt.figure()
    for i, name in enumerate(['A', 'B', 'C', 'D', 'E', 'F', 'G']):
        v1 = eig_vectors[i, 1]  # second eigenvector value
        v2 = eig_vectors[i, 2]  # third eigenvector value
        plt.annotate(name, xy=(v1, v2))
        print(name, v1, v2)
    plt.xlim(-.5, .5)
    plt.ylim(-.8, .8)
    plt.xlabel('second eigenvector value')
    plt.ylabel('third eigenvector value')
    plt.show()

plot_by_eigenvectors(eig_vectors)
A 0.49 -0.0
B 0.3 0.0
C 0.49 -0.0
D -0.21 0.0
E -0.36 0.71
F -0.36 -0.0
G -0.36 -0.71
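The plot separates {A,B,C} from {D,E,F,G} along the x-axis (and E from G along the y-axis). To automate this last step we could cluster the embedded points; here is a sketch using SciPy's k-means, which is not part of the lecture code:

from scipy.cluster.vq import kmeans2

points = eig_vectors[:, 1:3]   # each row: (2nd, 3rd) eigenvector values
np.random.seed(0)              # reproducible initialization
centroids, labels = kmeans2(points, 2, minit='++')
print(labels)                  # typically: A,B,C share one label; D,E,F,G the other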