Recall the network from last lecture:
We identified edge (B,D) as a good candidate to remove to create two clusters based on betweenness.
Today we'll discuss an alternative clustering approach based on graph cuts.
What makes a good cut?
The cut-set {(H,C)} is smaller than the set {(B,D),(C,G)}.
Why might we not like cut {(H,C)}?
A good cut is:
- small: few edges cross between the two sides, and
- balanced: neither side of the cut is tiny.
How to quantify these?
Vol(S): the volume of a set S of nodes is the number of edges with at least one endpoint in S.
Vol({A,B,C})=6
Let Cut(S,T) be the number of edges in the cut-set of the cut C=(S,T) (the cut size).
The normalized cut value for (S,T) is:
$$NCV(S,T) = \frac{Cut(S,T)}{Vol(S)} + \frac{Cut(S,T)}{Vol(T)}$$

Example:
Consider the cut where S={H} and T={A,B,C,D,E,F,G}.
Example:
Consider the cut where S={A,B,C,H} and T={D,E,F,G}.
So, if one part of the cut has a small volume, the normalized cut value will be large.
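To make the definition concrete, here is a minimal sketch of computing normalized cut values with networkx. The helpers `volume`, `cut_size`, and `ncv` are names introduced here for illustration; they are not part of the lecture code.

import networkx as nx

def volume(graph, S):
    # Number of edges with at least one endpoint in S.
    return sum(1 for u, v in graph.edges() if u in S or v in S)

def cut_size(graph, S, T):
    # Number of edges with one endpoint in S and the other in T.
    return sum(1 for u, v in graph.edges()
               if (u in S and v in T) or (u in T and v in S))

def ncv(graph, S):
    # Normalized cut value of the cut (S, T), where T is every other node.
    T = set(graph.nodes()) - set(S)
    c = cut_size(graph, S, T)
    return c / volume(graph, S) + c / volume(graph, T)

# On the 7-node example graph used later in this notebook:
g = nx.Graph([('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'), ('D', 'E'),
              ('D', 'F'), ('D', 'G'), ('E', 'F'), ('G', 'F')])
print(ncv(g, {'A', 'B', 'C'}))  # 1/4 + 1/6 ≈ 0.417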
Problem: How do we identify the cut with the smallest normalized cut value?
Like most good things, it is NP-hard.
Linear algebra to the rescue!
Below, we describe a way to approximate the optimal cuts using eigenvalue decomposition.
**Representing Graphs with Matrices**
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
import matplotlib.pyplot as plt
import networkx as nx
graph = nx.Graph()
graph.add_edges_from([('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'), ('D', 'E'), ('D', 'F'), ('D', 'G'), ('E', 'F'), ('G', 'F')])
nx.draw(graph, with_labels=True)
# Print the adjacency matrix.
def adjacency_matrix(graph):
    # Order rows/columns by sorted node name: 'A' is row 0, 'B' is row 1, ...
    return nx.adjacency_matrix(graph, sorted(graph.nodes()))
adjacency = adjacency_matrix(graph)
print('Adjacency matrix:\n', adjacency.todense())
Adjacency matrix:
 [[0 1 1 0 0 0 0]
 [1 0 1 1 0 0 0]
 [1 1 0 0 0 0 0]
 [0 1 0 0 1 1 1]
 [0 0 0 1 0 1 0]
 [0 0 0 1 1 0 1]
 [0 0 0 1 0 1 0]]
**What data structure should we use to store the adjacency matrix?**
E.g., list of tuples [(0,1), (0,2), (1,2), ...]
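For a sparse graph most adjacency-matrix entries are 0, so an edge list or a compressed sparse representation takes O(m) space for m edges instead of O(n²) for a dense array. In fact, `nx.adjacency_matrix` already returns a SciPy sparse matrix; a quick check (a sketch, run after the cell above):

# Only the nonzero entries are stored -- essentially a compressed edge list.
print(type(adjacency))   # a scipy.sparse matrix/array type
print(adjacency.nnz)     # 18 stored entries = 2 * 9 edges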
# Print the degree matrix.
import numpy as np
def degree_matrix(graph):
    # Build a diagonal matrix of node degrees, ordered by sorted node
    # name so that rows match the adjacency matrix.
    degrees = dict(graph.degree()).items()
    degrees = sorted(degrees, key=lambda x: x[0])  # sort by node name
    degrees = [d[1] for d in degrees]              # keep just the degree values
    return np.diag(degrees)
degree = degree_matrix(graph)
print('Degree matrix:\n', degree)
Degree matrix:
 [[2 0 0 0 0 0 0]
 [0 3 0 0 0 0 0]
 [0 0 2 0 0 0 0]
 [0 0 0 4 0 0 0]
 [0 0 0 0 2 0 0]
 [0 0 0 0 0 3 0]
 [0 0 0 0 0 0 2]]
Laplacian matrix: $L = D - A$
def laplacian_matrix(graph):
    return degree_matrix(graph) - adjacency_matrix(graph)
laplacian = laplacian_matrix(graph)
print('Laplacian matrix:\n', laplacian)
Laplacian matrix:
 [[ 2 -1 -1  0  0  0  0]
 [-1  3 -1 -1  0  0  0]
 [-1 -1  2  0  0  0  0]
 [ 0 -1  0  4 -1 -1 -1]
 [ 0  0  0 -1  2 -1  0]
 [ 0  0  0 -1 -1  3 -1]
 [ 0  0  0 -1  0 -1  2]]
Properties of the Laplacian matrix:
- L is symmetric.
- Every row (and column) sums to 0.
- L is positive semi-definite: all eigenvalues are real and ≥ 0.
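A quick numerical sanity check of these properties on the Laplacian computed above (a sketch):

L = np.asarray(laplacian)
print((L == L.T).all())        # symmetric: True
print(L.sum(axis=1))           # every row sums to 0
print(np.linalg.eigvalsh(L))   # all eigenvalues are >= 0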
**Recall that a matrix is a way to represent a linear transformation.**
For an arbitrary vector v, what does Lv mean?
# For a smaller graph.
graph2 = nx.Graph()
graph2.add_edges_from([('A', 'B'), ('B', 'C')])
nx.draw(graph2, with_labels=True)
laplacian2 = laplacian_matrix(graph2)
print(laplacian2)
[[ 1 -1  0]
 [-1  2 -1]
 [ 0 -1  1]]
Let v be a map from nodes to real values.
E.g., $v = (f(A), f(B), f(C))$
Then, Lv is:
$$\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} f(A) \\ f(B) \\ f(C) \end{pmatrix} = \begin{pmatrix} f(A) - f(B) \\ -f(A) + 2f(B) - f(C) \\ -f(B) + f(C) \end{pmatrix}$$

More generally:
$$Lv[i] = \deg(i) \cdot \big( f(i) - \text{average of } f \text{ over the neighbors of } i \big)$$
print(laplacian2.dot([1,2,3]))
[[-1 0 1]]
print(laplacian2.dot([3,2,1]))
[[ 1 0 -1]]
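As a sanity check, the formula reproduces the middle entry of the first product above; the dict `f` is just an illustrative name for the map v:

# Lv[B] = deg(B) * (f(B) - average of f over B's neighbors)
f = {'A': 1, 'B': 2, 'C': 3}
nbr_avg = np.mean([f[n] for n in graph2.neighbors('B')])
print(graph2.degree('B') * (f['B'] - nbr_avg))  # 0.0, matching Lv[1] above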
**Review of Linear Algebra**
A vector v of dimension n is an eigenvector of a square ($n \times n$) matrix A if and only if $Av = \lambda v$, where $\lambda$ is a scalar, called an eigenvalue.
In other words, multiplying v by A simply scales v without changing its direction.
Assume A has n linearly independent eigenvectors, $\{v_1, \dots, v_n\}$.
Eigenvector matrix: let V be a square matrix whose column i is $v_i$.
Then A can be factorized as $A = V \Lambda V^{-1}$ where
eigenvalue matrix: $\Lambda$ is a diagonal matrix of eigenvalues, $\Lambda[i,i] = \lambda_i$.
(Cf. Principal Component Analysis.)
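As a quick numerical illustration of this factorization, here is a sketch on a small symmetric matrix (not part of the lecture code):

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, V = np.linalg.eig(A)   # eigenvalues, and eigenvectors as columns
Lambda = np.diag(vals)
print(np.allclose(A, V @ Lambda @ np.linalg.inv(V)))  # True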
What happens when we compute the eigenvalue decomposition of the Laplacian L?
print(laplacian2.dot([1,1,1]))
[[0 0 0]]
So the all-ones vector is an eigenvector of L with eigenvalue 0, since every row of L sums to 0. What about the second eigenvector?
# Library to compute eigenvectors:
from numpy.linalg import eigh
help(eigh)
Help on function eigh in module numpy.linalg:

eigh(a, UPLO='L')
    Return the eigenvalues and eigenvectors of a complex Hermitian
    (conjugate symmetric) or a real symmetric matrix.

    Returns two objects, a 1-D array containing the eigenvalues of `a`, and
    a 2-D square array or matrix (depending on the input type) of the
    corresponding eigenvectors (in columns).

    Parameters
    ----------
    a : (..., M, M) array
        Hermitian or real symmetric matrices whose eigenvalues and
        eigenvectors are to be computed.
    UPLO : {'L', 'U'}, optional
        Specifies whether the calculation is done with the lower triangular
        part of `a` ('L', default) or the upper triangular part ('U').

    Returns
    -------
    w : (..., M) ndarray
        The eigenvalues in ascending order, each repeated according to
        its multiplicity.
    v : {(..., M, M) ndarray, (..., M, M) matrix}
        The column ``v[:, i]`` is the normalized eigenvector corresponding
        to the eigenvalue ``w[i]``. Will return a matrix object if `a` is
        a matrix object.
    ...
eig_vals, eig_vectors = eigh(laplacian)
# round them for prettier printing
eig_vals = np.round(eig_vals, 2)
eig_vectors = np.round(eig_vectors, 2)
print('eigen values\n', eig_vals)
print('eigen vectors\n',np.round(eig_vectors, 2))
eigen values
 [-0.    0.4   2.    3.    3.34  4.    5.26]
eigen vectors
 [[ 0.38  0.49 -0.   -0.71 -0.32  0.   -0.11]
 [ 0.38  0.3   0.    0.    0.75 -0.    0.45]
 [ 0.38  0.49 -0.    0.71 -0.32  0.   -0.11]
 [ 0.38 -0.21  0.    0.    0.39 -0.   -0.81]
 [ 0.38 -0.36  0.71 -0.   -0.17 -0.41  0.19]
 [ 0.38 -0.36 -0.   -0.   -0.17  0.82  0.19]
 [ 0.38 -0.36 -0.71 -0.   -0.17 -0.41  0.19]]
print('first eigen value=%g' % eig_vals[0])
print('first (normalized) eigen vector=\n%s' % eig_vectors[:,0])
first eigen value=-0
first (normalized) eigen vector=
[0.38 0.38 0.38 0.38 0.38 0.38 0.38]
Eigenvectors have been normalized to unit length:

$$v = \frac{v'}{\|v'\|}$$

v = np.array([1, 1, 1, 1, 1, 1, 1])
v / (np.sqrt(np.sum(v**2)))
array([0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447])
print('second eigen value=%.2f' % eig_vals[1])
print('second eigen vector=\n%s' % eig_vectors[:,1])
nx.draw(graph, with_labels=True)
second eigen value=0.40
second eigen vector=
[ 0.49  0.3   0.49 -0.21 -0.36 -0.36 -0.36]
What do you notice about the second eigenvector?
(Note that the first vector component corresponds to node 'A', the second to 'B', etc.)
We can think of the eigenvector as assigning each node to a cluster.
E.g., nodes with negative values are in cluster 1, nodes with positive values are in cluster 2.
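A minimal sketch of this sign-based split, using the eigenvectors computed above (`fiedler` is a name introduced here; the second eigenvector is often called the Fiedler vector):

nodes = sorted(graph.nodes())
fiedler = eig_vectors[:, 1]                        # second eigenvector
cluster1 = [n for n, x in zip(nodes, fiedler) if x >= 0]
cluster2 = [n for n, x in zip(nodes, fiedler) if x < 0]
print(cluster1, cluster2)   # ['A', 'B', 'C'] ['D', 'E', 'F', 'G']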
From eigenvalues to clustering

Method 1: split on the sign of the second eigenvector: nodes with positive entries go in one cluster, nodes with negative entries in the other (as in the sketch above).

Method 2:
Partitioning into more than 2 components?
An interesting property of the second smallest eigenvalue: it is a measure of how densely connected the graph is.
The second smallest eigenvalue grows as the density of the graph increases.
See https://samidavies.wordpress.com/2016/09/20/whats-up-with-the-graph-laplacian/
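networkx can compute this quantity, known as the algebraic connectivity, directly (assuming SciPy is installed); a small sketch comparing our sparse example graph with the dense complete graph $K_7$:

print(nx.algebraic_connectivity(graph))                 # ~0.40, matching eig_vals[1]
print(nx.algebraic_connectivity(nx.complete_graph(7)))  # 7.0 for the dense K7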
Second eigenvector indicates which vertices are in the dense parts of the graph.
| A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|
| 0.49 | 0.3 | 0.49 | -0.21 | -0.36 | -0.36 | -0.36 |
nx.draw(graph, with_labels=True)
print('third eigen value=%.2f' % eig_vals[2])
print('third eigen vector=\n%s' % eig_vectors[:,2])
third eigen value=2.00
third eigen vector=
[-0.    0.   -0.    0.    0.71 -0.   -0.71]
$v_1$: {A,B,C}, {D,E,F,G}
$v_2$: {A,B,D,E,F}, {C,G}
print('fourth eigen value=%.2f' % eig_vals[3])
print('fourth eigen vector=\n%s' % eig_vectors[:,3])
fourth eigen value=3.00
fourth eigen vector=
[-0.71  0.    0.71  0.   -0.   -0.   -0.  ]
$v_1$: {A,B,C}, {D,E,F,G}
$v_2$: {A,B,D,E,F}, {C,G}
$v_3$: {A,E,F,G}, {B,C,D}
Method 3: treat row i of the eigenvector matrix as a low-dimensional vector representation of node i, then cluster nodes whose vectors are close together (illustrated below).
# get first row of eigenvector matrix.
# this is a vector to represent node A
eig_vectors[0,:]
array([ 0.38, 0.49, -0. , -0.71, -0.32, 0. , -0.11])
# get second row of eigenvector matrix.
# this is a vector to represent node B
eig_vectors[1,:]
array([ 0.38, 0.3 , 0. , 0. , 0.75, -0. , 0.45])
# get third row of eigenvector matrix.
# this is a vector to represent node C
eig_vectors[2,:]
array([ 0.38, 0.49, -0. , 0.71, -0.32, 0. , -0.11])
# plot each node using its 2nd and 3rd eigenvector values
def plot_by_eigenvectors(eig_vectors):
    plt.figure()
    for i, name in enumerate(['A', 'B', 'C', 'D', 'E', 'F', 'G']):
        v1 = eig_vectors[i, 1]  # second eigenvector value
        v2 = eig_vectors[i, 2]  # third eigenvector value
        plt.annotate(name, xy=(v1, v2))
        print(name, v1, v2)
    plt.xlim(-.5, .5)
    plt.ylim(-.8, .8)
    plt.xlabel('second eigenvector value')
    plt.ylabel('third eigenvector value')
    plt.show()

plot_by_eigenvectors(eig_vectors)
A 0.49 -0.0
B 0.3 0.0
C 0.49 -0.0
D -0.21 0.0
E -0.36 0.71
F -0.36 -0.0
G -0.36 -0.71
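The plot separates {A,B,C} from {D,E,F,G} along the x-axis (and E from G along the y-axis). To automate this last step we could cluster the embedded points; here is a sketch using SciPy's k-means, which is not part of the lecture code:

from scipy.cluster.vq import kmeans2

points = eig_vectors[:, 1:3]   # each row: (2nd, 3rd) eigenvector values
np.random.seed(0)              # reproducible initialization
centroids, labels = kmeans2(points, 2, minit='++')
print(labels)                  # typically: A,B,C share one label; D,E,F,G the other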