# Manifold Learning with Isomap


This tour explores the Isomap algorithm for manifold learning.

The [Isomap](http://waldron.stanford.edu/~isomap/) algorithm was introduced in

A Global Geometric Framework for Nonlinear Dimensionality Reduction, J. B. Tenenbaum, V. de Silva and J. C. Langford, Science 290 (5500): 2319-2323, 22 December 2000.

In [1]:
from __future__ import division

import numpy as np
import scipy as scp
import pylab as pyl
import matplotlib.pyplot as plt

from nt_toolbox.general import *
from nt_toolbox.signal import *

import warnings
warnings.filterwarnings('ignore')

%matplotlib inline


## Graph Approximation of Manifolds

Manifold learning consists in approximating the parameterization of a manifold that is only known through a point cloud sampled from it.

First we load a simple 3D point cloud, the famous Swiss Roll.

Number of points.

In [2]:
n = 1000


Random positions on the parametric domain.

In [3]:
from numpy import random
x = random.rand(2,n)


Mapping on the manifold.

In [4]:
v = 3*np.pi/2*(.1 + 2*x[0,:])
X  = np.zeros([3,n])
X[1,:] = 20*x[1,:]
X[0,:] = - np.cos(v)*v
X[2,:] = np.sin(v)*v


Parameters for display (marker size, elevation and azimuth of the 3D view).

In [5]:
ms = 200
el = 20; az = -110


Display the point cloud.

In [6]:
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(15,11))
ax = fig.add_subplot(111, projection="3d")  # 3D axes used by the calls below

#swiss roll
ax.scatter(X[0,:], X[1,:], X[2,:], c=plt.cm.jet((X[0,:]**2+X[2,:]**2)/100), s=ms, lw=0, alpha=1)

#params
ax.set_xlim(np.min(X[0,:]),np.max(X[0,:]))
ax.set_ylim(np.min(X[1,:]),np.max(X[1,:]))
ax.set_zlim(np.min(X[2,:]),np.max(X[2,:]))
ax.axis("off")
ax.view_init(elev=el, azim=az)


Compute the pairwise Euclidean distance matrix.

In [7]:
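D1 = np.repeat(np.sum(X**2, 0)[:,np.newaxis], n, 1)
# clamp tiny negative values caused by round-off so that the square root never returns NaN
D1 = np.sqrt(np.maximum(D1 + np.transpose(D1) - 2*np.dot(np.transpose(X), X), 0))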
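The computation above relies on the expansion $\|x_i-x_j\|^2 = \|x_i\|^2 + \|x_j\|^2 - 2\langle x_i,x_j\rangle$. As a sanity check, the following sketch compares this formula with SciPy's `cdist` on a small random point set. The names `P`, `sq`, `G`, `D_check` and `D_ref` are illustrative and not part of the tour.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
P = rng.random((3, 50))            # 50 points in 3D, stored column-wise like X

sq = np.sum(P**2, axis=0)          # squared norms ||x_i||^2
G = P.T @ P                        # Gram matrix <x_i, x_j>
# Clamp tiny negative values caused by round-off before taking the root.
D_check = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * G, 0))

D_ref = cdist(P.T, P.T)            # reference: direct pairwise Euclidean distances
max_err = np.max(np.abs(D_check - D_ref))
print(max_err)
```

The two matrices agree up to floating-point round-off; the largest deviation typically sits on the diagonal, where the expansion subtracts nearly equal numbers.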


Number of nearest neighbors (NN) used to build the graph.

In [8]:
k = 6


Compute the k-NN connectivity.

In [9]:
DNN, NN = np.sort(D1), np.argsort(D1)
NN = NN[:,1:k+1]
DNN = DNN[:,1:k+1]


In [10]:
from scipy import sparse

B = np.tile(np.arange(0,n),(k,1))
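# adjacency matrix of the k-NN graph; the shape is given explicitly so that A is always n x n
A = sparse.coo_matrix((np.ones(k*n),(np.ravel(B, order="F"), np.ravel(NN))), shape=(n,n))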


Weighted adjacency (the metric on the graph).

In [11]:
# same sparsity pattern as A, with Euclidean edge lengths as weights
W = sparse.coo_matrix((np.ravel(DNN),(np.ravel(B, order="F"), np.ravel(NN))), shape=(n,n))
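To illustrate the construction on a case small enough to read by hand, here is a minimal sketch on four points of the real line with one neighbor per point. The data and the names `pts`, `kk`, `nn`, `dnn`, `rows` and `W_toy` are hypothetical and not part of the tour. It also shows that the k-NN relation is not symmetric in general.

```python
import numpy as np
from scipy import sparse

pts = np.array([0.0, 1.0, 2.5, 10.0])       # toy 1D point set (hypothetical data)
m, kk = len(pts), 1                          # number of points, neighbors per point

dist = np.abs(pts[:, None] - pts[None, :])   # pairwise distances
order = np.argsort(dist, axis=1)
nn = order[:, 1:kk + 1]                      # drop column 0 (the point itself)
dnn = np.take_along_axis(dist, nn, axis=1)   # distances to those neighbors

rows = np.repeat(np.arange(m), kk)           # source index of every edge
W_toy = sparse.coo_matrix((dnn.ravel(), (rows, nn.ravel())), shape=(m, m)).toarray()

print(W_toy)
is_symmetric = np.array_equal(W_toy, W_toy.T)
print(is_symmetric)
```

Here the point at 2.5 has the point at 1.0 as its nearest neighbor, but not the other way round, so `W_toy` is not symmetric.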


Display the graph.

In [12]:
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(15,11))
ax = fig.add_subplot(111, projection="3d")  # 3D axes used by the calls below

#swiss roll
ax.scatter(X[0,:], X[1,:], X[2,:], c=plt.cm.jet((X[0,:]**2+X[2,:]**2)/100), s=ms, lw=0, alpha=1)

#graph
I,J,V = sparse.find(A)
xx = np.vstack((X[0,I],X[0,J]))
yy = np.vstack((X[1,I],X[1,J]))
zz = np.vstack((X[2,I],X[2,J]))

for i in range(len(I)):
    ax.plot(xx[:,i], yy[:,i], zz[:,i], color="black")

#params
ax.axis("off")
ax.set_xlim(np.min(X[0,:]),np.max(X[0,:]))
ax.set_ylim(np.min(X[1,:]),np.max(X[1,:]))
ax.set_zlim(np.min(X[2,:]),np.max(X[2,:]))
ax.view_init(elev=el, azim=az)

plt.show()


## Floyd Algorithm to Compute Pairwise Geodesic Distances

A simple algorithm to compute the geodesic distances between all pairs of points on a graph is Floyd's iterative algorithm. Its complexity is $\mathcal O(n^3)$, where $n$ is the number of points. It is thus quite slow for sparse graphs, where running Dijkstra's algorithm from every vertex costs $\mathcal O(n^2\log(n))$.

The Floyd algorithm iterates the following update rule over all pairs $(i,j)$, for $k=1,\dots,n$

$D(i,j) \leftarrow \min(D(i,j), D(i,k)+D(k,j))$,

with the initialization $D(i,j)=W(i,j)$ if $W(i,j)>0$, and $D(i,j)=+\infty$ if $W(i,j)=0$.
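As an illustration of the update rule, here is a minimal sketch on a hand-made graph with four vertices, cross-checked against `scipy.sparse.csgraph.shortest_path`. The graph and the names `W_toy`, `G`, `p` and `G_ref` are hypothetical, and the diagonal is set to zero as an additional assumption so that each vertex is at distance zero from itself.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

# Toy weighted path graph 0 - 1 - 2 - 3 plus a long shortcut 0 - 3
# (hypothetical data); 0 means "no edge".
W_toy = np.array([[0., 1., 0., 5.],
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [5., 0., 1., 0.]])
m = W_toy.shape[0]

# Initialization: edge weights where an edge exists, +inf elsewhere, 0 on the diagonal.
G = np.where(W_toy > 0, W_toy, np.inf)
np.fill_diagonal(G, 0)

# Floyd update: allow paths through intermediate vertex p, for p = 0..m-1.
for p in range(m):
    G = np.minimum(G, G[:, p][:, None] + G[p, :][None, :])

# Cross-check with Dijkstra run from every vertex.
G_ref = shortest_path(W_toy, method="D", directed=False)
print(G)
```

The direct edge between vertices 0 and 3 has length 5, but the path through vertices 1 and 2 has length 3, which is the value both methods return.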

Make the graph symmetric.

In [13]:
D = W.toarray()
D = (D + np.transpose(D))/2.


Initialize the matrix.

In [14]:
D[D == 0] = np.inf  # np.float was removed from recent NumPy releases


Add a connection between each point and itself (zero distance on the diagonal).

In [15]:
# set the diagonal directly: subtracting it would compute inf - inf = NaN
np.fill_diagonal(D, 0)


Exercise 1

Implement the Floyd algorithm to compute the full distance matrix $D$, where $D(i,j)$ is the geodesic distance between points $i$ and $j$.

In [16]:
run -i nt_solutions/shapes_7_isomap/exo1

In [17]:
## Insert your code here.


Find index of vertices that are not connected to the main manifold.

In [18]:
Iremove = np.where(np.isinf(D[:,0]))[0]  # indices not reachable from point 0


Remove Inf remaining values (disconnected components).

In [19]:
D[np.isinf(D)] = 0


## Isomap with Classical Multidimensional Scaling

Isomap performs the dimensionality reduction by applying multidimensional scaling (MDS) to the matrix of geodesic distances.

Please refer to the tour on Bending Invariants for details on classical MDS (strain minimization).
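For readers who do not have that tour at hand, the following sketch shows classical MDS on a toy problem where the answer is known: pairwise distances of random planar points are double-centered, and the two leading eigenvectors recover the points up to a rigid motion. The data and the names `Y`, `Dist`, `J`, `K` and `Y_mds` are hypothetical; this is one standard formulation and not necessarily the one used in the exercise solution.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 30
Y = rng.standard_normal((2, m))                 # ground-truth planar points (toy data)
diff = Y[:, :, None] - Y[:, None, :]
Dist = np.sqrt(np.sum(diff**2, axis=0))         # pairwise Euclidean distances

J = np.eye(m) - np.ones((m, m)) / m             # centering matrix
K = -0.5 * J @ (Dist**2) @ J                    # double-centered Gram matrix

vals, vecs = np.linalg.eigh(K)                  # symmetric eigendecomposition
idx = np.argsort(vals)[::-1][:2]                # indices of the two largest eigenvalues
Y_mds = (vecs[:, idx] * np.sqrt(vals[idx])).T   # embedding, shape (2, m)

# The embedding reproduces the input distances (up to rotation/reflection).
diff2 = Y_mds[:, :, None] - Y_mds[:, None, :]
Dist_mds = np.sqrt(np.sum(diff2**2, axis=0))
err = np.max(np.abs(Dist - Dist_mds))
print(err)
```

In Isomap the Euclidean matrix `Dist` is replaced by the geodesic distance matrix, which is in general not exactly Euclidean, so the recovered distances are then only approximate.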

Exercise 2

Perform classical MDS to compute the 2D flattening.

In [20]:
run -i nt_solutions/shapes_7_isomap/exo2

In [21]:
## Insert your code here.


Redress the points using the eigenvectors of the covariance matrix (PCA correction).

In [22]:
L, U = np.linalg.eig(np.dot(Xstrain, np.transpose(Xstrain))/n)
Xstrain1 = np.dot(np.transpose(U), Xstrain)


Remove problematic points.

In [23]:
Xstrain1[:,Iremove] = np.inf


Display the final result of the dimensionality reduction.

In [24]:
#plot size
plt.figure(figsize = (15,6))

#plot points
plt.scatter(Xstrain1[0,:], Xstrain1[1,:], ms, c=plt.cm.jet((X[0,:]**2+X[2,:]**2)/100), lw=0, alpha=1)

#plot edges
I,J,V = sparse.find(A)
xx = np.vstack((Xstrain1[0,I], Xstrain1[0,J]))
yy = np.vstack((Xstrain1[1,I], Xstrain1[1,J]))

for i in range(len(I)):
    plt.plot(xx[:,i], yy[:,i], color="black")

#params
plt.axis("off")
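finite = np.isfinite(Xstrain1[0,:]) & np.isfinite(Xstrain1[1,:])  # skip removed points (set to inf)
plt.xlim(np.min(Xstrain1[0,finite])-1, np.max(Xstrain1[0,finite])+1)
plt.ylim(np.min(Xstrain1[1,finite])-1, np.max(Xstrain1[1,finite])+1)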

plt.show()