With this notebook, we demonstrate how the Inverse Power Method can be accelerated by shifting the matrix.
Be sure to make a copy!
We start by creating a matrix with known eigenvalues and eigenvectors.
How do we do this? We let $ \Lambda $ be a diagonal matrix with the desired eigenvalues on its diagonal and $ V $ be a random (and hence almost surely nonsingular) matrix whose columns we normalize to length one. Then $ A = V \Lambda V^{-1} $ has the diagonal entries of $ \Lambda $ as its eigenvalues, with the corresponding columns of $ V $ as eigenvectors.
Experiment by changing the eigenvalues! What happens if you make the second entry on the diagonal equal to $ -4 $? Or if you change the $ 2 $ to $ -1 $?
import numpy as np
Lambda = np.matrix( ' 4., 0., 0., 0.;\
                      0., 3., 0., 0.;\
                      0., 0., 2., 0.;\
                      0., 0., 0., 1.' )

lambda0 = Lambda[ 0, 0 ]

V = np.matrix( np.random.rand( 4, 4 ) )

# normalize the columns of V to have length one
for j in range( 0, 4 ):
    V[ :, j ] = V[ :, j ] / np.sqrt( np.transpose( V[ :, j ] ) * V[ :, j ] )
A = V * Lambda * np.linalg.inv( V )
print( 'Lambda = ' )
print( Lambda )
print( 'V = ' )
print( V )
print( 'A = ' )
print( A )
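As a quick sanity check (this check is ours, not part of the original notebook), you can verify numerically that $ A $ indeed has the eigenvalues that were placed on the diagonal of $ \Lambda $:
# the computed eigenvalues of A should match the diagonal of Lambda, up to roundoff
print( np.sort( np.linalg.eigvals( A ) ) )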
The idea is as follows:
The eigenvalues of $ A $ are $ \lambda_0, \ldots, \lambda_3 $ with

$$ \vert \lambda_0 \vert > \vert \lambda_1 \vert > \vert \lambda_2 \vert > \vert \lambda_3 \vert > 0 , $$

and how fast the inverse power method converges depends on the ratio

$$ \left\vert \frac{\lambda_3}{\lambda_2} \right\vert . $$

Now, if you pick a value $ \mu $ close to $ \lambda_3 $ and iterate with $ A - \mu I $ (which is known as shifting the matrix/spectrum by $ \mu $), convergence instead depends on the ratio

$$ \left\vert \frac{\lambda_3 - \mu}{\lambda_2 - \mu} \right\vert , $$

which you can make much smaller. For this matrix, $ \vert \lambda_3 / \lambda_2 \vert = 1/2 $, while with $ \mu = 0.8 $ the shifted ratio is $ \vert 0.2 / 1.2 \vert \approx 0.17 $.
Try different values of $ \mu $! What if you pick $ \mu \approx 2 $? What if you pick $ \mu = 0.8 $?
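Before running the iteration, it helps to see how the choice of $ \mu $ changes the convergence ratio. Here is a small sketch that computes the ratio for a few shifts; the helper shifted_ratio is ours, not part of the original notebook:
import numpy as np

def shifted_ratio( eigs, mu ):
    # magnitudes of the shifted eigenvalues, smallest first
    shifted = np.sort( np.abs( np.array( eigs ) - mu ) )
    # convergence ratio of the shifted inverse power method:
    # |smallest shifted eigenvalue| / |second smallest shifted eigenvalue|
    return shifted[ 0 ] / shifted[ 1 ]

eigs = [ 4., 3., 2., 1. ]
for mu in [ 0.0, 0.8, 1.9 ]:
    print( 'mu =', mu, ' ratio =', shifted_ratio( eigs, mu ) )
With $ \mu = 0.8 $ the ratio drops from $ 0.5 $ (no shift) to about $ 0.17 $, so the iteration converges to $ v_3 $ much faster. With $ \mu \approx 2 $ the eigenvalue closest to $ \mu $ is $ 2 $, so the iteration converges to $ v_2 $ instead.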
# Pick a random starting vector
x = np.matrix( np.random.rand( 4,1 ) )
# We should really compute a factorization of A - mu I, but let's be lazy
# and compute the inverse explicitly
mu = 0.8
Ainv = np.linalg.inv( A - mu * np.eye( 4, 4 ) )
for i in range( 0, 10 ):
    x = Ainv * x

    # normalize x to length one
    x = x / np.sqrt( np.transpose( x ) * x )

    # Notice we compute the Rayleigh quotient with matrix A, not Ainv. This is
    # because the eigenvectors of A are exactly the eigenvectors of
    # ( A - mu I )^{-1}, so the Rayleigh quotient with A estimates an
    # eigenvalue of A directly.
    print( 'Rayleigh quotient with vector x:', np.transpose( x ) * A * x / ( np.transpose( x ) * x ) )
    print( 'inner product of x with v3    :', np.transpose( x ) * V[ :, 3 ] )
    print( ' ' )
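The comment above admits we are being lazy: rather than forming the inverse explicitly, we should factor $ A - \mu I $ once and then solve with the triangular factors in each iteration. A minimal sketch of that approach, assuming SciPy is available (scipy.linalg is not used elsewhere in this notebook):
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# factor A - mu I once, outside the loop
lu, piv = lu_factor( np.asarray( A - mu * np.eye( 4, 4 ) ) )

x = np.matrix( np.random.rand( 4, 1 ) )
for i in range( 0, 10 ):
    # solving ( A - mu I ) x_new = x is the same as x_new = Ainv * x,
    # but reuses the factorization instead of an explicit inverse
    x = np.matrix( lu_solve( ( lu, piv ), np.asarray( x ) ) )
    # normalize x to length one
    x = x / np.sqrt( np.transpose( x ) * x )

print( 'Rayleigh quotient with vector x:', np.transpose( x ) * A * x / ( np.transpose( x ) * x ) )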
In the above, the Rayleigh quotient converges to $ \lambda_3 = 1 $, the eigenvalue of $ A $ closest to $ \mu = 0.8 $, and since both $ x $ and the columns of $ V $ have length one, the inner product of $ x $ with $ v_3 $ converges to $ \pm 1 $.
This time, if you change the "2" on the diagonal to "-1", you still converge to $ v_3 $ (the eigenvector associated with the eigenvalue $ 1 $), because for the matrix $ A - \mu I $ the shifted eigenvalue $ -1 - \mu $ is not as small as $ 1 - \mu $ (in magnitude).
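To see this concretely, here is a quick check of the shifted magnitudes, with the eigenvalue list typed in by hand to match the modified diagonal:
import numpy as np

mu = 0.8
eigs = np.array( [ 4., 3., -1., 1. ] )    # the modified spectrum
shifted = np.sort( np.abs( eigs - mu ) )
print( 'shifted magnitudes:', shifted )    # 0.2 = |1 - mu| is smallest; 1.8 = |-1 - mu| is next
print( 'convergence ratio :', shifted[ 0 ] / shifted[ 1 ] )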
You can check this by looking at $ ( I - V_R ( V_R^T V_R )^{-1} V_R^T ) x $, where $ V_R $ is the matrix with $ v_2 $ and $ v_3 $ as its columns: the component of $ x $ orthogonal to $ {\cal C}( V_R ) $ should converge to zero. This is computed in the following code block:
# component of x orthogonal to the column space of V_R = [ v2, v3 ]
VR = V[ :, 2:4 ]
w = x - VR * np.linalg.inv( np.transpose( VR ) * VR ) * np.transpose( VR ) * x
print( 'Norm of component orthogonal: ', np.linalg.norm( w ) )