#!/usr/bin/env python
# coding: utf-8

# > This is one of the 100 recipes of the [IPython Cookbook](http://ipython-books.github.io/), the definitive guide to high-performance scientific computing and data science in Python.
# 

# # 9.2. Minimizing a mathematical function

# 1. We import the libraries.

# In[ ]:


import numpy as np
import scipy as sp
import scipy.optimize as opt
import matplotlib.pyplot as plt
get_ipython().run_line_magic('matplotlib', 'inline')


# 2. First, let's define a simple mathematical function (the opposite of the **cardinal sine**). This function has many local minima but a single global minimum. (http://en.wikipedia.org/wiki/Sinc_function)

# In[ ]:


f = lambda _: 1-np.sin(_)/_


# 3. Let's plot this function on the interval $[-20, 20]$.

# In[ ]:


x = np.linspace(-20., 20., 1000)
y = f(x)


# In[ ]:


plt.figure(figsize=(5,5));
plt.plot(x, y);


# 4. The `scipy.optimize` module comes with many function minimization routines. The `minimize` function offers a unified interface to many algorithms. The **Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm** (default in `minimize`) gives good results in general. The `minimize` function requires an initial point as argument. For scalar univariate functions, we can also use `minimize_scalar`.

# In[ ]:


x0 = 3
xmin = opt.minimize(f, x0).x


# Starting from $x_0=3$, the algorithm was able to find the actual global minimum, as shown on the following figure.

# In[ ]:


plt.figure(figsize=(5,5));
plt.plot(x, y);
plt.scatter(x0, f(x0), marker='o', s=300);
plt.scatter(xmin, f(xmin), marker='v', s=300);
plt.xlim(-20, 20);


# 5. Now, if we start from an initial point that is further away from the actual global minimum, the algorithm converges towards a *local* minimum only.

# In[ ]:


x0 = 10
xmin = opt.minimize(f, x0).x


# In[ ]:


plt.figure(figsize=(5,5));
plt.plot(x, y);
plt.scatter(x0, f(x0), marker='o', s=300);
plt.scatter(xmin, f(xmin), marker='v', s=300);
plt.xlim(-20, 20);


# 6. Like most function minimization algorithms, the BFGS algorithm is efficient at finding *local* minima, but not necessarily *global* minima, especially on complicated or noisy objective functions. A general strategy to overcome this problem consists in combining such algorithms with an exploratory grid search on the initial points. Another option is to use a different class of algorithms based on heuristics and stochastic methods. A popular example is the **simulated annealing method**.

# In[ ]:


xmin = opt.minimize(f, x0, method='Anneal').x


# In[ ]:


plt.figure(figsize=(5,5));
plt.plot(x, y);
plt.scatter(x0, f(x0), marker='o', s=300);
plt.scatter(xmin, f(xmin), marker='v', s=300);
plt.xlim(-20, 20);


# This time, the algorithm was able to find the global minimum.

# 7. Now, let's define a new function, in two dimensions this time. This function is called the **Lévi function**. It is defined by
# 
# $$f(x,y) = \sin^{2}\left(3\pi x\right)+\left(x-1\right)^{2}\left(1+\sin^{2}\left(3\pi y\right)\right)+\left(y-1\right)^{2}\left(1+\sin^{2}\left(2\pi y\right)\right)$$
# 
# This function is very irregular and may be difficult to minimize in general. It is one of the many **test functions for optimization** that researchers have developed to study and benchmark optimization algorithms. (http://en.wikipedia.org/wiki/Test_functions_for_optimization)

# In[ ]:


def g(X):
    # X is a 2*N matrix, each column contains
    # x and y coordinates.
    x, y = X
    return np.sin(3*np.pi*x)**2+(x-1)**2*(1+np.sin(3*np.pi*y)**2)+(y-1)**2*(1+np.sin(2*np.pi*y)**2)


# 8. Let's display this function with `imshow`, on the square $[-10,10]^2$.

# In[ ]:


n = 200
k = 10
X, Y = np.mgrid[-k:k:n*1j,-k:k:n*1j]


# In[ ]:


Z = g(np.vstack((X.ravel(), Y.ravel()))).reshape(n,n)


# In[ ]:


plt.figure(figsize=(5, 5));
# We use a logarithmic scale for the color here.
plt.imshow(np.log(Z), cmap=plt.cm.hot_r);
plt.xticks([]); plt.yticks([]);


# 9. The BFGS algorithm also works in multiple dimensions.

# In[ ]:


x0, y0 = opt.minimize(g, (8, 3)).x


# In[ ]:


plt.figure(figsize=(5, 5));
plt.imshow(np.log(Z), cmap=plt.cm.hot_r,
           extent=(-k, k, -k, k), origin=0);
plt.scatter(x0, y0, s=100);
plt.xticks([]); plt.yticks([]);


# > You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).
# 
# > [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014 (500 pages).