Description: An example of implementing a derivative-based nonlinear solver in Julia and hooking it into MathProgBase. MathProgBase connects solvers to modeling interfaces like JuMP and AMPL, which provide automatic computation of exact first- and second-order derivatives.

Author: Miles Lubin

License: This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Newton's method in less than 50 lines

In this notebook, we demonstrate how to implement Newton's method by querying derivatives through the MathProgBase nonlinear interface. We then show how to use this new solver from both JuMP and AMPL.

In [1]:
using MathProgBase

First we define the NewtonSolver object, which holds solver options (here there are none). The solver object is used to instantiate a NewtonData object, which then solves a particular instance of the optimization problem. The NewtonData type is a subtype of AbstractNonlinearModel and stores all the instance data we need to run the algorithm.

In [2]:
type NewtonSolver <: MathProgBase.AbstractMathProgSolver
end

type NewtonData <: MathProgBase.AbstractNonlinearModel
    numVar::Int
    d # NLP evaluator
    x::Vector{Float64}
    hess_I::Vector{Int} # 1st component of Hessian sparsity pattern
    hess_J::Vector{Int} # 2nd component of Hessian sparsity pattern
    status::Symbol
end

MathProgBase.NonlinearModel(solver::NewtonSolver) = NewtonData(0,nothing,Float64[],Int[],Int[],:Uninitialized);

This method is called when we're about to solve the optimization problem. It provides the basic problem dimensions, the variable and constraint bound vectors, and the optimization sense, as well as the AbstractNLPEvaluator object, an oracle that the solver can query for function and derivative evaluations.

In [3]:
function MathProgBase.loadproblem!(m::NewtonData, numVar, numConstr, l, u, lb, ub, sense, d::MathProgBase.AbstractNLPEvaluator)
    @assert numConstr == 0                    # we don't handle constraints
    @assert all(l .== -Inf) && all(u .== Inf) # or variable bounds
    @assert sense == :Min                     # or maximization
    
    MathProgBase.initialize(d, [:Grad, :Hess]) # request gradient and hessian evaluations
    
    m.d = d
    m.numVar = numVar
    m.x = zeros(numVar)
    m.hess_I, m.hess_J = MathProgBase.hesslag_structure(d)
end;

And here's the actual implementation of Newton's method. We assume:

  • The domain of the objective function is $\mathbb{R}^n$
  • The objective function is strictly convex, with a Hessian matrix that is positive definite at every point.

We also fix the step size to 1 and do not perform a line search.
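
Concretely, with a unit step size and no line search, each iteration solves a linear system with the Hessian and takes the full Newton step,

$$x_{k+1} = x_k - \left[\nabla^2 f(x_k)\right]^{-1} \nabla f(x_k),$$

terminating once the gradient norm $\|\nabla f(x_k)\|$ drops below $10^{-5}$.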

In [4]:
function MathProgBase.optimize!(m::NewtonData)
    
    iteration = 0

    ∇f = Array(Float64,m.numVar)
    hess_val = Array(Float64, length(m.hess_I)) # array to store nonzeros in Hessian
    MathProgBase.eval_grad_f(m.d, ∇f, m.x) # writes gradient to ∇f vector
    ∇f_norm = norm(∇f)
    while ∇f_norm > 1e-5
        println("Iteration $iteration. Gradient norm $∇f_norm, x = $(m.x)");
        
        # Evaluate the Hessian of the Lagrangian.
        # We don't have any constraints, so this is just the Hessian.
        MathProgBase.eval_hesslag(m.d, hess_val, m.x, 1.0, Float64[])
        
        # The derivative evaluator provides the Hessian matrix
        # in lower-triangular sparse (i,j,value) triplet format,
        # so now we convert this into a Julia symmetric sparse matrix object.
        H = Symmetric(sparse(m.hess_I,m.hess_J,hess_val),:L)
        
        m.x = m.x - H\∇f # newton step
        
        MathProgBase.eval_grad_f(m.d, ∇f, m.x)
        ∇f_norm = norm(∇f)
        iteration += 1
    end
    
    println("Iteration $iteration. Gradient norm $∇f_norm, x = $(m.x), DONE");
    
    m.status = :Optimal  
end;

We implement a few more methods to set the starting solution and to query the solution status, the solution itself, and the objective value.

In [5]:
MathProgBase.setwarmstart!(m::NewtonData,x) = (m.x = x)
MathProgBase.status(m::NewtonData) = m.status
MathProgBase.getsolution(m::NewtonData) = m.x
MathProgBase.getconstrduals(m::NewtonData) = [] # lagrange multipliers on constraints (none)
MathProgBase.getreducedcosts(m::NewtonData) = zeros(m.numVar) # lagrange multipliers on variable bounds
MathProgBase.getobjval(m::NewtonData) = MathProgBase.eval_f(m.d, m.x);

Now let's try out our solver from JuMP.

In [6]:
using JuMP

In [7]:
jumpmodel = Model(solver=NewtonSolver())
@variable(jumpmodel, x)
@variable(jumpmodel, y)
@NLobjective(jumpmodel, Min, (x-5)^2 + (y-3)^2)

status = solve(jumpmodel);
Iteration 0. Gradient norm 11.661903789690601, x = [0.0,0.0]
Iteration 1. Gradient norm 1.9860273225978185e-15, x = [4.999999999999999,2.9999999999999996], DONE
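
Since this is an ordinary JuMP model, we can also pull the solution back out through JuMP's standard accessors, which route to the getsolution and getobjval methods we defined above. A minimal sketch, assuming the JuMP 0.x accessor names contemporary with this notebook:

In [ ]:
# Hypothetical follow-up cell: query the solution found by NewtonSolver.
# getvalue and getobjectivevalue are the JuMP 0.x accessors; under the
# hood they end up calling our getsolution and getobjval methods.
getvalue(x), getvalue(y), getobjectivevalue(jumpmodel)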

Note that we've converged to the optimal solution in one step. This is expected when applying Newton's method to a strictly convex quadratic objective function.
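
To see why: at the starting point $(0,0)$ the gradient of the objective is $(-10,-6)$ and the Hessian is the constant matrix $2I$, so a single Newton step gives $(0,0) - \tfrac{1}{2}(-10,-6) = (5,3)$, exactly the minimizer.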

Now let's hook our solver to AMPL!

AMPL is a commercial algebraic modeling language. You can download a size-limited demo of AMPL from the AMPL website and extract it into the ampl-demo directory under your home directory to follow along.

In [ ]:
mod = """
var x;
var y;

minimize OBJ:
   (x-5)^2 + (y-3)^2;

write gquad;

"""
fd = open("quad.mod","w")
write(fd, mod)
close(fd)

The proprietary ampl executable translates the .mod file into a .nl file, which we can then read directly from Julia.

In [ ]:
;~/ampl-demo/ampl quad.mod
In [ ]:
;cat quad.nl | head -n 3

The AmplNLReader and AMPLMathProgInterface packages are not yet officially registered Julia packages, so they're not yet recommended for general use.

But anyway, here's a demo of what they can do:

In [ ]:
using AmplNLReader, AMPLMathProgInterface

Here we read in the .nl file and call the AMPLMathProgInterface.loadamplproblem! method to load the problem into the solver we just wrote. Then we call optimize! and we're done.

In [ ]:
nlp = AmplNLReader.AmplModel("quad.nl")

m = MathProgBase.NonlinearModel(NewtonSolver())
AMPLMathProgInterface.loadamplproblem!(m, nlp)
MathProgBase.optimize!(m);

Again we converge to the optimal solution in one step, but this time we used AMPL to compute the derivatives of the model instead of JuMP.