ExprOptimization.jl is a Julia package for optimizing Julia expressions. It implements algorithms that search over expression trees derived from a grammar to minimize a user-defined objective function. The package depends on ExprRules.jl.
To install the package:
Pkg.add("ExprOptimization")
To start using the package:
using ExprOptimization
We consider the example of finding an algebraic expression that approximates a given function.
First, we define a grammar:
const grammar = @grammar begin
    Real = x
    Real = Real * Real
    Real = Real + Real
    Real = Real - Real
    Real = |(1:5)
end
1: Real = x
2: Real = Real * Real
3: Real = Real + Real
4: Real = Real - Real
5: Real = 1
6: Real = 2
7: Real = 3
8: Real = 4
9: Real = 5
Create a custom symbol table and automatically populate it from the grammar. This is not strictly necessary since ExprOptimization.jl can use the global symbol table, but it leads to much better performance.
const S = SymbolTable(grammar)
Dict{Symbol,Any} with 3 entries:
  :+ => +
  :- => -
  :* => *
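To see why a local symbol table is faster than global evaluation, note that an expression tree can be interpreted directly against a Dict, avoiding global-scope eval on every call. The following is only a sketch of that idea; `interp` is a hypothetical helper, not the ExprRules.jl implementation:

```julia
# Sketch: interpret an expression tree against a local symbol table.
# `interp` is a hypothetical helper, not the ExprRules.jl implementation.
function interp(tab::Dict{Symbol,Any}, ex)
    if ex isa Symbol
        return tab[ex]                          # look up variables and operators
    elseif ex isa Expr && ex.head == :call
        f = interp(tab, ex.args[1])
        return f((interp(tab, a) for a in ex.args[2:end])...)  # apply the call
    else
        return ex                               # literals evaluate to themselves
    end
end

tab = Dict{Symbol,Any}(:+ => +, :* => *, :x => 2.0)
interp(tab, :(x * x + 1))   # 5.0
```

Rebinding `:x` in the table between evaluations is what the loss function below does via `S[:x] = x`.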
Next, we define the ground truth expression and a loss function. The loss function takes an expression tree and the grammar and returns the real-valued loss of that tree; optimization minimizes this loss.
ground_truth(x) = x*x + 2x + 1
function loss(tree::RuleNode, grammar::Grammar)
    ex = get_executable(tree, grammar)
    los = 0.0
    for x = -5.0:1.0:5.0
        S[:x] = x
        los += abs2(Core.eval(S, ex) - ground_truth(x))
    end
    los
end
loss (generic function with 1 method)
Once these are defined, we can use any of the implemented algorithms to perform the optimization.
Monte Carlo (MC) draws a number of random expression trees from the grammar and returns the one with the best loss.
?MonteCarlo
search: MonteCarlo MonteCarlos

MonteCarlo

Monte Carlo.

num_samples::Int : number of samples
max_depth::Int : maximum depth of derivation tree

using Random
Random.seed!(10)
p = MonteCarlo(20000, 6)
results_mc = optimize(p, grammar, :Real, loss)
(results_mc.expr, results_mc.loss)
(:((x + 3) * x), 121.0)
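As a sanity check, the reported loss can be reproduced directly with the same sample points the loss function uses: for the candidate `(x + 3) * x`, the residual against `x^2 + 2x + 1` is `x - 1`, and summing `(x - 1)^2` over `-5:5` gives 121.

```julia
# Reproduce the Monte Carlo result's loss by hand.
ground_truth(x) = x*x + 2x + 1
candidate(x) = (x + 3) * x
sum(abs2(candidate(x) - ground_truth(x)) for x in -5.0:1.0:5.0)   # 121.0
```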
display(results_mc.tree, grammar)
Genetic Programming (GP) is an evolutionary algorithm for trees.
See: Koza, "Genetic Programming: On the Programming of Computers by Means of Natural Selection", MIT Press, 1992.
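A core GP operator is subtree crossover: pick a random node in each parent and swap the subtrees rooted there. A self-contained sketch on plain nested vectors follows; it is an illustration only, not ExprOptimization.jl's RuleNode machinery:

```julia
using Random

# Sketch of GP subtree crossover on nested vectors (illustration only).
# A tree is either a leaf (Symbol or number) or Any[op, left, right].
function rand_subtree(t)
    (t isa Vector && rand(Bool)) ? rand_subtree(t[rand(2:3)]) : t
end

function crossover(a, b)
    child = deepcopy(a)
    donor = deepcopy(rand_subtree(b))
    if child isa Vector
        child[rand(2:3)] = donor   # graft the donor subtree at a random slot
        return child
    else
        return donor               # the receiving parent was a single leaf
    end
end

Random.seed!(0)
a = Any[:+, Any[:*, :x, :x], 1]    # represents x*x + 1
b = Any[:-, :x, 2]                 # represents x - 2
crossover(a, b)                    # a copy of `a` with a random subtree of `b` grafted in
```

Because both parents are deep-copied, crossover never mutates the originals, which is what lets a population reuse individuals across generations.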
?GeneticProgram
search: GeneticProgram GeneticPrograms

GeneticProgram

Genetic Programming.

pop_size::Int : population size
iterations::Int : number of iterations
max_depth::Int : maximum depth of derivation tree
p_reproduction::Float64 : probability of reproduction operator
p_crossover::Float64 : probability of crossover operator
p_mutation::Float64 : probability of mutation operator
init_method::InitializationMethod : initialization method
select_method::SelectionMethod : selection method

Random.seed!(1)
p = GeneticProgram(1000,20,6,0.3,0.3,0.4)
results_gp = optimize(p, grammar, :Real, loss)
(results_gp.expr, results_gp.loss)
(:((x * 2 + (x * x - 3)) + 4), 0.0)
display(results_gp.tree, grammar)
Grammatical Evolution (GE) is an evolutionary algorithm based on sequentializing the decisions in the derivation tree (e.g., using depth-first traversal order). Optimization is performed over integer arrays using genetic algorithms.
See: C. Ryan, J.J. Collins, M. O'Neil, "Grammatical Evolution: Evolving Programs for an Arbitrary Language", in European Conference on Genetic Programming, Springer, 1998, pp. 83-96.
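The genotype-to-phenotype mapping at the heart of GE can be sketched with plain integers: each codon selects, modulo the number of applicable rules, which production to apply next in a depth-first derivation. A toy decoder for the grammar above follows (illustration only; the rule indices mirror the rule listing printed earlier, and the decoder wraps around the genome when it runs out of codons):

```julia
# Toy GE decoder: each genome codon picks a production via mod.
# Rules mirror the grammar above: 1: x, 2: *, 3: +, 4: -, 5-9: constants 1-5.
const RULES = Any[:x, (*), (+), (-), 1, 2, 3, 4, 5]

function decode(genome, i=Ref(1))
    codon = genome[mod1(i[], length(genome))]  # wrap around the genome
    i[] += 1
    r = RULES[mod1(codon, length(RULES))]      # codon mod #rules selects a production
    if r isa Function                          # binary rule: recurse for both children
        return Expr(:call, Symbol(r), decode(genome, i), decode(genome, i))
    else
        return r                               # terminal: x or a constant
    end
end

decode([2, 1, 6])   # rule 2 (*), then rule 1 (x), then rule 6 (constant 2) → :(x * 2)
```

Because the phenotype is built from a flat integer array, ordinary genetic-algorithm operators (point mutation, one-point crossover) apply directly to the genome.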
?GrammaticalEvolution
search: GrammaticalEvolution GrammaticalEvolutions

GrammaticalEvolution

Grammatical Evolution.

grammar::Grammar : grammar
typ::Symbol : start symbol
pop_size::Int : population size
iterations::Int : number of iterations
init_gene_length::Int : initial length of genotype integer array
max_gene_length::Int : maximum length of genotype integer array
max_depth::Int : maximum depth of derivation tree
p_reproduction::Float64 : probability of reproduction operator
p_crossover::Float64 : probability of crossover operator
p_mutation::Float64 : probability of mutation operator
select_method::SelectionMethod : selection method (default: tournament selection)
mutate_method::InitializationMethod : mutation method (default: multi-mutate)

Random.seed!(1)
p = GrammaticalEvolution(grammar,:Real,1000,20,10,10,6,0.2,0.4,0.4; select_method=GrammaticalEvolutions.TruncationSelection(300))
results_ge = optimize(p, grammar, :Real, loss)
(results_ge.expr, results_ge.loss)
(:(x * ((5 + x) - 3) + 1), 0.0)
display(results_ge.tree, grammar)
The Cross-Entropy (CE) Method is a population-based optimization algorithm based on repeatedly estimating the probability distribution of good solutions. This implementation uses a probabilistic grammar to represent the distributions.
See: Rubinstein, "Optimization of Computer Simulation Models with Rare Events", European Journal of Operational Research, 99, 89-112, 1997.
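The CE update loop can be sketched on a single categorical distribution: sample a population, keep the top-k elite by loss, and refit the probabilities to the elite's empirical frequencies. This is an illustration only, not the probabilistic-grammar implementation in ExprOptimization.jl:

```julia
using Random

Random.seed!(0)

# Sketch of the CE method on one categorical distribution (illustration only).
toy_loss(k) = abs(k - 7)                  # minimized at category 7
p = fill(1/10, 10)                        # start from a uniform distribution

# Draw one category index from the distribution p (inverse-CDF sampling).
sample_cat(p) = something(findfirst(>=(rand()), cumsum(p)), length(p))

for _ in 1:30
    global p
    pop = [sample_cat(p) for _ in 1:200]                      # sample a population
    elite = sort(pop; by=toy_loss)[1:40]                      # keep the top-k elite
    p = [count(==(k), elite) / length(elite) for k in 1:10]   # refit by MLE
end

argmax(p)   # mass concentrates on category 7
```

In the package, the categorical distributions live at the rules of a probabilistic grammar rather than over a flat set of categories, but the sample/select/refit cycle is the same.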
?CrossEntropy
search: CrossEntropy CrossEntropys

CrossEntropy

Cross Entropy method.

pop_size::Int : population size
iterations::Int : number of iterations
max_depth::Int : maximum depth of derivation tree
top_k::Int : top k elite samples used in selection
p_init::Float64 : initial value when fitting MLE
init_method::InitializationMethod : initialization method

Random.seed!(1)
p = CrossEntropy(1000,20,6,500)
results_ce = optimize(p, grammar, :Real, loss)
(results_ce.expr, results_ce.loss)
(:(x * (2 + x) + 1), 0.0)
display(results_ce.tree, grammar)
Probabilistic Incremental Program Evolution (PIPE) is an expression tree optimization algorithm based on the probabilistic prototype tree (PPT) model.
See: Salustowicz and Schmidhuber, "Probabilistic Incremental Program Evolution", Evolutionary Computation, vol. 5, no. 2, pp. 123-141, 1997.
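PIPE's central learning step can be sketched at a single PPT node: the probability of the rule used by the best program is pulled toward 1 at an effective learning rate of roughly `c * α`, and the node's distribution is renormalized. This is a simplified illustration of the update direction, not the package internals:

```julia
# Sketch of a PIPE-style probability update at one PPT node (illustration only).
# `p` is the node's distribution over grammar rules; `chosen` is the rule used
# by the current best program.
function ppt_update(p::Vector{Float64}, chosen::Int; c=0.1, α=0.05)
    q = copy(p)
    q[chosen] += c * α * (1 - q[chosen])   # pull the chosen rule toward probability 1
    return q ./ sum(q)                     # renormalize the node's distribution
end

p = fill(0.25, 4)            # uniform over four rules
for _ in 1:300
    global p = ppt_update(p, 2)
end
p[2]   # repeated updates concentrate probability mass on rule 2
```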
?PIPE
search: PIPE PIPEs Pipe pipeline PipeBuffer partialsortperm partialsortperm!

PIPE

Probabilistic Incremental Program Evolution. (Example parameter values from the paper are indicated in parentheses.)

ppt_params::PPT : parameters for PPT (e.g., [0.8, 0.2])
pop_size::Int : population size
iterations::Int : number of iterations
p_elitist::Float64 : elitist update probability (e.g., 0.2)
c::Float64 : learning rate multiplier (e.g., 0.1)
α::Float64 : learning rate (e.g., 0.05)
ϵ::Float64 : fitness constant (e.g., 1)
p_mutation::Float64 : mutation probability (e.g., 0.2)
β::Float64 : mutation rate (e.g., 0.6)
p_threshold::Float64 : prune threshold (e.g., 0.999)
max_depth::Int : maximum depth of derivation tree

Random.seed!(3)
p = PIPE(PPT(0.8),1000,20,0.2,0.1,0.05,1,0.2,0.6,0.999,6)
results_pipe = optimize(p, grammar, :Real, loss)
(results_pipe.expr, results_pipe.loss)
(:(x * (1x - ((5 - 1) - 5)) + (1 + x)), 0.0)
display(results_pipe.tree, grammar)