The Design Impact of Multiple Dispatch

As the core paradigm of Julia

@StefanKarpinski presented at Strange Loop on September 19, 2013

What is Multiple Dispatch?

  • dynamic — based on actual run-time type, not static type
  • mulitple — based on all arguments, not just the receiver

Tends to be written as function application:

  • f(a,b,c) ⟸ LIKE THIS
  • a.f(b,c) ⟸ NOT THIS

Multiple Dispatch ≠ Method Overloading

In Java or C++ you can provide these virtual methods:

Parent this: f(Parent that)
Parent this: f(Child  that)
Child  this: f(Parent that)
Child  this: f(Child  that)

Dispatched on this but not on that

  • can dispatch on both using the double dispatch pattern

However, quoting Double Dispatch is a Code Smell:

The presence of Double Dispatch generally means that each type in a hierarchy has special handling code within another hierarchy of types. This approach to representing variant behavior leads to code that is less resilient to future changes as well as being more difficult to extend.

Something smells, but it's not necessarily the double dispatch

  • double dispatch is only a code smell in single dispatch languages

Multiple Dispatch in Julia

Basic dispatch:

In [1]:
f(a::Any, b) = "fallback"
f(a::Number, b::Number) = "a and b are both numbers"
f(a::Number, b) = "a is a number"
f(a, b::Number) = "b is a number"
f(a::Integer, b::Integer) = "a and b are both integers"
Out[1]:
f (generic function with 5 methods)
In [2]:
f(1.5,2)
Out[2]:
"a and b are both numbers"
In [3]:
f(1,"bar")
Out[3]:
"a is a number"
In [4]:
f(1,2)
Out[4]:
"a and b are both integers"
In [5]:
f("foo",[1,2])
Out[5]:
"fallback"

"Diagonal" dispatch:

In [6]:
f{T<:Number}(a::T, b::T) = "a and b are both $(T)s"
Out[6]:
f (generic function with 6 methods)
In [7]:
f(big(1.5),big(2.5))
Out[7]:
"a and b are both BigFloats"
In [8]:
f(big(1),big(2)) #<== integer rule is more specific
Out[8]:
"a and b are both integers"
In [9]:
f("foo","bar") #<== still doesn't apply to non-numbers
Out[9]:
"fallback"

Varargs methods:

In [10]:
f(args::Number...) = "$(length(args))-ary heterogeneous call"
f{T<:Number}(args::T...) = "$(length(args))-ary homogeneous call"
Out[10]:
f (generic function with 8 methods)
In [11]:
f(1)
Out[11]:
"1-ary homogeneous call"
In [12]:
f(1,2,3)
Out[12]:
"3-ary homogeneous call"
In [13]:
f(1,1.5,2)
Out[13]:
"3-ary heterogeneous call"
In [14]:
f() #==> why is it heterogeneous not homogeneous?
Out[14]:
"0-ary heterogeneous call"
In [15]:
f(1,2) #<== previous 2-arg method is more specific
Out[15]:
"a and b are both integers"
In [16]:
f("foo") #<== doesn't apply to non-numbers
no method f(ASCIIString,)
at In[16]:1

Multiple Dispatch in Ruby

Arithmetic operators:

Number + Number    | String + String  | Array + Array
Number - Number    | Time - Time      | Time - Number     | Array - Array
Number * Number    | Array * Integer  | Array * String    | String * Integer
Integer << Integer | String << String | String << Integer

Arrays, Hashes & Strings:

(Array|Hash).fetch(index,default|block)
(Array|Hash).new(object|block) | String.new(string)
(Array|Hash)[int|range]        | String[int|range|regex|string]
(Array|Hash)[int|range]=       | String[int|range|regex|string]=
Array.slice(int|range)         | String.slice(int|range|regex|string)
Array.slice!(int|range)        | String.slice!(int|range|regex|string)

Just Strings:

String.index(string|int|regex)
String.rindex(string|int|regex)
String.sub(pattern,replacement|block)
String.sub!(pattern,replacement|block)
String.gsub(pattern,replacement|block)
String.gsub!(pattern,replacement|block)

Multiple Dispatch in English

Related meanings:

"she goes (home|away)"     go(subj::Noun, where::PlaceAdverb)
"it went (wrong|well)"     go(subj::Noun, how::MannerAdverb)

Default arguments:

"go (home|away|well)"      go(adv::Adverb) = go(Person("addressee"), adv)
"he goes"                  go(subj::Noun)  = go(subj, PlaceAdverb("somewhere"))
"go"                       go()            = go(PlaceAdverb("somewhere"))

Fragment of type hierarchy:

Person <: Noun
PlaceAdverb <: Adverb
MannerAdverb <: Adverb

Multiple Dispatch is Object-Oriented

Not surprising that o.o. languages are emulating multiple dispatch

  • convenient, natural way to exress complex polymorphic behaviors
  • and it's all about dispatch and subtyping of data

But they're missing a really crucial ingredient

  • a bit like having eval without being able to manipulate code as data

Generic Functions are Functional

A generic function is a first-class object

  • can be passed around, e.g. to higher-order functions
  • but remains extensible unlike "classical" functions
In [17]:
f
Out[17]:
f (generic function with 8 methods)
In [18]:
methods(f)
Out[18]:
# 8 methods for generic function "f":
f(a::Integer,b::Integer) at In[1]:5
f{T<:Number}(a::T<:Number,b::T<:Number) at In[6]:1
f{T<:Number}(args::T<:Number...) at In[10]:2
f(a::Number,b::Number) at In[1]:2
f(args::Number...) at In[10]:1
f(a::Number,b) at In[1]:3
f(a,b::Number) at In[1]:4
f(a,b) at In[1]:1
In [19]:
typeof(f)
Out[19]:
Function
In [20]:
f2(x) = f(x,x)

f2("foo")
Out[20]:
"fallback"
In [21]:
f(a::String, b::String) = "a and b are both strings";
f2(x) = f(x,x) #<== to reset method cache (issue #265)

f2("foo")
Out[21]:
"a and b are both strings"

The Expression Problem

Mads Torgersen's paper The Expression Problem Revisited:

To which degree can your application be structured in such a way that both the data model and the set of virtual operations over it can be extended without the need to modify existing code, without the need for code repetition and without runtime type errors.

Basically you want to be able to add:

  1. new types to which you can apply existing operations → easy in o.o., hard in functional
  2. new operations which you can apply to existing types → easy in function, hard in o.o.

Interval Arithmetic

In [22]:
immutable Interval{T<:Real} <: Number
  lo::T
  hi::T
end

(a::Real)..(b::Real) = Interval(a,b)

Base.show(io::IO, iv::Interval) = print(io, "($(iv.lo))..($(iv.hi))")
Out[22]:
show (generic function with 96 methods)
In [23]:
1..2
Out[23]:
(1)..(2)
In [24]:
typeof(ans)
Out[24]:
Interval{Int64} (constructor with 1 method)
In [25]:
sizeof(1..2) #==> two 8-byte ints
Out[25]:
16
In [26]:
(1.5)..(2.5)
Out[26]:
(1.5)..(2.5)
In [27]:
typeof(ans)
Out[27]:
Interval{Float64} (constructor with 1 method)
In [28]:
(1//2)..(2//3)
Out[28]:
(1//2)..(2//3)
In [29]:
typeof(ans)
Out[29]:
Interval{Rational{Int64}} (constructor with 1 method)
In [30]:
sizeof((1//2)..(2//3)) #==> just four 8-byte ints
Out[30]:
32

Allowing end-points of different types

In [31]:
methods(Interval)
Out[31]:
# 1 method for generic function "Interval":
Interval{T<:Real}(lo::T<:Real,hi::T<:Real)
In [32]:
#(0.5)..(1) #<== would be a no method error because of mixed types
In [33]:
Interval(lo::Real, hi::Real) = Interval(promote(lo,hi)...)
Out[33]:
Interval{T<:Real} (constructor with 2 methods)
In [34]:
(0.5)..1
Out[34]:
(0.5)..(1.0)
In [35]:
1..pi
Out[35]:
(1.0)..(3.141592653589793)
In [36]:
e..pi
Out[36]:
(2.718281828459045)..(3.141592653589793)

Extending the + and - operators

Adding and subtracting intervals:

In [37]:
a::Interval + b::Interval = (a.lo + b.lo)..(a.hi + b.hi)
a::Interval - b::Interval = (a.lo - b.hi)..(a.hi - b.lo)
Out[37]:
- (generic function with 107 methods)
In [38]:
(2..3) + (-1..1)
Out[38]:
(1)..(4)
In [39]:
(2..3) - (1..2)
Out[39]:
(0)..(2)

What about arithmetic mixing intervals and numbers?

  • It's tempting to implement start implementing all combinations
  • However, there's a better, more "Julian" way.

Instead, we'll hook Intervals into Julia's promotion system:

In [40]:
import Base: convert, promote_rule #<== allows extending them

convert{T<:Real}(::Type{Interval{T}}, x::Real) = (x = convert(T,x); x..x)
convert{T<:Real}(::Type{Interval{T}}, iv::Interval) =
    convert(T,iv.lo)..convert(T,iv.hi)

promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{B}) = 
    Interval{promote_type(A,B)}
promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{Interval{B}}) =
    Interval{promote_type(A,B)}
Out[40]:
promote_rule (generic function with 126 methods)

Presto! Everything now Just Works™:

In [41]:
1..2 + 1
Out[41]:
(2)..(3)
In [42]:
1..2 + 1.5
Out[42]:
(2.5)..(3.5)
In [43]:
3 - 1..2 + 2//3
Out[43]:
(5//3)..(8//3)
In [44]:
big(2)^100 + (1//3)..(2//3)
Out[44]:
(3802951800684688204490109616129//3)..(3802951800684688204490109616130//3)

Wait? What's going on here?

Let's examine the last computation in detail:

In [45]:
big(2)^100 + (1//3)..(2//3)
Out[45]:
(3802951800684688204490109616129//3)..(3802951800684688204490109616130//3)
In [46]:
typeof(ans)
Out[46]:
Interval{Rational{BigInt}} (constructor with 1 method)
In [47]:
@which big(2)^100 + (1//3)..(2//3)
+(x::Number,y::Number) at promotion.jl:148

The left-hand-side is a BigInt:

In [48]:
big(2)^100
Out[48]:
1267650600228229401496703205376
In [49]:
typeof(ans)
Out[49]:
BigInt (constructor with 7 methods)

The right-hand-side is an interval of rationals of 64-bit integers:

In [50]:
typeof((1//3)..(2//3))
Out[50]:
Interval{Rational{Int64}} (constructor with 1 method)

There are no methods for adding these specific types

  • a very general fallback method defined for adding numbers is called

This method is defined in base/promote.jl:

+(x::Number, y::Number) = +(promote(x,y)...)

The promote function for two arguments looks like this:

promote{T,S}(x::T, y::S) =
    (convert(promote_type(T,S),x), convert(promote_type(T,S),y))

where promote_type is defined in terms of promote_rule — it's a bit complicated.

Now, the following promotion rules exist:

promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{B}) = 
    Interval{promote_type(A,B)}

promote_rule{A<:Integer,B<:Integer}(::Type{Rational{A}}, ::Type{B}) =
    Rational{promote_type(A,B)}

promote_rule{T<:Integer}(::Type{BigInt}, ::Type{T}) = BigInt

These all kick in when computing promote_type, producing this result:

In [51]:
promote_type(Interval{Rational{Int64}}, BigInt)
Out[51]:
Interval{Rational{BigInt}} (constructor with 1 method)

So promote converts both big(2)^100 and (1//3)..(2//3) to Interval{Rational{Int64}}:

In [52]:
convert(Interval{Rational{BigInt}}, big(2)^100)
Out[52]:
(1267650600228229401496703205376//1)..(1267650600228229401496703205376//1)
In [53]:
convert(Interval{Rational{BigInt}}, (1//3)..(2//3))
Out[53]:
(1//3)..(2//3)

After that adding them is just a call to the + method for same-typed Intervals.

Defining a new ± operator

The syntax for ± already exists in the parser, but its behavior is undefined:

In [54]:
:(2 ± 1)
Out[54]:
:(±(2,1))
In [55]:
2 ± 1
± not defined
at In[55]:1

Let's define it:

In [56]:
a::Real     ± b::Real     = (a - b)..(a + b)
a::Interval ± b::Interval = (a.lo - b.hi)..(a.hi + b.hi)
a::Number   ± b::Number   = ±(promote(a,b)...)
Out[56]:
± (generic function with 3 methods)
In [57]:
1 ± 2
Out[57]:
(-1)..(3)
In [58]:
pi - (2 ± 1//2)
Out[58]:
(0.6415926535897931)..(1.6415926535897931)

Notice that:

  • hooking into the promotion system requires a bit of thought
  • but it's a one-time cost – adding new promoting operations is easy
  • types can work together smoothly without knowing about each other

Interval Arithmetic Recap

Basic type definition

immutable Interval{T<:Real} <: Number
  lo::T
  hi::T
end
Interval(lo::Real, hi::Real) = Interval(promote(lo,hi)...)

(a::Real)..(b::Real) = Interval(a,b)

Base.show(io::IO, iv::Interval) = print(io, "($(iv.lo))..($(iv.hi))")

Conversion & promotion rules

import Base: convert, promote_rule

convert{T<:Real}(::Type{Interval{T}}, x::Real) = (x = convert(T,x); x..x)
convert{T<:Real}(::Type{Interval{T}}, iv::Interval) =
    convert(T,iv.lo)..convert(T,iv.hi)

promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{B}) = 
    Interval{promote_type(A,B)}
promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{Interval{B}}) =
    Interval{promote_type(A,B)}

Core arithmetic

a::Interval + b::Interval = (a.lo + b.lo)..(a.hi + b.hi)
a::Interval - b::Interval = (a.lo - b.hi)..(a.hi - b.lo)

a::Real     ± b::Real     = (a - b)..(a + b)
a::Interval ± b::Interval = (a.lo - b.hi)..(a.hi + b.hi)
a::Number   ± b::Number   = ±(promote(a,b)...)

Generic functions elegantly and smoothly solve the expression problem:

  • new Interval type works with built-in + and - operators
  • new ± operator works with built-in Int type

Taking Multiple Dispatch Seriously

Generic function in Julia aren't special – they're the default

  • even really basic things like + are generic functions
In [59]:
+
Out[59]:
+ (generic function with 96 methods)

This means you're free to extend everything

  • not just functions that were planned to be extended

Performance

In [60]:
code_llvm(+,(Int,Int))
define i64 @"julia_+2202"(i64, i64) {
top:
  %2 = add i64 %1, %0, !dbg !10575
  ret i64 %2, !dbg !10575
}
In [61]:
code_llvm(+,(Int,Float64))
define double @"julia_+2208"(i64, double) {
top:
  %2 = sitofp i64 %0 to double, !dbg !10605
  %3 = fadd double %2, %1, !dbg !10605
  ret double %3, !dbg !10605
}

User Code is Also Fast

Let's define a scalar-valued function using the Interval type we just defined:

In [62]:
lo(a,b) = (a ± b).lo
hi(a,b) = (a ± b).hi
Out[62]:
hi (generic function with 1 method)
In [63]:
lo(3,2), hi(3,2)
Out[63]:
(1,5)
In [64]:
code_llvm(lo,(Int,Int))
define i64 @julia_lo2227(i64, i64) {
top:
  %2 = sub i64 %0, %1, !dbg !10684
  ret i64 %2, !dbg !10684
}
In [65]:
code_llvm(hi,(Int,Int))
define i64 @julia_hi2228(i64, i64) {
top:
  %2 = add i64 %1, %0, !dbg !10690
  ret i64 %2, !dbg !10690
}

Even for complex combinations of types, generated code is very efficient:

In [66]:
code_native(hi,(Float64,Rational{Int}))
	.section	__TEXT,__text,regular,pure_instructions
Filename: In[62]
Source line: 2
	push	RBP
	mov	RBP, RSP
Source line: 2
	vcvtsi2sd	XMM1, XMM0, RSI
	vcvtsi2sd	XMM2, XMM0, RDI
	vdivsd	XMM1, XMM2, XMM1
	vaddsd	XMM0, XMM1, XMM0
	pop	RBP
	ret

Design Impact

Since generic functions are open:

  • functions are more like protocols which users can also implement

We're forced to think much harder about the meaning of opearations

  • can't just describe their behavior.

Example:

search(string, pattern, [start])

Search for the first occurance of the given pattern within the given string (or byte array). The pattern may be a single character, a vector or a set of characters, a string, or a regular expression (regular expressions are only allowed on contiguous strings, such as ASCII or UTF-8 strings). The third argument optionally specifies a starting index. The return value is a range of indexes where the matching sequence is found, such that s[search(s,x)] == x. The return value when the pattern is not found is 0 when the pattern is a character, or 0:-1 when searching for a substring or regular expression match.

Meaning is Hard but Powerful

The split function is defined generically in terms of the search abstraction:

function split(str::String, splitter, limit::Integer, keep_empty::Bool)
    strs = String[]
    i = start(str)
    n = endof(str)
    r = search(str,splitter,i)
    j, k = first(r), nextind(str,last(r))
    while 0 < j <= n && length(strs) != limit-1
        if i < k
            if keep_empty || i < j
                push!(strs, str[i:prevind(str,j)])
            end
            i = k
        end
        if k <= j; k = nextind(str,j) end
        r = search(str,splitter,k)
        j, k = first(r), nextind(str,last(r))
    end
    if keep_empty || !done(str,i)
        push!(strs, str[i:])
    end
    return strs
end
split(s::String, spl, n::Integer) = split(s, spl, n, true)
split(s::String, spl, keep::Bool) = split(s, spl, 0, keep)
split(s::String, spl)             = split(s, spl, 0, true)

This gives the ability to:

  • create a new pattern type for matching patterns in strings
    • e.g. context-free grammars, etc.
  • define search(str, pattern) for the new pattern
    • get split functionality for free

Example: spliting on regular expressions

Regex doesn't know anything about split:

In [67]:
methods(split)
Out[67]:
# 5 methods for generic function "split":
split(str::String,splitter,limit::Integer,keep_empty::Bool) at string.jl:1250
split(s::String,spl,keep::Bool) at string.jl:1272
split(s::String,spl,n::Integer) at string.jl:1271
split(s::String,spl) at string.jl:1273
split(str::String) at string.jl:1277

Yet split works when with Regex patterns:

In [68]:
split("foobarbazqux", r"[aeiou]+"i)
Out[68]:
5-element Array{String,1}:
 "f" 
 "b" 
 "rb"
 "zq"
 "x" 

Works because Regex provides the search function

  • and split is defined generically in terms of search

We're Not There Yet

Meaning is hard and there are a lot of "protocols" that we still need to abstract out.

  • There are separate plot functions in every plotting package
  • Ideally, there should be a single commong plot function

These kinds of abstract protocols are "narrow waists"

  • like TCP/IP in networking or byte streams in UNIX

Figuring out the right ones is hard but essential work.