using Gadfly
Semantics inspired by Leland Wilkinson's book.
<img width=200 alt="The Grammar of Graphics cover" src="gog-cover2.png">
The grammar of graphics was developed … in order to produce a flexible system that can create a rich variety of charts as simply as possible, without duplication of methods.
Other GoG implementations: nViZn (Java), ggplot2 (R), Bokeh (Python), ggplot (Python)
On the surface, plotting with Gadfly isn't so different from plotting with other packages.
set_default_plot_size(10cm, 6cm)
Measure{MeasureNil,MeasureNil}(60.0,MeasureNil(),MeasureNil(),0.0,0.0)
xs = 1:10
ys = rand(10)
plot(x=xs, y=ys)
In Gadfly, this is shorthand for
plot(x=xs, y=ys,
Scale.x_continuous, Scale.y_continuous,
Stat.xticks, Stat.yticks,
Geom.point,
Guide.xlabel("x"), Guide.ylabel("y"),
Guide.xticks, Guide.yticks)
What is all this stuff?
Plots consist of a collection of components that fit together in a coherent way. Connections are defined in terms of aesthetics.
More components can be passed to the plot function to override or extend the defaults.
plot(x=xs, y=ys, Scale.x_log2, Geom.point, Geom.line)
RDatasets is our one-stop shop for example datasets.
using RDatasets
iris = dataset("datasets", "iris")
head(iris)
| | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|---|---|---|---|---|
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 6 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |
The plot function has a convenient form for working with DataFrames.
set_default_plot_size(6inch, 3inch)
Measure{MeasureNil,MeasureNil}(76.19999999999999,MeasureNil(),MeasureNil(),0.0,0.0)
plot(iris, x=:SepalLength, y=:SepalWidth, color=:Species)
Summarizations of input data are handled by statistics, which are decoupled from geometry.
mac = dataset("Zelig", "macro")
set_default_plot_size(6inch, 3inch)
plot(mac, x=:Year, y=:GDP, Stat.histogram2d(xbincount=15, ybincount=15), Geom.rectbin)
plot(mac, x=:Year, y=:Country, color=:GDP,
Scale.continuous_color(minvalue=-15, maxvalue=15), Geom.rectbin)
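The decoupling can be illustrated without Gadfly at all: a statistic is just a function from input data to new values, and a geometry only needs those values to draw. The sketch below is a toy stand-in, not Gadfly's implementation; `histogram2d_counts` is a hypothetical helper name, it is written for current Julia, and it assumes the data are not constant in either dimension.

```julia
# Hypothetical stand-in for a 2D histogram statistic: bins (x, y) pairs onto
# an xbincount-by-ybincount grid and returns the counts. A rectangle
# geometry would then draw one colored cell per entry.
# Assumes xs and ys each contain at least two distinct values.
function histogram2d_counts(xs, ys; xbincount=15, ybincount=15)
    xmin, xmax = extrema(xs)
    ymin, ymax = extrema(ys)
    counts = zeros(Int, xbincount, ybincount)
    for (x, y) in zip(xs, ys)
        # Map each value to a bin index in 1:bincount; the maximum value
        # is clamped into the last bin instead of overflowing.
        i = min(xbincount, 1 + floor(Int, xbincount * (x - xmin) / (xmax - xmin)))
        j = min(ybincount, 1 + floor(Int, ybincount * (y - ymin) / (ymax - ymin)))
        counts[i, j] += 1
    end
    return counts
end

xs = [0.0, 0.1, 0.9, 1.0]
ys = [0.0, 0.1, 0.9, 1.0]
counts = histogram2d_counts(xs, ys; xbincount=2, ybincount=2)
println(counts)
```

Because the function returns plain counts, the same result could feed rectangles, contours, or any other geometry, which is the point of keeping the two apart.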
The plot function is not artificially restricted to particular types (e.g. floats, integers).
import SIUnits
import SIUnits.ShortUnits
iris[:SepalWidth] = iris[:SepalWidth] .* SIUnits.ShortUnits.cm
iris[:SepalLength] = iris[:SepalLength] .* SIUnits.ShortUnits.cm
plot(iris, x=:SepalLength, y=:SepalWidth, color=:Species)
Warning: New definition isless(SIQuantity{T,0,0,0,0,0,0,0},Number) at /Users/dcjones/.julia/v0.3/SIUnits/src/SIUnits.jl:234 is ambiguous with: isless(Number,Dual{T<:Real}) at /Users/dcjones/.julia/v0.3/DualNumbers/src/dual.jl:109. To fix, define isless(SIQuantity{T,0,0,0,0,0,0,0},Dual{T<:Real}) before the new definition.
Warning: New definition isless(Number,SIQuantity{T,0,0,0,0,0,0,0}) at /Users/dcjones/.julia/v0.3/SIUnits/src/SIUnits.jl:237 is ambiguous with: isless(Dual{T<:Real},Number) at /Users/dcjones/.julia/v0.3/DualNumbers/src/dual.jl:110. To fix, define isless(Dual{T<:Real},SIQuantity{T,0,0,0,0,0,0,0}) before the new definition.
Warning: New definition *(Any,T<:NonSIUnit{BaseUnit<:SIUnit{m,kg,s,A,K,mol,cd},Unit}) at /Users/dcjones/.julia/v0.3/SIUnits/src/SIUnits.jl:448 is ambiguous with: *(MeasureNil,Any) at /Users/dcjones/.julia/v0.3/Compose/src/measure.jl:41. To fix, define *(MeasureNil,T<:NonSIUnit{BaseUnit<:SIUnit{m,kg,s,A,K,mol,cd},Unit}) before the new definition.
Date types are also plottable.
set_default_plot_size(7inch, 3inch)
Measure{MeasureNil,MeasureNil}(76.19999999999999,MeasureNil(),MeasureNil(),0.0,0.0)
using Datetime
approval = melt(dataset("Zelig", "approval"), [:Month, :Year])
approval = approval[(approval[:variable] .== :Approve) | (approval[:variable] .== :Disapprove),:]
approval[:Date] = [date(y, m) for (y, m) in zip(approval[:Year], approval[:Month])]
plot(approval, x=:Date, y=:value, color=:variable, Geom.point, Geom.line, Guide.title("The Rise and Fall of W."))
Faceting is accomplished by defining a geometry that draws whole plots.
set_default_plot_size(6inch, 4inch)
Measure{MeasureNil,MeasureNil}(101.6,MeasureNil(),MeasureNil(),0.0,0.0)
orchard_sprays = dataset("datasets", "OrchardSprays")
plot(orchard_sprays,
xgroup="Treatment", x="ColPos", y="RowPos", color="Decrease",
Geom.subplot_grid(Geom.point))
set_default_plot_size(6inch, 4inch)
Measure{MeasureNil,MeasureNil}(101.6,MeasureNil(),MeasureNil(),0.0,0.0)
using Gadfly, RDatasets
vocab = dataset("car", "Vocab")
plot(vocab, ygroup=:Sex, x=:Education, y=:Vocabulary, Geom.subplot_grid(Geom.boxplot),
Guide.ylabel("Something Different"))
A wide variety of utilities for colorspaces, conversions, differences, scale generation, parsing, etc.
using Compose, Color
plot([w -> convert(XYZ, cie_color_match(w)).x,
w -> convert(XYZ, cie_color_match(w)).y,
w -> convert(XYZ, cie_color_match(w)).z], 300, 800,
Scale.discrete_color_manual("red", "green", "blue"),
Guide.xlabel("Wavelength in nm"))
A perceptually uniform colorspace (CIE L*a*b*, used here in its polar LCHab form), standardized in 1976.
ctx = gridstack([compose(context(), rectangle(), fill(LCHab(l, 60, h)))
for l in linspace(10, 70, 20), h in linspace(0, 324, 40)])
draw(SVG(8inch, 2inch), compose(ctx, svgattribute("shape-rendering", "crispEdges")))
What we're stuck with in CSS, HTML, SVG, and most graphics software.
ctx = gridstack([compose(context(), rectangle(), fill(HSL(h, 0.7, l)))
for l in linspace(0.2, 0.7, 20), h in linspace(0, 324, 40)])
draw(SVG(8inch, 2inch), compose(ctx, svgattribute("shape-rendering", "crispEdges")))
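One way to see the problem with HSL is to compare colors that share the same "lightness" value. The sketch below is written for current Julia and uses no packages; `hsl_to_rgb`, `linearize`, and `luminance` are hypothetical helper names. It converts HSL to sRGB with the standard formula and then computes relative luminance with the Rec. 709 weights. Pure yellow and pure blue both have l = 0.5 in HSL, yet their luminance differs by more than a factor of ten.

```julia
# Convert HSL (h in degrees, s and l in 0..1) to sRGB components in 0..1.
function hsl_to_rgb(h, s, l)
    c = (1 - abs(2l - 1)) * s             # chroma
    hp = mod(h, 360) / 60                 # hue sector, 0 <= hp < 6
    x = c * (1 - abs(mod(hp, 2) - 1))     # second-largest component
    r, g, b = hp < 1 ? (c, x, 0.0) :
              hp < 2 ? (x, c, 0.0) :
              hp < 3 ? (0.0, c, x) :
              hp < 4 ? (0.0, x, c) :
              hp < 5 ? (x, 0.0, c) : (c, 0.0, x)
    m = l - c / 2                         # shift to match the lightness
    return (r + m, g + m, b + m)
end

# Undo the sRGB transfer curve to get linear light.
linearize(v) = v <= 0.04045 ? v / 12.92 : ((v + 0.055) / 1.055)^2.4

# Relative luminance of an sRGB triple (Rec. 709 weights).
luminance(rgb) = 0.2126 * linearize(rgb[1]) +
                 0.7152 * linearize(rgb[2]) +
                 0.0722 * linearize(rgb[3])

yellow = hsl_to_rgb(60, 1.0, 0.5)
blue   = hsl_to_rgb(240, 1.0, 0.5)
println("yellow luminance: ", luminance(yellow))
println("blue luminance:   ", luminance(blue))
```

A scale built by sweeping hue at fixed HSL lightness therefore varies in apparent brightness, which a lightness-based space such as LCHab is designed to avoid.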
distinguishable_colors(10, lchoices=[20.0, 40.0, 60.0, 80.0])
distinguishable_colors(10, lchoices=[20.0, 40.0, 60.0, 80.0], transform=deuteranopic)
Gadfly is built on a declarative graphics package called Compose.
using Compose
c = compose(context(units=UnitBox(0, 0, 50, 50)), stroke("#777"),
(context(), circle(0.33, 0.5, 0.25), fill("royal blue"), fillopacity(0.5)),
(context(), circle(0.66, 0.5, 0.25), fill("orange red"), fillopacity(0.5)))
draw(SVG(4inch, 3inch), c)
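function sierpinski(n)
    if n == 0
        # Base case: a single triangle with its apex at the top center.
        compose(context(), polygon([(1,1), (0,1), (1/2, 0)]))
    else
        t = sierpinski(n - 1)
        # Three half-size copies: top center, bottom left, bottom right.
        compose(context(),
                (context(1/4,   0, 1/2, 1/2), t),
                (context(  0, 1/2, 1/2, 1/2), t),
                (context(1/2, 1/2, 1/2, 1/2), t))
    end
end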
draw(SVG(5*2inch, 2*sqrt(3/4)inch), hstack([sierpinski(i) for i in 0:5]...))
Real world graphics are defined with a combination of relative and absolute units.
c = compose(context(),
(context(), circle(0.5w, 0.5h, 1cm), fill("bisque")),
(context(), circle(0.5w, 0.5h, 0.5w), fill("tomato")))
draw(SVG(1inch, 1inch), c)
draw(SVG(1.5inch, 1.5inch), c)
draw(SVG(2.0inch, 2.0inch), c)
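The idea behind mixing units can be sketched with a toy type that carries a relative part and an absolute part and is only resolved once the drawing size is known. This is an illustration written for current Julia, not Compose's actual Measure type; `ToyMeasure`, `toy_w`, `toy_mm`, and `resolve` are hypothetical names.

```julia
# Hypothetical toy measure: `rel` is a fraction of the parent width and
# `abs_mm` an absolute length in millimeters.
struct ToyMeasure
    rel::Float64
    abs_mm::Float64
end

Base.:+(a::ToyMeasure, b::ToyMeasure) = ToyMeasure(a.rel + b.rel, a.abs_mm + b.abs_mm)
Base.:*(k::Real, a::ToyMeasure) = ToyMeasure(k * a.rel, k * a.abs_mm)

const toy_w  = ToyMeasure(1.0, 0.0)   # one full parent width
const toy_mm = ToyMeasure(0.0, 1.0)   # one millimeter

# Resolve to millimeters once the parent's width is known.
resolve(m::ToyMeasure, parent_width_mm) = m.rel * parent_width_mm + m.abs_mm

inch_mm = 25.4
radius_abs = 10 * toy_mm     # plays the role of the 1cm radius
radius_rel = 0.5 * toy_w     # plays the role of the 0.5w radius
for side in (1.0, 1.5, 2.0)
    println(side, "in: absolute r = ", resolve(radius_abs, side * inch_mm),
            " mm, relative r = ", resolve(radius_rel, side * inch_mm), " mm")
end
```

The absolute radius resolves to the same length at every size, while the relative one scales with the canvas, which matches what the three drawings above show.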
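Compose uses Cairo to generate PS, PDF, and PNG output.
We generate our own SVG, which is much more compact than Cairo's: in the comparison below, the file is about a third of the size uncompressed and under half the size gzipped.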
pl = plot(sin, 0, 10)
draw(Compose.Image{Compose.SVGBackend}("cairo-plot.svg", 4inch, 4inch), plot(sin, 0, 10))
; gzip -f cairo-plot.svg ; gzip -l cairo-plot.svg.gz
compressed uncompressed ratio uncompressed_name
4678 17349 73.0% cairo-plot.svg
draw(SVG("compose-plot.svg", 4inch, 4inch), plot(sin, 0, 10))
; gzip -f compose-plot.svg ; gzip -l compose-plot.svg.gz
compressed uncompressed ratio uncompressed_name
2203 6180 64.3% compose-plot.svg
Generating our own SVG also lets us make scriptable SVG.
using Color
c = compose(context(),
circle(0.5w, 0.5h, 0.5w), fill("cadet blue"),
jscall(
"""
click(function () {
function rand255() { return Math.round(255 * Math.random()); }
function randrgb() {
return "rgb(" + rand255() + "," + rand255() + "," + rand255() + ")";
}
var element = this;
function f() {
element.animate({fill: randrgb()}, 1000, mina.linear, f);
}
f();
})
"""))
draw(SVGJS(1inch, 1inch), c)
Compose uses Snap.svg to make manipulating SVG easy.
All JavaScript code is embedded directly in the SVG file, making the graphics completely self-contained and dynamic.
Just add object tags...
<object type="image/svg+xml" data="my-gadfly-plot.svg"></object>
Most plotting tools have very rigid layout, despite a long history of practical solutions to layout problems.
Plots are composed on a grid. Each cell may have multiple configurations.
t = table(3, 4, 1:1, 3:3)
for i in 1:3, j in 1:4
t[i, j] = [compose(context(minwidth=1cm, minheight=1cm), rectangle(), fill("sky blue"))]
end
t[1, 3] = [compose(context(), rectangle(), fill("sky blue 3"),
(context(), text(0.5w, 0.5h, "Maximize", hcenter, vcenter),
font("Signika"), fontsize(18pt), fill("white"), stroke(nothing)))]
draw(SVG(4inch, 4inch), compose(context(), t, stroke("white"), strokedash([0.5mm, 1.0mm])))
Choose the set of configurations that maximizes the plot cell without anything overlapping.
Plots respond to the size at which they are drawn.
mac = dataset("Zelig", "macro")
pl = plot(mac, x=:Year, y=:Unem, color=:Country, Geom.line)
draw(SVGJS(6inch, 4inch), pl)
draw(SVGJS(6inch, 2.2inch), pl)
It's still possible to draw figures that are too small, but Compose knows and will judge you.
mac = dataset("Zelig", "macro")
pl = plot(mac, x=:Year, y=:Unem, color=:Country, Geom.line)
draw(SVGJS(2inch, 2inch), pl)
WARNING: Graphic cannot be correctly drawn at the given size.
Label layout is a separate problem.
rectwidth = 40mm
rectheight = 15mm
c = compose(context(),
(context(),
circle([0.7, 0.3, 0.9], [0.1, 0.7, 0.9], [2mm]),
fill("steel blue")),
(context(),
rectangle([0.7cx - rectwidth - 1.55mm,
0.3cx + 1.55mm,
0.9cx - rectwidth - 1.55mm],
[0.1cy + 1.55mm,
0.7cy - rectheight - 1.55mm,
0.9cy - rectheight - 1.55mm],
[rectwidth], [rectheight]),
fill("sky blue")), stroke("white"), linewidth(0.5mm))
draw(SVG(5inch, 2inch), c)
Place the labels without overlaps, hiding some if absolutely necessary.
The problem is NP-hard, but simulated annealing works well in practice.
set_default_plot_size(14cm, 14cm)
Measure{MeasureNil,MeasureNil}(140.0,MeasureNil(),MeasureNil(),0.0,0.0)
mammals = dataset("MASS", "mammals")
plot(mammals, x=:Body, y=:Brain, label=:Mammal,
Scale.x_log10, Scale.y_log10,
Geom.point, Geom.label, Geom.smooth(method=:lm))
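To give a feel for how annealing applies to label placement, here is a deliberately tiny one-dimensional sketch written for current Julia. It is not how Geom.label is implemented: each label may only sit to the left or right of its point, the energy is the total overlap between labels, and all names (`overlap`, `label_interval`, `energy`, `anneal_labels`) and parameters (`steps`, `t0`, the linear cooling schedule) are assumptions made for illustration.

```julia
using Random

# Overlap length between two intervals given as (lo, hi) tuples.
overlap(a, b) = max(0.0, min(a[2], b[2]) - max(a[1], b[1]))

# Each label hangs either to the left (side = -1) or right (side = +1) of
# its anchor point; return the interval it occupies.
label_interval(x, side, width) = side > 0 ? (x, x + width) : (x - width, x)

# Energy = total pairwise overlap between labels.
function energy(xs, sides, width)
    total = 0.0
    for i in 1:length(xs), j in i+1:length(xs)
        total += overlap(label_interval(xs[i], sides[i], width),
                         label_interval(xs[j], sides[j], width))
    end
    return total
end

function anneal_labels(xs, width; steps=2000, t0=1.0, rng=MersenneTwister(1))
    sides = fill(1, length(xs))             # start with every label on the right
    e = energy(xs, sides, width)
    best_sides, best_e = copy(sides), e
    for step in 1:steps
        temperature = t0 * (1 - (step - 1) / steps)   # linear cooling, stays > 0
        i = rand(rng, 1:length(xs))
        sides[i] = -sides[i]                # propose flipping one label
        e_new = energy(xs, sides, width)
        if e_new <= e || rand(rng) < exp(-(e_new - e) / temperature)
            e = e_new                       # accept the move
            if e < best_e
                best_sides, best_e = copy(sides), e
            end
        else
            sides[i] = -sides[i]            # reject: undo the flip
        end
    end
    return best_sides, best_e
end

xs = [0.0, 1.0, 5.0]
initial_e = energy(xs, fill(1, length(xs)), 2.0)
sides, final_e = anneal_labels(xs, 2.0)
println("initial overlap: ", initial_e, ", best overlap found: ", final_e)
```

Accepting some worsening moves while the temperature is high is what lets the search escape poor local arrangements; a real label placer would also handle two dimensions, more candidate positions, and hiding labels.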