v1 = collect(1:6);
size(v1) # size always returns a tuple
(6,)
length(v1)
6
m1 = reshape(collect(1:6),(3,2))
3x2 Array{Int64,2}: 1 4 2 5 3 6
Because size
returns a tuple the number of rows and columns can be assigned in one statement. A common idiom is
m,n = size(m1)
(3,2)
println("Number of rows = $m and number of columns = $n")
WARNING: readbytes is deprecated, use read instead.
Number of rows = 3 and number of columns = 2
As in R
, matrices and higher-dimensional arrays are stored in column-major ordering. Transpose is indicated by '
. The *
operator applied to arrays denotes matrix multiplication. Element-wise multiplication is indicated by .*
. In general a prefix .
to an operator makes it behave element-wise. Constructors of arrays with specific content include ones()
, zeros()
and eye()
(the identity matrix).
m1 * ones(2)
3-element Array{Float64,1}: 5.0 7.0 9.0
in depwarn
m1'*m1
(
2x2 Array{Int64,2}: 14 32 32 77
::
eye(3)
3x3 Array{Float64,2}: 1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.0
ASCIIString,
eye(3,4)
::
3x4 Array{Float64,2}: 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0
Symbol
ones(Int,(2,3))
2x3 Array{Int64,2}: 1 1 1 1 1 1
The equivalent of R
's cbind
and rbind
functions are hcat
and vcat
.
m2 = hcat(ones(3),collect(-1.:1.))
3x2 Array{Float64,2}: 1.0 -1.0 1.0 0.0 1.0 1.0
) at ./deprecated.jl:
vcat
is used for concatenating vectors.
vcat([1:3;],ones(2))
64
5-element Array{Float64,1}: 1.0 2.0 3.0 1.0 1.0
In some circumstances you can omit the *
in a multiplication.
m22 = m2'm2
2x2 Array{Float64,2}: 3.0 0.0 0.0 2.0
in readbytes
Unlike in R
, the Julia
expression X'X
is evaluated using only one copy of X
and will always produce a symmetric matrix. In R
the function crossprod
must be called to do this. (Be careful, in Julia
there is an operation called a cross product but it is the cross product from Physics.)
?×
(::
cross(x, y)
×(x,y)
Compute the cross product of two 3-vectors.
Base.PipeEndpoint, ::Vararg
The backslash operator, \
, denotes the solution of a linear system of equations of the form $Ax=b$.
m22\(m2'ones(3))
{
2-element Array{Float64,1}: 1.0 0.0
Any
With a rectangular left-hand side \
denotes a least-squares solution
m2\ones(3)
2-element Array{Float64,1}: 1.0 -0.0
Note that this solution is a bit different from the previous solution because of numerical imprecision in the computation.
Those who haven't studied numerical linear algebra assume that a linear system of equations is solved as $A^{-1}b$. In practice it is almost never necessary to evaluate the inverse of a matrix. Instead the matrix is factored into a product of other matrices that have desirable properties such as being triangular.
Such factorizations include the singular value decomposition (SVD), the QR decomposition (with or without pivoting), the Cholesky factorization (with or without pivoting) for a positive-semidefinite symmetric matrix, the LU factorization and the Eigen decomposition. The names of the constructors of the decomposition usually end in fact
.
m3 = hcat(ones(3),[1:3;])
3x2 Array{Float64,2}: 1.0 1.0 1.0 2.0 1.0 3.0
}) at ./deprecated.jl
m33 = m3'm3
:
2x2 Array{Float64,2}: 3.0 6.0 6.0 14.0
30
cf33 = cholfact(m33)
Base.LinAlg.Cholesky{Float64,Array{Float64,2}} with factor: 2x2 UpperTriangular{Float64,Array{Float64,2}}: 1.73205 3.4641 ⋅ 1.41421
cf33[:L] # extract the lower Cholesky factor
in send_stream
2x2 LowerTriangular{Float64,Array{Float64,2}}: 1.73205 ⋅ 3.4641 1.41421
(
cf33[:U] # extract the upper triangular Cholesky factor, cf33[:L]'
::
2x2 UpperTriangular{Float64,Array{Float64,2}}: 1.73205 3.4641 ⋅ 1.41421
full(cf33) # the matrix (up to round-off) represented by the factorization
Base.
2x2 Array{Float64,2}: 3.0 6.0 6.0 14.0
PipeEndpoint
det(cf33) # determinant of full(cf33), calculated from det(cf33[:U])
,
5.999999999999983
logdet(cf33) # more practical for positive semidefinite
1.7917594692280523
::ASCIIString)
Although one rarely needs the inverse of a general matrix, it is sometimes useful to get the inverse of the matrix represented by the Cholesky factorization. This is feasible because the factor is triangular and determining the inverse of a triangular matrix is easier than doing so for a general matrix. Suppose we want the estimated covariance matrix of the coefficient estimates.
y = rand(3)
at
3-element Array{Float64,1}: 0.268839 0.129757 0.348011
/home/bates/.julia/v0.5/IJulia/src/stdio.jl
β = cf33\(m3'y)
2-element Array{Float64,1}: 0.169697 0.0395858
ŷ = m3*β # note you have to construct ŷ as y\hat<tab> not \hat<tab>y
3-element Array{Float64,1}: 0.209283 0.248869 0.288455
resid = y - ŷ
3-element Array{Float64,1}: 0.0595559 -0.119112 0.0595559
s² = sumabs2(resid)/(3-2)
0.021281438491229213
vcv = s²*inv(cf33)
2x2 Array{Float64,2}: 0.0496567 -0.0212814 -0.0212814 0.0106407
:25 in watch_stream(::Base.PipeEndpoint
The pivoted Cholesky decomposition has better numerical properties than does the unpivoted version and is a rank-revealing decomposition.
cf33 = cholfact(m33, :U, Val{true})
,
Base.LinAlg.CholeskyPivoted{Float64,Union{DenseArray{Float64,2},SubArray{Float64,2,A<:DenseArray{T,N},I<:Tuple{Vararg{Union{Colon,Int64,Range{Int64}}}},LD}}}(2x2 Array{Float64,2}: 3.74166 1.60357 6.0 0.654654,'U',Int32[2,1],2,0.0,0)
::
At present this factorization doesn't have an attractive show
method.
cf33[:U]
2x2 UpperTriangular{Float64,Array{Float64,2}}: 3.74166 1.60357 ⋅ 0.654654
ASCIIString)
cf33[:p] # the pivot vector
at
2-element Array{Int32,1}: 2 1
/home/bates/.julia/v0.5/IJulia/src/stdio.jl
cf33[:P] # the pivot matrix
2x2 Array{Float64,2}: 0.0 1.0 1.0 0.0
rank(cf33)
2
The QR decomposition is a direct (i.e. non-iterative) decomposition used, among other things, to solve least squares problems. It decomposes a rectangular X
into an orthogonal matrix Q
and an upper triangular matrix R
. The matrix Q
, which could be very large if the number of rows of X
is large, is stored implicitly.
qr3 = qrfact(m3)
:41 in (::
Base.LinAlg.QRCompactWY{Float64,Array{Float64,2}}(3x2 Array{Float64,2}: -1.73205 -3.4641 0.366025 -1.41421 0.366025 0.767327,2x2 Array{Float64,2}: 1.57735 -1.28446 6.93333e-310 1.25882)
IJulia
qr3[:R]
.
2x2 Array{Float64,2}: -1.73205 -3.4641 0.0 -1.41421
qr3[:Q] # the implicit representation
3x3 Base.LinAlg.QRCompactWYQ{Float64,Array{Float64,2}}: -0.57735 0.707107 0.408248 -0.57735 1.11022e-16 -0.816497 -0.57735 -0.707107 0.408248
The QR factorization can be used directly for least squares solutions.
β = qr3\y
##4#8)()
2-element Array{Float64,1}: 0.169697 0.0395858
at
What are called the effects
in an R
lm
object are Q'y
qr3[:Q]'y
./task.jl
3-element Array{Float64,1}: -0.431054 -0.0559827 0.145882
There is also a pivoted QR decomposition which is rank-revealing.
qr3 = qrfact(m3, Val{true})
Base.LinAlg.QRPivoted{Float64,Array{Float64,2}}(3x2 Array{Float64,2}: -3.74166 -1.60357 0.421793 0.654654 0.63269 0.859768,[1.26726,1.14995],Int32[2,1])
The singular value decomposition represents X
as U*S*V'
where U
and V
are orthogonal and S
is diagonal. It is related to the eigen decomposition in that the eigen decomposition of X'X
is V*S*S*V'
. The eigenvalues of X'X
are the squares of the singular values of X
.
sv3 = svdfact(m3)
:431 while loading In[5], in expression starting on line 1
Base.LinAlg.SVD{Float64,Float64,Array{Float64,2}}(3x2 Array{Float64,2}: -0.32311 0.853776 -0.547507 0.18322 -0.771904 -0.487337,[4.07914,0.600491],2x2 Array{Float64,2}: -0.402663 -0.915348 0.915348 -0.402663)
search: ×
sv3[:U]
3x2 Array{Float64,2}: -0.32311 0.853776 -0.547507 0.18322 -0.771904 -0.487337
sv3[:V]
2x2 Array{Float64,2}: -0.402663 0.915348 -0.915348 -0.402663
sv3[:S] # vector of singular values
2-element Array{Float64,1}: 4.07914 0.600491
sv3[:Vt]
2x2 Array{Float64,2}: -0.402663 -0.915348 0.915348 -0.402663
This is the thin SVD. If you want the full matrix U
use
sv3 = svdfact(m3; thin=false)
Base.LinAlg.SVD{Float64,Float64,Array{Float64,2}}(3x3 Array{Float64,2}: -0.32311 0.853776 0.408248 -0.547507 0.18322 -0.816497 -0.771904 -0.487337 0.408248,[4.07914,0.600491],2x2 Array{Float64,2}: -0.402663 -0.915348 0.915348 -0.402663)
sv3[:U]
3x3 Array{Float64,2}: -0.32311 0.853776 0.408248 -0.547507 0.18322 -0.816497 -0.771904 -0.487337 0.408248
The eigfact
function has a special method for real symmetric matrices
ef3 = eigfact(m33)
Base.LinAlg.Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}([0.36059,16.6394],2x2 Array{Float64,2}: -0.915348 0.402663 0.402663 0.915348)
ef3[:values]
2-element Array{Float64,1}: 0.36059 16.6394
abs2(sv3[:S]) # same as the eigenvalues but in the opposite order
2-element Array{Float64,1}: 16.6394 0.36059
ef3[:vectors]
2x2 Array{Float64,2}: -0.915348 0.402663 0.402663 0.915348
sv3[:V] # same as the eigenvectors of m33 with permuted rows and columns
2x2 Array{Float64,2}: -0.402663 0.915348 -0.915348 -0.402663
For general matrices we must allow for complex eigenvalues and eigenvectors. If all the eigenvalues and eigenvectors are real they are returned as such
eigfact(reshape([1:9;],(3,3)))
Base.LinAlg.Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}([16.1168,-1.11684,-5.70069e-16],3x3 Array{Float64,2}: -0.464547 -0.882906 0.408248 -0.570796 -0.23952 -0.816497 -0.677044 0.403865 0.408248)
eigfact([0 1;-1 0])
Base.LinAlg.Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}(Complex{Float64}[0.0+1.0im,0.0-1.0im],2x2 Array{Complex{Float64},2}: 0.707107+0.0im 0.707107-0.0im 0.0+0.707107im 0.0-0.707107im)
The LU factorization is the most general direct factorization. It is related to Gaussian elimination.
lu4 = lufact(reshape([1:9;],(3,3)))
Base.LinAlg.LU{Float64,Array{Float64,2}}(3x3 Array{Float64,2}: 3.0 6.0 9.0 0.333333 2.0 4.0 0.666667 0.5 0.0,Int32[3,3,3],3)
lu4[:L] # unit lower triangular factor
3x3 Array{Float64,2}: 1.0 0.0 0.0 0.333333 1.0 0.0 0.666667 0.5 1.0
lu4[:U] # upper triangular factor. Note that it is singular.
3x3 Array{Float64,2}: 3.0 6.0 9.0 0.0 2.0 4.0 0.0 0.0 0.0
lu4[:p] # permutation vector
3-element Array{Int32,1}: 3 1 2
lu4[:P] # permutation matrix
3x3 Array{Float64,2}: 0.0 0.0 1.0 1.0 0.0 0.0 0.0 1.0 0.0
As in R
elements or subarrays are extracted with using indices. To indicate all values of a given index use :
m2[:,2]
3-element Array{Float64,1}: -1.0 0.0 1.0
There are two alternatives to indexing, the use of sub
or slice
.
sub(m2,:,2)
3-element SubArray{Float64,1,Array{Float64,2},Tuple{Colon,Int64},2}: -1.0 0.0 1.0
slice(m2, :, 2)
3-element SubArray{Float64,1,Array{Float64,2},Tuple{Colon,Int64},2}: -1.0 0.0 1.0
sub(m2, 2, :)
1x2 SubArray{Float64,2,Array{Float64,2},Tuple{UnitRange{Int64},Colon},2}: 1.0 0.0
slice(m2, 2, :)
2-element SubArray{Float64,1,Array{Float64,2},Tuple{Int64,Colon},2}: 1.0 0.0
The difference between sub
and slice
is that slice
always drops singleton dimensions whereas sub
only drops trailing singleton dimensions.
Both sub
and slice
return views into the original array. That is, they do not copy. At present using indices produces a copy but that will change.
pointer(m2)
Ptr{Float64} @0x00007fa1a0221000
pointer(m2[:,1])
Ptr{Float64} @0x00007fa1a26e9360
pointer(sub(m2, :, 1))
Ptr{Float64} @0x00007fa1a0221000
pointer(slice(m2, 1, :))
Ptr{Float64} @0x00007fa1a0221000
To convert a matrix or higher-order array to a vector use vec
vec(m2)
6-element Array{Float64,1}: 1.0 1.0 1.0 -1.0 0.0 1.0
The result of an indexing operation is a copy of the elements from the array.
v2 = m2[:,2]
v2[2] = 999.
999.0
m2
3x2 Array{Float64,2}: 1.0 -1.0 1.0 0.0 1.0 1.0
v2
3-element Array{Float64,1}: -1.0 999.0 1.0
The sub
function produces an array view which accesses the actual elements in the original array.
v2 = sub(m2,:,2)
3-element SubArray{Float64,1,Array{Float64,2},Tuple{Colon,Int64},2}: -1.0 0.0 1.0
v2[2] = 1.
1.0
m2
3x2 Array{Float64,2}: 1.0 -1.0 1.0 1.0 1.0 1.0
The sub
function is preferred over explicit indexing if you need a subarray that you are not going to modify (sub
does not create unnecessary copies) and for when you really do want to change the elements of a subarray.
Most linear algebra in Julia
is performed on Matrix{Float64}
objects using the LAPACK and BLAS software. By default Julia
is compiled against OpenBLAS, an accelerated multi-threaded BLAS that also provides accelerated versions of some of the LAPACK functions. For example the LU decomposition and solutions of linear systems using LU are accelerated because they constitute the LAPACK benchmark. Julia
can also be compiled with Intel's MKL (Math Kernel Library) BLAS, when they are available, to give accelerated performance. Understandably these are tuned for best performance on Intel processors.
Julia provides thin wrappers around many LAPACK and BLAS routines if you want maximal control. These mimic the BLAS and LAPACK routines without the annoyingly redundant arguments. Note that many internal linear algebra functions provide mutating versions.
whos(BLAS)
BLAS 2515 KB Module asum 0 bytes Base.LinAlg.BLAS.#asum blascopy! 0 bytes Base.LinAlg.BLAS.#blascopy! dotc 0 bytes Base.LinAlg.BLAS.#dotc dotu 0 bytes Base.LinAlg.BLAS.#dotu gbmv 0 bytes Base.LinAlg.BLAS.#gbmv gbmv! 0 bytes Base.LinAlg.BLAS.#gbmv! gemm 0 bytes Base.LinAlg.BLAS.#gemm gemm! 0 bytes Base.LinAlg.BLAS.#gemm! gemv 0 bytes Base.LinAlg.BLAS.#gemv gemv! 0 bytes Base.LinAlg.BLAS.#gemv! ger! 0 bytes Base.LinAlg.BLAS.#ger! hemm 0 bytes Base.LinAlg.BLAS.#hemm hemm! 0 bytes Base.LinAlg.BLAS.#hemm! hemv 0 bytes Base.LinAlg.BLAS.#hemv hemv! 0 bytes Base.LinAlg.BLAS.#hemv! her! 0 bytes Base.LinAlg.BLAS.#her! her2k 0 bytes Base.LinAlg.BLAS.#her2k her2k! 0 bytes Base.LinAlg.BLAS.#her2k! herk 0 bytes Base.LinAlg.BLAS.#herk herk! 0 bytes Base.LinAlg.BLAS.#herk! iamax 0 bytes Base.LinAlg.BLAS.#iamax nrm2 0 bytes Base.LinAlg.BLAS.#nrm2 sbmv 0 bytes Base.LinAlg.BLAS.#sbmv sbmv! 0 bytes Base.LinAlg.BLAS.#sbmv! scal 0 bytes Base.LinAlg.BLAS.#scal scal! 0 bytes Base.LinAlg.BLAS.#scal! symm 0 bytes Base.LinAlg.BLAS.#symm symm! 0 bytes Base.LinAlg.BLAS.#symm! symv 0 bytes Base.LinAlg.BLAS.#symv symv! 0 bytes Base.LinAlg.BLAS.#symv! syr! 0 bytes Base.LinAlg.BLAS.#syr! syr2k 0 bytes Base.LinAlg.BLAS.#syr2k syr2k! 0 bytes Base.LinAlg.BLAS.#syr2k! syrk 0 bytes Base.LinAlg.BLAS.#syrk syrk! 0 bytes Base.LinAlg.BLAS.#syrk! trmm 0 bytes Base.LinAlg.BLAS.#trmm trmm! 0 bytes Base.LinAlg.BLAS.#trmm! trmv 0 bytes Base.LinAlg.BLAS.#trmv trmv! 0 bytes Base.LinAlg.BLAS.#trmv! trsm 0 bytes Base.LinAlg.BLAS.#trsm trsm! 0 bytes Base.LinAlg.BLAS.#trsm! trsv 0 bytes Base.LinAlg.BLAS.#trsv trsv! 0 bytes Base.LinAlg.BLAS.#trsv!
Operations like A'B
and A\b
call functions with names like Ac_mul_B
and A_ldiv_B
. There are mutating versions of these functions. The Ac
denotes A'
which, in general, is the conjugate transpose of A
. For complex A
there is a distinction between A'
, the conjugate transpose, and A.'
, the transpose. Usually you want A'
. For real matrices A'
and A.'
are the same.
Notice that the type of cf33[:U]
from the first Cholesky example is
typeof(cholfact(m33)[:U])
UpperTriangular{Float64,Array{Float64,2}}
Many operations can be steamlined if it is known that one of the operands is a triangular matrix or, even more specific, a unit triangular matrices. There are also special types for Symmetric
and Diagonal
matrices. Ongoing work will incorporate these matrix types more deeply into the Julia
system.
When fitting a linear mixed model, the core computation can be expressed as solving a penalized least squares problem. In the MixedModels
package there are several different PLSSolver
types defined to facilitate this computation. When there is only a single random-effects term the system to be solved is block diagonal.
The PLSOne
type is defined as
type PLSOne <: PLSSolver # Solver for models with a single random-effects term
Ad::Array{Float64,3} # diagonal blocks
Ab::Array{Float64,3} # base blocks
At::Symmetric{Float64} # lower right block
Ld::Array{Float64,3} # diagonal blocks
Lb::Array{Float64,3} # base blocks
Lt::Base.LinAlg.Cholesky{Float64} # lower right triangle
Zt::SparseMatrixCSC
end
and is updated from a triangular matrix λ as
## update!(s,lambda)->s : update Ld, Lb and Lt
function update!(s::PLSOne,λ::Triangular)
m,n,l = size(s)
n == size(λ,1) || error("Dimension mismatch")
Lt = copy!(s.Lt.UL,s.At.S)
Lb = copy!(s.Lb,s.Ab)
if n == 1 # shortcut for 1×1 λ
lam = λ[1,1]
Ld = map!(x -> sqrt(x*lam*lam + 1.), s.Ld, s.Ad)
Lb = scale!(reshape(Lb,(m,n*l)),lam ./ vec(Ld))
BLAS.syrk!('L','N',-1.0,Lb,1.0,Lt)
else
Ac_mul_B!(λ,reshape(copy!(s.Ld,s.Ad),(n,n*l)))
for k in 1:l
wL = A_mul_B!(sub(s.Ld,:,:,k),λ)
for j in 1:n # Inflate the diagonal
wL[j,j] += 1.
end
_, info = LAPACK.potrf!('L',wL) # i'th diagonal block of L
info == 0 || error("Cholesky failure at L diagonal block $k")
Base.LinAlg.A_rdiv_Bc!(A_mul_B!(sub(s.Lb,:,:,k),λ),Triangular(wL,:L,false))
end
BLAS.syrk!('L','N',-1.0,reshape(Lb,(m,n*l)),1.,Lt)
end
_, info = LAPACK.potrf!('L',Lt)
info == 0 || error("downdated X'X is not positive definite")
s
end
Sparse matrices and sparse vectors are available in the base Julia
system but their description will need to wait for another time.
?sparse
search:
sparse(A)
Convert an AbstractMatrix A
into a sparse matrix.
sparse(I,J,V,[m,n,combine])
Create a sparse matrix S
of dimensions m x n
such that S[I[k], J[k]] = V[k]
. The combine
function is used to combine duplicates. If m
and n
are not specified, they are set to maximum(I)
and maximum(J)
respectively. If the combine
function is not supplied, combine
defaults to +
unless the elements of V
are Booleans in which case combine
defaults to |
. All elements of I
must satisfy 1 <= I[k] <= m
, and all elements of J
must satisfy 1 <= J[k] <= n
.
sparse sparsevec SparseVector SparseArrays SparseMatrixCSC issparse