Multiple Regression

This example shows how multiple regression can be performed using statsample and daru.

The lr() shorthand calls Statsample::Regression.multiple. Internally, statsample implements multiple regression with either pure-Ruby methods or GSL methods, so it runs even when GSL is absent. The Ruby implementations are much slower than their GSL counterparts, however, so it is recommended that you install the rb-gsl or gsl-nmatrix gem before proceeding (both work only on MRI).

rb-gsl can be installed directly from RubyGems with gem install rb-gsl. For gsl-nmatrix installation instructions, see this blog post.
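To make concrete what a multiple regression engine has to compute, here is a minimal sketch of ordinary least squares using only Ruby's standard matrix library. It is an illustration, not statsample's actual RubyEngine: the helper name ols_coefficients and the toy data are assumptions made for this example, and the normal-equations approach shown is only one of several ways to solve the problem.

```ruby
require 'matrix'

# Hypothetical helper (not part of statsample): ordinary least squares via
# the normal equations, beta = (X'X)^-1 X'y.
# rows: one array of predictor values per case; y: array of responses.
# Returns [intercept, coefficient_1, ..., coefficient_k].
def ols_coefficients(rows, y)
  x  = Matrix.rows(rows.map { |r| [1.0, *r] }) # prepend the intercept column
  xt = x.transpose
  ((xt * x).inverse * xt * Vector.elements(y)).to_a
end

# Noise-free toy data generated from y = 1 + 2*a + 3*b
rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y    = rows.map { |a, b| 1.0 + 2.0 * a + 3.0 * b }

coefficients = ols_coefficients(rows, y)
puts coefficients.map { |c| c.round(6) }.inspect
```

Because the toy data contain no noise, the recovered coefficients should match the generating values 1, 2 and 3 up to floating-point error. Explicitly inverting X'X is fine for a small sketch, but numerically sturdier decompositions are generally preferred for real data.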

In [8]:
require 'statsample'

Statsample::Analysis.store(Statsample::Regression::Multiple) do
  # Defer daru's internal bookkeeping until #update is called explicitly.
  Daru.lazy_update = true

  samples = 2000
  # Four independent predictors drawn from a normal distribution.
  ds = Daru::DataFrame.new({
    :a  => rnorm(samples),
    :b  => rnorm(samples),
    :cc => rnorm(samples),
    :d  => rnorm(samples)}, clone: false)
  attach(ds)
  # Response: a known linear combination of the predictors plus noise.
  ds[:y] = a*5 + b*3 + cc*2 + d + rnorm(samples)

  # REMEMBER: It is _mandatory_ to call #update after assignment cycles if you
  # want your operations to be performed as expected.
  ds.update
  summary lr(ds, :y)

  Daru.lazy_update = false
end
Statsample::Analysis.run_batch

Analysis 2015-06-04 00:17:46 +0530
= Statsample::Regression::Multiple
== Multiple reggresion of a,b,cc,d on y
Engine: Statsample::Regression::Multiple::RubyEngine
Cases(listwise)=2000(2000)
R=0.987
R^2=0.975
Std.Error R=1.028
Equation=0.015 + 5.021a + 3.005b + 2.049cc + 1.024d
=== ANOVA
ANOVA Table
+------------+-----------+------+-----------+-----------+-------+
|   source   |    ss     |  df  |    ms     |     f     |   p   |
+------------+-----------+------+-----------+-----------+-------+
| Regression | 81106.039 | 4    | 20276.510 | 19197.346 | 0.000 |
| Error      | 2107.147  | 1995 | 1.056     |           |       |
| Total      | 83213.187 | 1999 | 20277.566 |           |       |
+------------+-----------+------+-----------+-----------+-------+

Beta coefficients
+----------+-------+-------+-------+---------+
|  coeff   |   b   | beta  |  se   |    t    |
+----------+-------+-------+-------+---------+
| Constant | 0.015 | -     | 0.023 | 0.668   |
| a        | 5.021 | 0.779 | 0.023 | 218.235 |
| b        | 3.005 | 0.457 | 0.023 | 128.181 |
| cc       | 2.049 | 0.318 | 0.023 | 89.209  |
| d        | 1.024 | 0.156 | 0.023 | 43.778  |
+----------+-------+-------+-------+---------+
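The summary statistics in the report can be cross-checked against the ANOVA table with a few lines of arithmetic. The sketch below copies the sums of squares and degrees of freedom from the output above and recomputes R^2, R and F from their usual definitions. Treating "Std.Error R" as the square root of the error mean square is an assumption; it agrees numerically with the report but is not confirmed here against statsample's source.

```ruby
# Values copied from the ANOVA table in the report above
ss_regression = 81106.039
ss_error      = 2107.147
ss_total      = 83213.187
df_regression = 4
df_error      = 1995

r_squared     = ss_regression / ss_total   # share of variance explained
r             = Math.sqrt(r_squared)       # multiple correlation coefficient
ms_regression = ss_regression / df_regression
ms_error      = ss_error / df_error
f_statistic   = ms_regression / ms_error
residual_se   = Math.sqrt(ms_error)        # assumed meaning of "Std.Error R"

puts format('R^2 = %.3f, R = %.3f', r_squared, r)
puts format('F = %.1f, residual SE = %.3f', f_statistic, residual_se)
```

The recomputed F differs from the table only in the last decimals, because the inputs here are already rounded to three places.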