In this chapter, we consider a situation in which we have a metric-predicted variable that is observed for items from one or two groups.
The normal distribution specifies the probability density of a value y, given the values of two parameters, the mean μ and standard deviation σ :
The question is, given the data, how should we allocate credibility to combinations of μ and σ?
Our goal now is to evaluate Equation 16.2 for reasonable choices of the prior distribution, p(μ, σ ).
We have assumed that the data are generated by a normal likelihood function, parameterized by a mean μ and standard deviation σ , and denoted y ∼ normal(y|μ, σ ). For purposes of mathematical derivation, we made unrealistic assumptions that the prior distribution is either a spike on σ or a spike on μ, in order to make three main points:
It is easy to estimate the mean and standard deviation in JAGS.
on σ . The low and high values of the uniform distribution are set to be far outside any realistic value for the data, so that the prior has minimal influence on the posterior.
dataList = list(
y = y ,
Ntotal = length(y) ,
meanY = mean(y) ,
sdY = sd(y)
)
model {
for ( i in 1:Ntotal ) {
y[i] ̃ dnorm( mu , 1/sigmaˆ2 ) # JAGS uses precision
}
mu ̃ dnorm( meanY , 1/(100*sdY)ˆ2 ) # JAGS uses precision
sigma ̃ dunif( sdY/1000 , sdY*1000 )
}
# 책에는 없는 코드. 실습을 위해 data 폴더에 넣은 코드를 실행시키기 위한 작업
cur_dir = getwd()
setwd(sprintf("%s/%s", cur_dir, 'data'))
# Example for Jags-Ymet-Xnom1grp-Mnormal.R
#-------------------------------------------------------------------------------
# Optional generic preliminaries:
#graphics.off() # This closes all of R's graphics windows.
#rm(list=ls()) # Careful! This clears all of R's memory!
#-------------------------------------------------------------------------------
# Load The data file
myDataFrame = read.csv( file="TwoGroupIQ.csv" )
myDataFrame
Score | Group | |
---|---|---|
1 | 102 | Smart Drug |
2 | 107 | Smart Drug |
3 | 92 | Smart Drug |
4 | 101 | Smart Drug |
5 | 110 | Smart Drug |
6 | 68 | Smart Drug |
7 | 119 | Smart Drug |
8 | 106 | Smart Drug |
9 | 99 | Smart Drug |
10 | 103 | Smart Drug |
11 | 90 | Smart Drug |
12 | 93 | Smart Drug |
13 | 79 | Smart Drug |
14 | 89 | Smart Drug |
15 | 137 | Smart Drug |
16 | 119 | Smart Drug |
17 | 126 | Smart Drug |
18 | 110 | Smart Drug |
19 | 71 | Smart Drug |
20 | 114 | Smart Drug |
21 | 100 | Smart Drug |
22 | 95 | Smart Drug |
23 | 91 | Smart Drug |
24 | 99 | Smart Drug |
25 | 97 | Smart Drug |
26 | 106 | Smart Drug |
27 | 106 | Smart Drug |
28 | 129 | Smart Drug |
29 | 115 | Smart Drug |
30 | 124 | Smart Drug |
31 | 137 | Smart Drug |
32 | 73 | Smart Drug |
33 | 69 | Smart Drug |
34 | 95 | Smart Drug |
35 | 102 | Smart Drug |
36 | 116 | Smart Drug |
37 | 111 | Smart Drug |
38 | 134 | Smart Drug |
39 | 102 | Smart Drug |
40 | 110 | Smart Drug |
41 | 139 | Smart Drug |
42 | 112 | Smart Drug |
43 | 122 | Smart Drug |
44 | 84 | Smart Drug |
45 | 129 | Smart Drug |
46 | 112 | Smart Drug |
47 | 127 | Smart Drug |
48 | 106 | Smart Drug |
49 | 113 | Smart Drug |
50 | 109 | Smart Drug |
51 | 208 | Smart Drug |
52 | 114 | Smart Drug |
53 | 107 | Smart Drug |
54 | 50 | Smart Drug |
55 | 169 | Smart Drug |
56 | 133 | Smart Drug |
57 | 50 | Smart Drug |
58 | 97 | Smart Drug |
59 | 139 | Smart Drug |
60 | 72 | Smart Drug |
61 | 100 | Smart Drug |
62 | 144 | Smart Drug |
63 | 112 | Smart Drug |
64 | 109 | Placebo |
65 | 98 | Placebo |
66 | 106 | Placebo |
67 | 101 | Placebo |
68 | 100 | Placebo |
69 | 111 | Placebo |
70 | 117 | Placebo |
71 | 104 | Placebo |
72 | 106 | Placebo |
73 | 89 | Placebo |
74 | 84 | Placebo |
75 | 88 | Placebo |
76 | 94 | Placebo |
77 | 78 | Placebo |
78 | 108 | Placebo |
79 | 102 | Placebo |
80 | 95 | Placebo |
81 | 99 | Placebo |
82 | 90 | Placebo |
83 | 116 | Placebo |
84 | 97 | Placebo |
85 | 107 | Placebo |
86 | 102 | Placebo |
87 | 91 | Placebo |
88 | 94 | Placebo |
89 | 95 | Placebo |
90 | 86 | Placebo |
91 | 108 | Placebo |
92 | 115 | Placebo |
93 | 108 | Placebo |
94 | 88 | Placebo |
95 | 102 | Placebo |
96 | 102 | Placebo |
97 | 120 | Placebo |
98 | 112 | Placebo |
99 | 100 | Placebo |
100 | 105 | Placebo |
101 | 105 | Placebo |
102 | 88 | Placebo |
103 | 82 | Placebo |
104 | 111 | Placebo |
105 | 96 | Placebo |
106 | 92 | Placebo |
107 | 109 | Placebo |
108 | 91 | Placebo |
109 | 92 | Placebo |
110 | 123 | Placebo |
111 | 61 | Placebo |
112 | 59 | Placebo |
113 | 105 | Placebo |
114 | 184 | Placebo |
115 | 82 | Placebo |
116 | 138 | Placebo |
117 | 99 | Placebo |
118 | 93 | Placebo |
119 | 93 | Placebo |
120 | 72 | Placebo |
summary(myDataFrame)
Score Group Min. : 50.00 Placebo :57 1st Qu.: 92.75 Smart Drug:63 Median :102.00 Mean :104.13 3rd Qu.:112.00 Max. :208.00
# For purposes of this one-group example, use data from Smart Drug group:
myData = myDataFrame$Score[myDataFrame$Group=="Smart Drug"]
#-------------------------------------------------------------------------------
# Optional: Specify filename root and graphical format for saving output.
# Otherwise specify as NULL or leave saveName and saveType arguments
# out of function calls.
fileNameRoot = "OneGroupIQnormal-"
graphFileType = "png" #"eps"
#-------------------------------------------------------------------------------
# Load the relevant model into R's working memory:
source("Jags-Ymet-Xnom1grp-Mnormal.R")
********************************************************************* Kruschke, J. K. (2015). Doing Bayesian Data Analysis, Second Edition: A Tutorial with R, JAGS, and Stan. Academic Press / Elsevier. *********************************************************************
Loading required package: coda
#-------------------------------------------------------------------------------
# Generate the MCMC chain:
mcmcCoda = genMCMC( data=myData , numSavedSteps=20000 , saveName=fileNameRoot )
Loading required package: rjags Linked to JAGS 3.4.0 Loaded modules: basemod,bugs
Compiling model graph Resolving undeclared variables Allocating nodes Graph Size: 79 Initializing model Burning in the MCMC chain... Sampling final MCMC chain...
#-------------------------------------------------------------------------------
# Display diagnostics of chain, for specified parameters:
parameterNames = varnames(mcmcCoda) # get all parameter names
for ( parName in parameterNames ) {
diagMCMC( codaObject=mcmcCoda , parName=parName ,
saveName=fileNameRoot , saveType=graphFileType )
}
# 창으로 뜨는 것들을 없앤다.
graphics.off()
#-------------------------------------------------------------------------------
# Get summary statistics of chain:
summaryInfo = smryMCMC( mcmcCoda ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
saveName=fileNameRoot )
show(summaryInfo)
Mean Median Mode ESS HDImass HDIlow mu 107.8278271 107.8114667 107.0894331 20097.3 0.95 101.43788497 sigma 25.9588549 25.7791320 25.0124458 11475.4 0.95 21.46449101 effSz 0.3039062 0.3034737 0.2807922 20000.0 0.95 0.04810492 HDIhigh CompVal PcntGtCompVal ROPElow ROPEhigh PcntLtROPE PcntInROPE mu 114.4334683 100 98.95 99.0 101.0 0.41 1.625 sigma 30.6368277 15 100.00 14.0 16.0 0.00 0.000 effSz 0.5540163 0 98.95 -0.1 0.1 0.07 5.650 PcntGtROPE mu 97.965 sigma 100.000 effSz 94.280
# Display posterior information:
plotMCMC( mcmcCoda , data=myData ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
pairsPlot=TRUE , showCurve=FALSE ,
saveName=fileNameRoot , saveType=graphFileType )
graphics.off()
beyond what would be accommodated by a normal distribution, it would be useful to be able to describe the data with a more appropriate distribution that has taller or heavier tails than the normal.
dataList = list(
y = y ,
Ntotal = length(y) ,
meanY = mean(y) ,
sdY = sd(y)
)
model {
for ( i in 1:Ntotal ) {
y[i] ̃ dnorm( mu , 1/sigmaˆ2 ) # JAGS uses precision
}
mu ̃ dnorm( meanY , 1/(100*sdY)ˆ2 ) # JAGS uses precision
sigma ̃ dunif( sdY/1000 , sdY*1000 )
}
model {
for ( i in 1:Ntotal ) {
y[i] ̃ dt( mu , 1/sigmaˆ2 , nu ) # JAGS: dt uses precision
}
mu ̃ dnorm( meanY , 1/(100*sdY)ˆ2 )
sigma ̃ dunif( sdY/1000 , sdY*1000 )
nu <- nuMinusOne+1 # nu must be >= 1
nuMinusOne ̃ dexp(1/29) # prior on nu-1
}
# 책에는 없는 코드. 실습을 위해 data 폴더에 넣은 코드를 실행시키기 위한 작업
cur_dir = getwd()
setwd(sprintf("%s/%s", cur_dir, 'data'))
#-------------------------------------------------------------------------------
# Optional generic preliminaries:
graphics.off() # This closes all of R's graphics windows.
rm(list=ls()) # Careful! This clears all of R's memory!
#-------------------------------------------------------------------------------
# Load The data file
myDataFrame = read.csv( file="TwoGroupIQ.csv" )
# For purposes of this one-group example, use data from Smart Drug group:
myData = myDataFrame$Score[myDataFrame$Group=="Smart Drug"]
#-------------------------------------------------------------------------------
# Optional: Specify filename root and graphical format for saving output.
# Otherwise specify as NULL or leave saveName and saveType arguments
# out of function calls.
fileNameRoot = "OneGroupIQrobust-Jags-"
graphFileType = "png" #"eps"
#-------------------------------------------------------------------------------
# Load the relevant model into R's working memory:
source("Jags-Ymet-Xnom1grp-Mrobust.R")
********************************************************************* Kruschke, J. K. (2015). Doing Bayesian Data Analysis, Second Edition: A Tutorial with R, JAGS, and Stan. Academic Press / Elsevier. *********************************************************************
Installing packages into ‘/home/moosung/R/x86_64-pc-linux-gnu-library/3.2’ (as ‘lib’ is unspecified)
Error in contrib.url(repos, type): trying to use CRAN without setting a mirror
#-------------------------------------------------------------------------------
# Generate the MCMC chain:
mcmcCoda = genMCMC( data=myData , numSavedSteps=20000 , saveName=fileNameRoot )
#-------------------------------------------------------------------------------
# Display diagnostics of chain, for specified parameters:
parameterNames = varnames(mcmcCoda) # get all parameter names
for ( parName in parameterNames ) {
diagMCMC( codaObject=mcmcCoda , parName=parName ,
saveName=fileNameRoot , saveType=graphFileType )
}
# 창으로 뜨는 것들을 없앤다.
graphics.off()
#-------------------------------------------------------------------------------
# Get summary statistics of chain:
summaryInfo = smryMCMC( mcmcCoda ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
saveName=fileNameRoot )
show(summaryInfo)
# Display posterior information:
plotMCMC( mcmcCoda , data=myData ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
pairsPlot=TRUE , showCurve=FALSE ,
saveName=fileNameRoot , saveType=graphFileType )
graphics.off()
data {
int<lower=1> Ntotal ;
real y[Ntotal] ;
real meanY ;
real sdY ;
}
transformed data { // compute the constants for the priors
real unifLo ;
real unifHi ;
real normalSigma ;
real expLambda ;
unifLo <- sdY/1000 ;
unifHi <- sdY*1000 ;
normalSigma <- sdY*100 ;
expLambda <- 1/29.0 ;
}
parameters {
real<lower=0> nuMinusOne ;
real mu ;
real<lower=0> sigma ;
}
transformed parameters {
real<lower=0> nu ;
nu <- nuMinusOne + 1 ;
}
model {
sigma ̃ uniform( unifLo , unifHi ) ;
mu ̃ normal( meanY , normalSigma ) ;
nuMinusOne ̃ exponential( expLambda ) ;
y ̃ student_t( nu , mu , sigma ) ; // vectorized
}
# 책에는 없는 코드. 실습을 위해 data 폴더에 넣은 코드를 실행시키기 위한 작업
cur_dir = getwd()
setwd(sprintf("%s/%s", cur_dir, 'data'))
#-------------------------------------------------------------------------------
# Optional generic preliminaries:
graphics.off() # This closes all of R's graphics windows.
rm(list=ls()) # Careful! This clears all of R's memory!
#-------------------------------------------------------------------------------
# Load The data file
myDataFrame = read.csv( file="TwoGroupIQ.csv" )
# For purposes of this one-group example, use data from Smart Drug group:
myData = myDataFrame$Score[myDataFrame$Group=="Smart Drug"]
#-------------------------------------------------------------------------------
# Load the relevant model into R's working memory:
source("Stan-Ymet-Xnom1grp-Mrobust.R")
#-------------------------------------------------------------------------------
# Optional: Specify filename root and graphical format for saving output.
# Otherwise specify as NULL or leave saveName and saveType arguments
# out of function calls.
fileNameRoot = "OneGroupIQrobust-Stan-"
graphFileType = "png" #"eps"
#-------------------------------------------------------------------------------
# Generate the MCMC chain:
mcmcCoda = genMCMC( data=myData , numSavedSteps=20000 , saveName=fileNameRoot )
#-------------------------------------------------------------------------------
# Display diagnostics of chain, for specified parameters:
parameterNames = varnames(mcmcCoda) # get all parameter names
for ( parName in parameterNames ) {
diagMCMC( codaObject=mcmcCoda , parName=parName ,
saveName=fileNameRoot , saveType=graphFileType )
}
# 창으로 뜨는 것들을 없앤다.
graphics.off()
#-------------------------------------------------------------------------------
# Get summary statistics of chain:
summaryInfo = smryMCMC( mcmcCoda ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
saveName=fileNameRoot )
show(summaryInfo)
# Display posterior information:
plotMCMC( mcmcCoda , data=myData ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
pairsPlot=TRUE , showCurve=FALSE ,
saveName=fileNameRoot , saveType=graphFileType )
graphics.off()
data {
int<lower=1> Ntotal ;
int x[Ntotal] ; // group identifier
real y[Ntotal] ;
real meanY ;
real sdY ;
}
transformed data {
real unifLo ;
real unifHi ;
real normalSigma ;
real expLambda ;
unifLo <- sdY/1000 ;
unifHi <- sdY*1000 ;
normalSigma <- sdY*100 ;
expLambda <- 1/29.0 ;
}
parameters {
real<lower=0> nuMinusOne ;
real mu[2] ; // 2 groups
real<lower=0> sigma[2] ; // 2 groups
}
transformed parameters {
real<lower=0> nu ;
nu <- nuMinusOne + 1 ;
}
model {
sigma ̃ uniform( unifLo , unifHi ) ; // vectorized 2 groups
mu ̃ normal( meanY , normalSigma ) ; // vectorized 2 groups
nuMinusOne ̃ exponential( expLambda ) ;
for ( i in 1:Ntotal ) {
y[i] ̃ student_t( nu , mu[x[i]] , sigma[x[i]] ) ;
}
}
#-------------------------------------------------------------------------------
# Optional generic preliminaries:
graphics.off() # This closes all of R's graphics windows.
rm(list=ls()) # Careful! This clears all of R's memory!
#-------------------------------------------------------------------------------
# Load The data file
myDataFrame = read.csv( file="TwoGroupIQ.csv" )
# For purposes of this one-group example, use data from Smart Drug group:
myData = myDataFrame$Score[myDataFrame$Group=="Smart Drug"]
#-------------------------------------------------------------------------------
# Load the relevant model into R's working memory:
source("Stan-Ymet-Xnom1grp-Mrobust.R")
#-------------------------------------------------------------------------------
# Optional: Specify filename root and graphical format for saving output.
# Otherwise specify as NULL or leave saveName and saveType arguments
# out of function calls.
fileNameRoot = "OneGroupIQrobust-Stan-"
graphFileType = "png" #"eps"
#-------------------------------------------------------------------------------
# Generate the MCMC chain:
mcmcCoda = genMCMC( data=myData , numSavedSteps=20000 , saveName=fileNameRoot )
#-------------------------------------------------------------------------------
# Display diagnostics of chain, for specified parameters:
parameterNames = varnames(mcmcCoda) # get all parameter names
for ( parName in parameterNames ) {
diagMCMC( codaObject=mcmcCoda , parName=parName ,
saveName=fileNameRoot , saveType=graphFileType )
}
# 창으로 뜨는 것들을 없앤다.
graphics.off()
#-------------------------------------------------------------------------------
# Get summary statistics of chain:
summaryInfo = smryMCMC( mcmcCoda ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
saveName=fileNameRoot )
show(summaryInfo)
graphics.off()
# Display posterior information:
plotMCMC( mcmcCoda , data=myData ,
compValMu=100.0 , ropeMu=c(99.0,101.0) ,
compValSigma=15.0 , ropeSigma=c(14,16) ,
compValEff=0.0 , ropeEff=c(-0.1,0.1) ,
pairsPlot=TRUE , showCurve=FALSE ,
saveName=fileNameRoot , saveType=graphFileType )
graphics.off()
myDataFrame = read.csv( file="TwoGroupIQ.csv" )
t.test(Score~Group , data=myDataFrame)
Welch Two Sample t-test data: Score by Group t = -1.958, df = 111.44, p-value = 0.05273 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -15.70602585 0.09366161 sample estimates: mean in group Placebo mean in group Smart Drug 100.0351 107.8413