new-sampling: specifying a new sampling distribution (illustrated with the normal distribution)
This example includes a demonstration of how future observations can also be modelled.
Model:
model {
   # trick using zeros
   for (i in 1:7) {
      zeros[i] <- 0
      zeros[i] ~ dpois(phi[i])   # likelihood is exp(-phi[i])
      # -log(likelihood), up to the constant 0.5 * log(2 * pi);
      # phi[i] is a Poisson mean and must stay positive, so in general
      # a suitable constant may need to be added
      phi[i] <- log(sigma[1]) + 0.5 * pow((x[i] - mu[1]) / sigma[1], 2)
   }
   # check using normal distribution
   for (i in 1:7) {
      x[i] ~ dnorm(mu[2], prec)
   }
   prec <- 1 / (sigma[2] * sigma[2])
   for (k in 1:2) {
      mu[k] ~ dunif(-10, 10)
      sigma[k] ~ dunif(0, 10)
   }
}
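The identity behind the zeros trick can be checked numerically: observing a zero from a Poisson with mean phi contributes exp(-phi) to the likelihood, and with phi as defined in the model this equals the normal density up to the constant factor 1/sqrt(2*pi), which cancels in the posterior. A minimal Python sketch (function names such as `neg_log_lik` are ours, not part of the model):

```python
import math

def neg_log_lik(x, mu, sigma):
    # phi[i] from the model: -log of the normal density, with 0.5*log(2*pi) dropped
    return math.log(sigma) + 0.5 * ((x - mu) / sigma) ** 2

def poisson_pmf_zero(phi):
    # P(Z = 0) for Z ~ Poisson(phi) is exp(-phi)
    return math.exp(-phi)

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# plug-in values chosen for illustration only
x, mu, sigma = 0.7, 0.37, 1.19
phi = neg_log_lik(x, mu, sigma)
ratio = poisson_pmf_zero(phi) / normal_pdf(x, mu, sigma)
# the ratio is the dropped constant sqrt(2*pi), whatever x, mu, sigma are
print(ratio, math.sqrt(2 * math.pi))
```

Because the ratio is constant in the parameters, dropping 0.5*log(2*pi) from phi[i] leaves the posterior unchanged.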
Data:
list(x = c(-1, -0.3, 0.1, 0.2, 0.7, 1.2, 1.7))
Initial values:
list(sigma = c(1, 1), mu = c(0, 0))
The results show a close match between the standard method and the zeros trick:
node mean sd MC error 2.5% median 97.5% start sample
mu[1] 0.3661 0.4893 0.00715 -0.6296 0.364 1.336 1001 5000
mu[2] 0.3668 0.4734 0.007003 -0.5591 0.3677 1.291 1001 5000
sigma[1] 1.191 0.4902 0.01456 0.6293 1.075 2.476 1001 5000
sigma[2] 1.168 0.461 0.009039 0.6265 1.062 2.311 1001 5000
Modelling future observations or missing data: add a missing data point to the data file.
Model:
model {
   # trick using zeros
   for (i in 1:8) {
      zeros[i] <- 0
      zeros[i] ~ dpois(phi[i])   # likelihood is exp(-phi[i])
      phi[i] <- log(sigma[1]) + 0.5 * pow((x[i] - mu[1]) / sigma[1], 2)   # -log(likelihood)
      x[i] ~ dflat()   # improper uniform prior on the x's
   }
   # check using normal distribution
   for (i in 1:8) {
      x.rep[i] ~ dnorm(mu[2], prec)
   }
   prec <- 1 / (sigma[2] * sigma[2])
   for (k in 1:2) {
      mu[k] ~ dunif(-10, 10)
      sigma[k] ~ dunif(0, 10)
   }
}
Data:
list(x = c(-1, -0.3, 0.1, 0.2, 0.7, 1.2, 1.7, NA), x.rep=c(-1, -0.3, 0.1, 0.2, 0.7, 1.2, 1.7, NA))
(Note that in this case we need to replicate the dataset in order to compare results.)
Initial values:
list(sigma = c(1, 1), mu = c(0, 0), x = c(NA, NA, NA, NA, NA, NA, NA, 0))
(We need to give an initial value to the missing data, otherwise WinBUGS will try to generate one from the improper prior and crash!)
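Why x[8] and x.rep[8] should agree: with a flat prior on x[8], its full conditional under the zeros trick is proportional to exp(-phi), which is exactly the Normal(mu, sigma^2) density used for x.rep[8]. A minimal Python sketch comparing the two on a grid (the plug-in values for mu and sigma are illustrative, not posterior draws):

```python
import math

def phi(x, mu, sigma):
    # zeros-trick potential for one observation; the flat prior on x adds nothing
    return math.log(sigma) + 0.5 * ((x - mu) / sigma) ** 2

mu, sigma = 0.37, 1.19                              # illustrative plug-in values
xs = [mu + (i - 300) * 0.02 for i in range(601)]    # grid over x[8]

# unnormalised full conditional of x[8] under the zeros trick
w = [math.exp(-phi(x, mu, sigma)) for x in xs]
total = sum(w)
cond = [wi / total for wi in w]

# exact Normal(mu, sigma^2) kernel on the same grid, normalised the same way
p = [math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in xs]
ptot = sum(p)
norm = [pi / ptot for pi in p]

max_diff = max(abs(a - b) for a, b in zip(cond, norm))
print(max_diff)   # ~0: the two conditionals coincide
```

So the zeros-trick node x[8] and the direct replicate x.rep[8] are sampling from the same conditional distribution; any discrepancy in the table below is Monte Carlo error, which is why the longer run is needed.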
The agreement is good, but the MC error on the prediction is large, so a long run is necessary:
node mean sd MC error 2.5% median 97.5% start sample
mu[1] 0.3689 0.4948 0.003145 -0.6157 0.3714 1.351 5001 50000
mu[2] 0.368 0.486 0.002319 -0.6091 0.3713 1.326 5001 50000
sigma[1] 1.194 0.5066 0.005429 0.6222 1.075 2.46 5001 50000
sigma[2] 1.187 0.4989 0.003917 0.6224 1.069 2.45 5001 50000
x[8] 0.357 1.407 0.01511 -2.494 0.3608 3.133 5001 50000
x.rep[8] 0.3585 1.381 0.006126 -2.372 0.3584 3.105 5001 50000