This is essentially the example used in Fry, B. 2013. Alternative approaches for solving under-determined isotope mixing problems. MEPS.
Here we use the package simmr
to explore the two
alternatives to aggregating sources in mixing models.
library(simmr, quietly = TRUE)
Linked to JAGS 4.3.2
Loaded modules: basemod,bugs
Attaching package: ‘R2jags’
The following object is masked from ‘package:coda’:
traceplot
# Set the random seed so we get the same set of
# random numbers each time we run this example.
set.seed(1)
# specify the sources
# sources <- data.frame(sources=c("A","B","C","D"),
# muC=c(-5,-5,5,5),sdC=c(1,1,1,1),
# muN=c(-5,5,5,-5),sdN=c(1,1,1,1))
# specify the source names
S_names = c("A","B","C","D")
# specify the source means by binding two vectors
# together into a matrix by columns.
S_means = cbind(c(-5,-5,5,5), c(-5,5,5,-5))
# specify the source standard deviations
S_sds = cbind(c(1,1,1,1), c(1,1,1,1))
# speficy the consumer data at the origin
# Ten consumers for this example around 0 with small sd of error.
consumers <- cbind(dC = rnorm(n = 10, mean = 0, sd = 0.1),
dN = rnorm(n = 10, mean = 0, sd = 0.1) )
# and create the simmr object
# here we have no TDFs or concentration values
simmr_in <- simmr_load(mixtures = consumers,
source_names = S_names,
source_means = S_means,
source_sds = S_sds)
Now we can plot the data to visualise our system and the output,
plot(simmr_in)
We can fit a simmr model using the defaults, and here suppress the
output using the results='hide'
option in the chunk.
simmr_out = simmr_mcmc(simmr_in)
And we should always check for convergence.
# a summary table of convergence diagnostics
summary(simmr_out,type='diagnostics')
Summary for 1
R-hat values - these values should all be close to 1.
If not, try a longer run of simmr_mcmc.
deviance A B C D sd[dC] sd[dN]
1.00 1.01 1.01 1.01 1.01 1.00 1.00
# plot the posterior predictive power
posterior_predictive(simmr_out)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 20
Unobserved stochastic nodes: 26
Total graph size: 149
Initializing model
|
| | 0%
|
|++++++++++ | 20%
|
|++++++++++++++++++++ | 40%
|
|++++++++++++++++++++++++++++++ | 60%
|
|++++++++++++++++++++++++++++++++++++++++ | 80%
|
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 7%
|
|**** | 9%
|
|****** | 11%
|
|******* | 13%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 27%
|
|************** | 29%
|
|**************** | 31%
|
|***************** | 33%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 47%
|
|************************ | 49%
|
|************************** | 51%
|
|*************************** | 53%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 67%
|
|********************************** | 69%
|
|************************************ | 71%
|
|************************************* | 73%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 87%
|
|******************************************** | 89%
|
|********************************************** | 91%
|
|*********************************************** | 93%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
The results of this toy example are not expected to be overly helpful or meaningful.
summary(simmr_out,type='statistics')
Summary for 1
mean sd
deviance 22.575 5.976
A 0.250 0.077
B 0.250 0.077
C 0.253 0.077
D 0.248 0.077
sd[dC] 0.394 0.224
sd[dN] 0.399 0.229
summary(simmr_out,type='quantiles')
Summary for 1
2.5% 25% 50% 75% 97.5%
deviance 13.563 18.192 21.712 26.090 36.442
A 0.100 0.198 0.250 0.301 0.404
B 0.101 0.197 0.248 0.302 0.403
C 0.103 0.202 0.255 0.305 0.402
D 0.094 0.196 0.247 0.300 0.401
sd[dC] 0.100 0.238 0.351 0.499 0.940
sd[dN] 0.100 0.242 0.353 0.504 0.964
Plot the estimates of the dietary proportions
# Plot the a priori aggregated diet estimatess
plot(simmr_out, type = "density")
Plot the covariance between the estimated dietary proportions in the posterior.
plot(simmr_out, type = "matrix")
We combine the sources C and D before we run the model as is sometimes suggested. We do this by taking the mean of the means, and we square the SDs to make them variances, then add them, and then square-root them to turn them back into SDs again. We dont need to change the consumer data in any way.
# specify the source names
S_names_a_priori = c("A","B","CD")
# take the mean of the last two sources C and D
S_means_a_priori = rbind(S_means[1:2,],
colMeans(S_means[3:4,]) )
# square the sds for sources C and D to convert to variance,
# sum them and convert back to sd
S_sds_a_priori = rbind( S_sds[1:2,], colSums(S_sds[3:4,]^2) ^ 0.5 )
# and create the new simmr object
# here we have no TDFs or concentration values
simmr_in_a_priori <- simmr_load(mixtures = consumers,
source_names = S_names_a_priori,
source_means = S_means_a_priori,
source_sds = S_sds_a_priori)
Plot this newly combined data.
# plot the raw data for the a priori aggregated example
plot(simmr_in_a_priori)
Run the model on the a priori combined data.
# fit the a priori aggregated models
simmr_out_a_priori = simmr_mcmc(simmr_in_a_priori)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 20
Unobserved stochastic nodes: 5
Total graph size: 115
Initializing model
|
| | 0%
|
|++++++++++ | 20%
|
|++++++++++++++++++++ | 40%
|
|++++++++++++++++++++++++++++++ | 60%
|
|++++++++++++++++++++++++++++++++++++++++ | 80%
|
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 7%
|
|**** | 9%
|
|****** | 11%
|
|******* | 13%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 27%
|
|************** | 29%
|
|**************** | 31%
|
|***************** | 33%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 47%
|
|************************ | 49%
|
|************************** | 51%
|
|*************************** | 53%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 67%
|
|********************************** | 69%
|
|************************************ | 71%
|
|************************************* | 73%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 87%
|
|******************************************** | 89%
|
|********************************************** | 91%
|
|*********************************************** | 93%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
And plot
# Plot the a priori aggregated diet estimatess
plot(simmr_out_a_priori, type = "density")
… and now apparently we are very sure about the contributions of all sources to the diet. There is some correlation between A and B since they need to balance each other out in combination to yield a dB value of 0. You would now incorrectly assume that CD represents pretty much a guaranteed 43% of the diet.
plot(simmr_out_a_priori, type = "matrix")
Far more honest is to fit the model as before, with the sources as we believe them to be a priori and then simply add our prortions together from the posterior distribution. This is made very easy in SIMMR and also MixSIAR with dedicated functions. In fact, MixSIAR also allows easier a priori aggregation by hiding the routine outlined in the preceding section.
# combine sources C and D which are in positions 3 and 4
simmr_out_a_posteriori <-
combine_sources(
simmr_out,
to_combine = simmr_out$input$source_names[c(3,4)],
new_source_name = "CD")
# Plot the a posteriori aggregated diet estimatess
plot(simmr_out_a_posteriori, type = "density")
NA
NA
plot(simmr_out_a_posteriori, type = "matrix")
This result fits much better with what we would predict: that if the model is still not sure about the contribution of the four sources to the mixture, but that it is pretty sure that on average, 50% of the diet is comprised of both C and D. This concept continues until the model is entirely certain, with no error, that the diet is wholly 100% of A+B+C+D.
One thing to experiment with here is the use of the Jeffrey’s prior
of c(0.25, 0.25, 0.25, 0.25)
in place of the default vague
prior c(1, 1, 1, 1)
. This is the nub of the criticism
levelled at the SIMMs by Brett, M. 2016. Resource polygon geometry
predicts Bayesian stable isotope mixing model bias. MEPS.
N.B because simmr no longer uses the Dirichlet prior, this is more difficult and need more thought about how to specify a Jeffries-like or similar specialist prior. The code below won’t work and the prior_control needs to be added.
# simmr_out_specialist_prior = simmr_mcmc(simmr_in,
# prior_control = XYZ)