And now the full code…
So far these still just point-metrics that describe the width of the isotopic niche. That is, they are single numbers for each group, which means that we can’t compare one group to another in a statisical sense as we lack a measure of the uncertainty around each estimate. This is where we can use Bayesian Inference to quantify the error associated with fitting these ellipses to each group, that arises from both the number of samples we have, and also their distribution.
Essentially, what the MCMC algorithm does is generate a distribution of covariance matrices that to a greater or lesser extent (in terms of likelihood) describe the observed data. It does so, as is the general case in Bayesian Inference, by combing the prior probability with the likelihood of the data for a given covariance matrix.
SIBER uses the jags package to fit the Bayesian model and so we need to specify the parameters of the simulation run, including: run length, burn-in period, number of chains etc…
# options for running jags
parms <- list()
parms$n.iter <- 2 * 10^4 # number of iterations to run the model for
parms$n.burnin <- 1 * 10^3 # discard the first set of values
parms$n.thin <- 10 # thin the posterior by this many
parms$n.chains <- 2 # run this many chains
# define the priors
priors <- list()
priors$R <- 1 * diag(2)
priors$k <- 2
priors$tau.mu <- 1.0E-3
# fit the ellipses which uses an Inverse Wishart prior
# on the covariance matrix Sigma, and a vague normal prior on the
# means. Fitting is via the JAGS method.
ellipses.posterior <- siberMVN(siber.example, parms, priors)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 20
Unobserved stochastic nodes: 3
Total graph size: 35
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 20
Unobserved stochastic nodes: 3
Total graph size: 35
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 20
Unobserved stochastic nodes: 3
Total graph size: 35
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 20
Unobserved stochastic nodes: 3
Total graph size: 35
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 8
Unobserved stochastic nodes: 3
Total graph size: 23
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 8
Unobserved stochastic nodes: 3
Total graph size: 23
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 8
Unobserved stochastic nodes: 3
Total graph size: 23
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 8
Unobserved stochastic nodes: 3
Total graph size: 23
Initializing model
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
What we end up with is a range of ellipses that could explain the data, with more of them clustered around the most likely solution. However, one cannot simply take an average across these covariance matrices, as there are strict mathematical properties that must be maintained. The result of this is that it is not possible to plot a mean, median or modal Bayesian Standard Ellipse; instead we must calculate each one of the ellipse’s area, and then present summary statistics of this derived measurement. SIBER contains a function that will automatically loop over all the groups and do this.
The plots below represent the posterior distribution of the SEA_B fitted to each of the 4 groups in our dataset.
In order to test whether one group’s ellipse is smaller or larger than another, we can simply calculate the probability that its posterior distribution is smaller (or larger). This is acheived by comparing each pair of posterior draws for both groups, and dtermining which is smaller in magnitude. We then find the proportion of draws that are smaller, and this is a direct proxy for the probability that one group’s posterior distribution (of ellipse size in this case) is smaller than the other.
Here, we first calculate the proportion, and hence probability, of the SEA.B for group 1 being smaller than the SEA.B for group 2.
Pg1.lt.g2 <- sum( SEA.B[,1] < SEA.B[,2] ) / nrow(SEA.B)
print(Pg1.lt.g2)
[1] 1
So, in this case, all of the estimates for groups 1’s ellipse are smaller than for group 2; although we could probably guess at this given that there appears to be no overlap between then 95% credible intervals of the two groups (see the figure above).
Then we can do exactly the same for groups 1 and 3.
Pg1.lt.g3 <- sum( SEA.B[,1] < SEA.B[,3] ) / nrow(SEA.B)
print(Pg1.lt.g3 )
[1] 1
And then for the other pairings:
Pg1.lt.g4 <- sum( SEA.B[,1] < SEA.B[,4] ) / nrow(SEA.B)
print(Pg1.lt.g4)
[1] 1
Pg2.lt.g3 <- sum( SEA.B[,2] < SEA.B[,3] ) / nrow(SEA.B)
print(Pg2.lt.g3)
[1] 1
Pg3.lt.g4 <- sum( SEA.B[,3] < SEA.B[,4] ) / nrow(SEA.B)
print(Pg3.lt.g4)
[1] 0.011
Pg5.lt.g7 <- sum( SEA.B[,5] < SEA.B[,7] ) / nrow(SEA.B)
print(Pg5.lt.g7)
[1] 0.5715
One can calculate the overlap between two (or more) ellipses. In the first instance, this overlap is simply the area, in units of per mil squared, contained by the shape that lies within the overlapping region. This overlap is most easily calculated by using the SEAc of each ellipse.
The overlap between the SEAc for groups 3 and 4 in Community 1 is given by:
```r
overlap.G3.G4 <- maxLikOverlap(\1.3\, \1.4\, siber.example, p = 0.95, n =100)
<!-- rnb-source-end -->
<!-- rnb-chunk-end -->
<!-- rnb-text-begin -->
And the overlap between SEAc of groups 1.2 and 2.1 is given by:
<!-- rnb-text-end -->
<!-- rnb-chunk-begin -->
<!-- rnb-source-begin eyJkYXRhIjoiYGBgclxub3ZlcmxhcC5DMUcyLkMyRzEgPC0gbWF4TGlrT3ZlcmxhcChcIjEuMlwiLCBcIjIuMVwiLCBzaWJlci5leGFtcGxlLCBwID0gMC45NSwgbiA9IDEwMClcbmBgYCJ9 -->
```r
overlap.C1G2.C2G1 <- maxLikOverlap("1.2", "2.1", siber.example, p = 0.95, n = 100)
One might then wish to calculate the proportion overlap; athough one then runs into a choice as to what the demoninator will be in the equation. You could for instance calculate the proportion of A that overlaps with B, the proporiton of B that overlaps with A, or the proportion of A and B that overlap with each other.
print(prop.of.both)
[1] 0.1024213
A problem with this simple overlap calculation is that it yields a
point-estimate of overlap based on the maximum likelihood estimated
SEA_c. One can instead calculate a distribution of overlap based on the
posterior distirbutions of the fitted ellipses. It can be a bit slow to
calculate this overlap, so you may want to drop the number of
draws
if your computer is slow.
bayes.overlap.G3.G4 <- bayesianOverlap("1.3", "1.4",
ellipses.posterior,
draws = 100,
p.interval = 0.95,
n = 360)
print(bayes.overlap.G3.G4)
NA
NA
And summarise the credible intervals of the Bayesian overlap output. Note that this code does not work well on the small number of posterior draws we are using for this basic example - for one it returns negative values which is not possible, but is arising as the smoother has not got enough information to stay close to or within the positive number range.
# and we can calculate the corresponding credible intervals using
# our code from above again
# call to hdrcde:hdr using lapply()
overlap.credibles <- lapply(
as.data.frame(bayes.overlap.G3.G4),
function(x,...){tmp<-hdrcde::hdr(x)$hdr},
prob = cr.p)
print(overlap.credibles)
$area1
[,1] [,2]
99% 82.2966 329.4816
95% 133.5036 274.6043
50% 175.8780 230.8007
$area2
[,1] [,2]
99% 31.74017 164.1592
95% 54.65753 140.3323
50% 83.00921 111.4760
$overlap
[,1] [,2]
99% -15.3116934 78.72149
95% 0.4820691 55.66641
50% 17.6210416 35.53631