This video belongs to the openHPI course Introduction to Bayesian Data Analysis. Do you want to see more?
- 00:00What we looked at so far is a model with a log-normal likelihood, but with completely ridiculous priors and we know that
- 00:08they're ridiculous because of the types of data that the model generates from the prior in the prior predictive distribution.
- 00:16So, can we come up with a more reasonable set of priors that produce more realistic data?
- 00:22And this is an important question that we can ask and we can answer this quite easily using these computational tools that
- 00:29I'm showing you.
- 00:30So let's say that we assume that the average reading time or reaction time is 6 log milliseconds, that is, exp(6) milliseconds.
- 00:41So, whatever that number is, and with a standard deviation of 1.5. You could of course plot this now.
- 00:48Using the dnorm function, you can plot this distribution and that's a very good exercise to get a sense
- 00:54of whether this is a reasonable prior specification or not, or you could generate data from this normal distribution
- 01:03using rnorm, and exponentiate it.
- 01:05And then look at the distribution of that data to see if it reflects your beliefs about average reaction times.
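The check described above can be sketched quickly. This is my own illustration, not course code: the lecture uses R's rnorm, and here the same idea is shown in Python with numpy, with an arbitrary sample size.

```python
import numpy as np

rng = np.random.default_rng(42)

# Prior on the intercept: Normal(6, 1.5) on the log-millisecond scale.
# Exponentiating the samples shows the reaction times this prior implies.
log_rt = rng.normal(loc=6.0, scale=1.5, size=100_000)
rt_ms = np.exp(log_rt)

# The median implied reaction time should be close to exp(6), about 403 ms.
print(np.median(rt_ms))
```

Plotting a histogram of `rt_ms` is the visual version of the same exercise.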
- 01:13And this is the sigma prior.
- 01:15It's a truncated prior, because you cannot have a standard deviation of less than zero
- 01:20milliseconds.
- 01:21So for this parameter, you will truncate it at zero, with a standard deviation of one. There's more to say about truncated
- 01:27priors, of course, and truncated distributions in general; it's all in the textbook, but we don't need more information than
- 01:33that right now.
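As a sketch of what truncation at zero means (my own illustration, not course code; rejection sampling is just one simple way to realize it):

```python
import numpy as np

rng = np.random.default_rng(1)

# A Normal(0, 1) prior truncated at zero: one simple way to sample it
# is rejection -- draw from Normal(0, 1) and keep only values >= 0.
draws = rng.normal(0.0, 1.0, size=200_000)
sigma = draws[draws >= 0]

# All retained values are valid standard deviations (non-negative);
# the mean of this half-normal is sqrt(2 / pi), roughly 0.8.
print(sigma.min(), sigma.mean())
```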
- 01:34Okay, what will change in the model specification in brms is very little actually.
- 01:40What we're going to do is we're going to define a log-normal likelihood, of course, instead of the Gaussian.
- 01:51That's a big change that we're making, and we are going to specify the priors in the prior argument.
- 01:58We give a Normal(6, 1.5) prior on the log scale
- 02:02for the intercept.
- 02:03And we are specifying, this is very important,
- 02:06a Normal(0, 1) prior, mean 0, standard deviation 1, for the sigma parameter.
- 02:14But notice that I don't write anything here about truncation.
- 02:17I don't need to. One of the beauties of the brms package is that it just takes care of this: it knows that
- 02:25sigma cannot have a negative value.
- 02:28And so it truncates this prior for you
- 02:31internally.
- 02:33If you were writing pure Stan code, which you would do if you were doing more advanced Bayesian modeling,
- 02:38writing customized models, then you would have to pay attention to the prior specifications and so on.
- 02:44But here brms takes care of this for you.
- 02:49So notice what I'm doing here: I specified a new argument that you haven't seen before, sample_prior = "only".
- 02:59What this is doing is that although it's going to take the data as input,
- 03:06it's not going to look at that data at all; it's only going to produce samples from the prior.
- 03:12So this is one way to use the brms package to produce prior predictive distributions.
- 03:16You do have to specify the data frame in the model but the function will not use that data
- 03:24to produce prior predictive data.
- 03:26So this is a very convenient function here.
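What sampling from the prior only amounts to can be sketched by hand. This is a hedged Python/numpy illustration with made-up simulation sizes, not the brms implementation:

```python
import numpy as np

rng = np.random.default_rng(7)
n_sims, n_obs = 1000, 100  # hypothetical: 1000 fake data sets of 100 points

means = np.empty(n_sims)
for i in range(n_sims):
    # Draw one parameter setting from the priors ...
    mu = rng.normal(6.0, 1.5)          # intercept prior, log-ms scale
    sigma = abs(rng.normal(0.0, 1.0))  # Normal(0, 1) truncated at zero
    # ... then generate a fake data set from the log-normal likelihood.
    y = rng.lognormal(mean=mu, sigma=sigma, size=n_obs)
    means[i] = y.mean()

# 'means' is the prior predictive distribution of the mean reading time;
# the observed data never enters the computation.
print(means.min(), means.max())
```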
- 03:28And one thing you'll notice is that there's another argument here, the control argument, with adapt_delta equal
- 03:35to 0.9.
- 03:36I had to put this in because generating the prior predictive data sometimes runs into convergence problems, and you can change the
- 03:46parameter settings inside the MCMC sampling algorithm used by Stan so that those convergence
- 03:55problems don't happen.
- 03:56I won't discuss this much, but you can look this up in the textbook and also in the Stan manual.
- 04:03It's discussed in quite a lot of detail in the Stan manual, and we discuss it briefly in the book, so it's not important
- 04:10right now for us to worry about this control parameter.
- 04:13All we need to know is that we need this control parameter to generate those prior predictive distributions such
- 04:20that we get, you know, convergence when we produce the data.
- 04:24Okay, so what's more interesting, however, is what the data looks like under these, I would say, semi-informative or
- 04:34regularizing priors.
- 04:36So in this case the data looks very interesting.
- 04:39So the observed mean is somewhere here, I believe this line here
- 04:46should be the observed mean.
- 04:48It was something like in this area.
- 04:51If I remember correctly.
- 04:52And what we are observing looking at here is the prior predictive distribution of the means under repeated generation of
- 05:01prior predicted data.
- 05:02And you see that it's a pretty reasonable distribution now.
- 05:05This is a reasonable reflection of mean reading times under future data sets.
- 05:12Of course, it allows a little too much probability for very slow reading times;
- 05:19an average of 10 seconds
- 05:21is probably
- 05:22not so great.
- 05:23So it's not optimal.
- 05:24You could come up with better priors
- 05:25to obtain more realistic prior predictive distributions, but it's good enough.
- 05:30It's not crazy like the previous example that I showed you in the previous lecture with those uniform priors, where we were all
- 05:37over the place, like in hundreds of seconds or something like that.
- 05:40That was completely insane.
- 05:41So this is a lot more reasonable in comparison.
- 05:45And here I've computed some more statistics.
- 05:47These are also in the textbook: the distributions of the means of the prior predictive data sets, and of the minimum and maximum
- 05:54values.
- 05:55And you see that the minimum values are pretty much all positive.
- 06:00And they're around 100 milliseconds. There's some variation around that and the maximum values tend to be a little bit too
- 06:08large.
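The minimum and maximum statistics come from the same simulation idea. Again, this is a hypothetical Python sketch with arbitrary sizes, not the textbook's code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_obs = 1000, 100  # hypothetical simulation sizes

mins = np.empty(n_sims)
maxs = np.empty(n_sims)
for i in range(n_sims):
    # One draw from the priors, then one fake log-normal data set.
    mu = rng.normal(6.0, 1.5)
    sigma = abs(rng.normal(0.0, 1.0))
    y = rng.lognormal(mean=mu, sigma=sigma, size=n_obs)
    mins[i], maxs[i] = y.min(), y.max()

# Minima are always positive (the log-normal has support on (0, inf)),
# while the maxima can get implausibly large under wide priors.
print(mins.min(), maxs.max())
```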
- 06:09So one could still fix this model but it's good enough for our purposes because we have sufficient data and the posterior
- 06:15will not be heavily affected by the prior.
- 06:18So even though these prior predictive distributions are not spectacular, they are a huge improvement over the uniform prior
- 06:26driven prior predictive distributions that we saw earlier.
- 06:30Okay, so relatively speaking, this is a huge improvement.
- 06:33But we could do better.
- 06:35But anyway, so that was the prior predictive data using the brm function.
- 06:39Okay, so what's cool about the
- 06:41brms package is that most of the functionality that you would otherwise have to implement by hand in R or some
- 06:49other tool, you can produce through the function
- 06:53brm.
- 06:54And so this is a very useful package for doing quick data analysis.
- 06:59As long as you know what you're doing.
- 07:00So what we're doing now is we're going to fit this model to the data. Now, we're actually going to fit the model with a log-normal
- 07:07likelihood with these prior specifications.
- 07:10These are the ones that we just chose.
- 07:12And what you could do.
- 07:14I haven't done this here, because there's too much text on the screen, but you could simply print the result of
- 07:22fitting this model.
- 07:23So this is the object that I created.
- 07:25This brms object.
- 07:27If I print this out on the command line, I will get verbose output which will tell me everything that I need to
- 07:33know about
- 07:34the parameters' posteriors, the summary statistics, and so on.
- 07:37So you should try that out yourself.
- 07:39Okay.
- 07:41So what I'm going to do next, in the next lecture,
- 07:45is summarize the results of the analysis, and I will also show you how to compare different models
- 07:51using different posterior predictive data.
- 07:54So the posterior predictive data that different models will generate.
- 07:57We can compare them to see which model makes more sense for the given data that we have.
- 08:03That's how we actually do data analysis
- 08:05when we're trying to decide on a likelihood and trying to decide on priors:
- 08:09we look at the prior and the posterior predictive data to decide which kind of likelihood we're going to choose, or which kind
- 08:16of priors we're going to choose.
- 08:17So that's coming up next.