- 00:00Okay, so now let's take a look at what happens when we specify different priors in the model that we're looking at.
- 00:08This simple model of reaction times.
- 00:10So, as I mentioned in the last lecture, there's a range of priors that you can choose from.
- 00:15You can choose a different set of priors for all the parameters here; I'm just showing you some possibilities for the mu
- 00:22parameter.
- 00:23So what we're going to do for the mu parameter is change the prior specification in the brm function.
- 00:32Remember that in the brm function there was a parameter for priors.
- 00:38There was a list that you could specify to decide on what the priors are for each of your parameters.
- 00:44We will simply keep changing these priors in the model and we will look at the posterior distributions of the mu and sigma
- 00:52parameters as a function of these different prior specifications.
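To make this concrete, here is a minimal sketch of what such a brm call might look like. The data frame name df_rt and its rt column are placeholders, not names from this lecture, and the sigma bounds are illustrative; the uniform prior from 0 to 60,000 on mu is the one the lecture describes for the original model.

```r
library(brms)

# A minimal sketch of the original model: a Gaussian likelihood for
# reaction times with uniform priors on mu (the Intercept) and sigma.
# `df_rt` with a column `rt` (in milliseconds) is a placeholder name.
fit_original <- brm(
  rt ~ 1,
  data = df_rt,
  family = gaussian(),
  prior = c(
    prior(uniform(0, 60000), class = Intercept, lb = 0, ub = 60000),
    prior(uniform(0, 2000), class = sigma, lb = 0, ub = 2000)  # illustrative bounds
  )
)
```

For the sensitivity analysis, only the prior argument changes between fits; everything else stays the same.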
- 00:57So, this approach of looking at how the posterior changes as a function of the priors is called a sensitivity
- 01:04analysis.
- 01:05Okay, so I'm just giving an example of that.
- 01:07And the reason we're doing the sensitivity analysis is that our first prior specification was not very satisfactory. Why?
- 01:17Because the prior predictive distribution was largely nonsensical.
- 01:21So that means that the model itself is not a great model of reaction times.
- 01:25So we're going to try to think about some better models.
- 01:28Okay, so one possibility would be to go even more extreme in terms of uninformative priors and use this kind of uniform
- 01:36prior which I discussed earlier,
- 01:38this uninformative prior, and study the consequences of that.
- 01:42There is no harm in deciding on a very wide prior just to see what happens to the posterior distribution.
- 01:48So let's compare what happens.
- 01:49This is the original model that we had fit; I'm just showing you the mu parameter.
- 01:54Okay, just to illustrate the point, this is the original model that I fit with the uniform prior from zero
- 02:00to, what was it,
- 02:0160,000 for the mu parameter.
- 02:03And now I'm using the ridiculously uninformative flat prior that I just fit, this one here.
- 02:10A completely insane prior, in the sense that it even allows impossible negative values.
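As a sketch, such a flat prior could be specified as follows; the exact bounds are illustrative (the point is only that they are absurdly wide and allow negative values), and df_rt is the same placeholder data frame as before.

```r
# An extremely flat, "insane" prior on mu: it even allows impossible
# negative reaction times. The bounds shown are illustrative.
fit_flat <- brm(
  rt ~ 1,
  data = df_rt,
  family = gaussian(),
  prior = c(
    prior(uniform(-10^10, 10^10), class = Intercept, lb = -10^10, ub = 10^10),
    prior(uniform(0, 2000), class = sigma, lb = 0, ub = 2000)
  )
)
```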
- 02:15But what you'll notice is that, with the data from this one subject, the posterior distributions are
- 02:22pretty similar; they're pretty much unaffected by the prior specifications.
- 02:27So, what this sensitivity analysis is showing is that the posterior is largely unaffected by the prior specification.
- 02:35We can check this even further.
- 02:37So let's go to the other extreme and choose a relatively informative prior.
- 02:42So, now I've chosen a prior with a mean of 400, standard deviation of 10, which is very, very tight.
- 02:49Okay, it's a very strong prior belief that, you know, the reaction times are ranging from 400 plus or minus
- 02:5820 milliseconds
- 03:00with 95% probability.
- 03:02So that's a very informative prior.
- 03:04So let's see whether the posterior for mu changes now as a result of this informative prior.
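Under the same placeholder setup as before, this informative prior would look roughly like this:

```r
# Informative prior: Normal(400, 10) on mu, i.e. roughly 400 +/- 20 ms
# with 95% probability; a very tight prior. The sigma prior is illustrative.
fit_informative <- brm(
  rt ~ 1,
  data = df_rt,
  family = gaussian(),
  prior = c(
    prior(normal(400, 10), class = Intercept),
    prior(uniform(0, 2000), class = sigma, lb = 0, ub = 2000)
  )
)
```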
- 03:09Right, as I mentioned some lectures ago, the posterior is going to be a compromise between the prior and the likelihood.
- 03:17But if you have a lot of data, the posterior will be dominated by the likelihood.
- 03:22So you'll get pretty much the same result as the maximum likelihood estimate
- 03:29for the mean of the posterior.
- 03:30So let's look at this again: here's my original model with the uniform prior that I fit in the previous lecture; here
- 03:37are my flat uninformative priors, which, as I showed you, give a very similar estimate; and now I have an informative prior. If you
- 03:46look closely, there's a slight shift in the distribution to the right, okay.
- 03:51Because I had 166 and 171 as a 95% credible interval.
- 03:56Now I have 170 and 175.
- 04:00That shift to the right happened because of the tight informative prior pushing the posterior a bit towards it.
- 04:07But because we have enough data, it did not push it much.
- 04:11So we still have a lot of data and that means that the posterior is going to be largely influenced by the likelihood
- 04:19specification.
- 04:20Okay, well let's try another example.
- 04:22We could use what I would call a principled prior in our terminology: this is a prior which assumes a reasonable mean
- 04:30but allows a lot more variation
- 04:35in the mean.
- 04:40Okay, for the mu parameter. And again, I'm just showing you all the fits now.
- 04:44Okay, this was the original fit.
- 04:46This was the fit with the uninformative flat prior; here's the fit for the intercept parameter, that is, the mu parameter, with
- 04:51the informative prior; and finally the principled prior.
- 04:54So let's compare what happens between the informative and the principled prior.
- 05:00So as I showed you on the previous slide, with the informative prior I get a slight shift to the right towards the informative
- 05:07prior when I look at the credible interval of the posterior distribution here.
- 05:12Now, if I make the prior much more uncertain by increasing the standard deviation from 10 to 100, as I just did here, notice
- 05:21this here: I've got a mean of 200 and a standard deviation of 100; earlier, I had
- 05:27a mean of 400
- 05:28and a standard deviation of 10.
- 05:30That was the informative prior.
- 05:32And this is a principled prior in the sense that it sets a reasonable mean for the mu parameter but allows a lot of uncertainty.
- 05:40What this prior for mu is expressing is that I think that the prior
- 05:48mean is about 200, but I'm not very sure about that.
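Again as a sketch under the same placeholder assumptions, the principled prior would be:

```r
# Principled prior: a reasonable mean (200 ms) for mu, combined with a
# large standard deviation (100) that expresses a lot of uncertainty.
fit_principled <- brm(
  rt ~ 1,
  data = df_rt,
  family = gaussian(),
  prior = c(
    prior(normal(200, 100), class = Intercept),
    prior(uniform(0, 2000), class = sigma, lb = 0, ub = 2000)
  )
)
```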
- 05:51Okay, so what we notice now, when I have this kind of principled prior, is that the posterior looks a lot more like the
- 06:00posterior that I had with the uninformative and the uniform priors earlier.
- 06:05So the effect of the prior has weakened because of the large standard deviation in the principled prior.
- 06:10So that's what I'm showing you right now.
- 06:12This is exactly what I would report in the paper if I were reporting a sensitivity analysis.
- 06:18So sometimes I do that just to demonstrate how much the posterior distribution changes as a result of different prior specifications.
- 06:27And you can imagine why this is so useful as a data analysis
- 06:32strategy.
- 06:34I could now fit different models with priors that reflect different prior beliefs.
- 06:41So suppose I'm engaged in a scientific argument with an opponent in the field; by opponent
- 06:47I mean a scientific opponent.
- 06:48So not a personal enemy or something like that.
- 06:51So, in that situation, my opponent might have an alternative prior belief about the problem that I'm studying.
- 07:00So what I can do, given my data, is say: okay, let's take your prior belief and plug it into the model and
- 07:07see what the posterior gives us.
- 07:08So what we can do is figure out what we can learn from the data, given alternative prior beliefs.
- 07:17That's what the sensitivity analysis is doing.
- 07:20The vague priors are expressing the prior assumption that we know nothing about this parameter or very little about this
- 07:27parameter.
- 07:27The informative priors are saying that we know quite a lot already.
- 07:31So we can bring that information into the game.
- 07:35This is one reason why the Bayesian approach is so useful.
- 07:38It allows you to think about what you already know and to interpret the data in the light of alternative prior beliefs.
- 07:47That's the application of a sensitivity analysis in real life data analysis.
- 07:51Okay.
- 07:54So, what we found in this example is that the sensitivity analysis shows that the posterior is not affected much by the
- 08:02prior specifications.
- 08:03So you shouldn't get overexcited about the fact that there's a four-millisecond slowdown here; you know, a shift of four milliseconds
- 08:10to the right doesn't really change the interpretation of the posterior distribution.
- 08:16So, in that sense, the prior doesn't really affect the posterior.
- 08:19And this should remind you again, you know, of the discussion we had earlier with the analytical Bayesian analyses we did in the
- 08:25conjugate cases: that the posterior is a compromise between the prior and the likelihood.
- 08:30So, I have already harped on this point quite a bit.
- 08:33So I won't go on about this, but I just want you to keep in mind that we are really talking about what we
- 08:40have learned from the data in the light of our prior knowledge or beliefs or assumptions.
- 08:47So in general I would suggest that you carry out such a sensitivity analysis.
- 08:53I often don't report it in the final published paper, but I will certainly do it in my own analysis, which I generally put
- 09:01up on the internet after I publish the paper so people can see my analysis.
- 09:05But here's what will happen with you:
- 09:07if you're analyzing lots and lots of data within the Bayesian framework, after a while, when you're working on a particular
- 09:13problem, you will have a very good sense of whether the posterior is going to be sensitive to the prior specification or not.
- 09:20So often, with lots of experience, you won't even need to do a sensitivity analysis, because you know what's going to happen:
- 09:27the posterior won't be affected much.
- 09:29However, in general, when you're working on a new problem, you should think about a sensitivity analysis to
- 09:37understand what's happening in the light of the prior specifications.
- 09:41Okay.
- 09:42All right.
- 09:42So one thing you can do now is to fool around with this model a little bit and change the priors.
- 09:48Try some other priors.
- 09:49If you look up the Stan manual, you will see lots of possibilities that you can choose from.
- 09:56And then, what you can do (we provide the code in the textbook,
- 09:59so you can look at that later) is produce prior predictive distributions with each prior specification.
- 10:05I showed you one example of a prior predictive distribution, but you can generate a prior predictive distribution with any
- 10:11set of priors and then look to see; your goal
- 10:14should be to check whether the prior predictive distributions are reasonable,
- 10:19given your intuitive beliefs about what you think the data should look like.
- 10:22Reaction time data should have certain properties that, you know, would make sense even before you've seen any data.
- 10:29So you could look at the prior predictive distributions to diagnose, you know, the model before you've even seen any data.
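One way to do this in brms (a sketch, not necessarily the textbook's code) is to sample from the priors alone and then plot the simulated data against the observed data:

```r
# Prior predictive check: sample from the priors only, ignoring the
# likelihood contribution of the data (brms still needs the data frame
# to set up the model structure).
fit_prior <- brm(
  rt ~ 1,
  data = df_rt,
  family = gaussian(),
  prior = c(
    prior(normal(200, 100), class = Intercept),
    prior(uniform(0, 2000), class = sigma, lb = 0, ub = 2000)
  ),
  sample_prior = "only"
)

# Overlay densities of simulated datasets on the observed data to judge
# whether the prior predictive distribution looks reasonable.
pp_check(fit_prior, ndraws = 100, type = "dens_overlay")
```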
- 10:37Okay, so you should of course read the textbook chapter three to get an idea of all the details that I have skipped,
- 10:47a few little details about prior distributions and so on that you should know about.
- 10:52And so what we're going to do in the next lecture is we're going to look at what kind of data the model is
- 11:00going to generate
- 11:01once we have incorporated the information from the existing data.
- 11:05So we're going to look at future data from the model to evaluate the plausibility of this model
- 11:12given the data that we have.
- 11:13That's the next lecture.