This video belongs to the openHPI course Introduction to Bayesian Data Analysis.
- 00:00Alright.
- 00:01So now let's take a look at how we can evaluate this log-normal model with these relatively informative priors.
- 00:09So what we fit in the previous lecture was a simple linear model with reaction time as a function of some intercept, which
- 00:20is the mu parameter.
- 00:21We had the space bar data, which records reaction times in a button-pressing task, and we have a log-normal likelihood with
- 00:29these priors.
- 00:30A normal distribution with mean 6 and
- 00:33standard deviation 1.5 for the mu parameter, the intercept, and a truncated normal with mean 0 and standard deviation 1
- 00:42for sigma.
- 00:43The truncation is done by brms internally because brms
- 00:47as a package knows that sigma cannot have negative values.
- 00:50So the truncation is done internally.
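Since the model in this lecture is fit in R with brms, the following Python snippet is only an illustrative sketch of what that zero-truncated Normal(0, 1) prior for sigma looks like, using simple rejection sampling:

```python
import numpy as np

rng = np.random.default_rng(1)

def truncated_normal(mean, sd, size, rng):
    """Sample from a normal distribution truncated below at zero,
    using rejection sampling: draw, keep only non-negative values."""
    out = np.empty(0)
    while out.size < size:
        draws = rng.normal(mean, sd, size)
        out = np.concatenate([out, draws[draws >= 0]])
    return out[:size]

# Prior draws for sigma: all non-negative by construction,
# which is exactly the constraint brms enforces internally.
sigma_prior = truncated_normal(0.0, 1.0, 10_000, rng)
```

For a mean-zero prior like this one, taking the absolute value of normal draws would give the same half-normal distribution; rejection sampling also handles truncated normals with a nonzero mean.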
- 00:53So once you have the posterior distribution for the mu parameter for example, typically in a scientific research problem
- 01:01this is the parameter that we are primarily interested in.
- 01:04The sigma is usually a nuisance variable for us.
- 01:07So often there is not much focus on the sigma variable.
- 01:11But what we're interested in is the posterior distribution of the mu parameter given the data.
- 01:16So how would I summarize it?
- 01:18Well, of course you could summarize the posterior on the log scale, but it makes more sense to actually
- 01:27look at the posterior on the millisecond scale here.
- 01:30So you can interpret it in terms that you understand.
- 01:32And so one way that you can do this is to basically just exponentiate the posterior distribution.
- 01:44This entire distribution is just exponentiated, and since exp(mu) is the median of a log-normal, this gives you the posterior median on the millisecond scale.
- 01:52And so what you can then do is you can calculate statistics on this distribution and report the mean and
- 02:02the 95% credible interval.
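This summary step can be sketched as follows. The Python snippet below uses hypothetical posterior draws of mu; in practice these would be extracted from the fitted brms model in R, and the made-up values 5.12 and 0.0075 are chosen only to land roughly in the range discussed in the lecture:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical posterior draws of mu on the log scale; with brms these
# would come from the fitted model rather than being simulated here.
mu_draws = rng.normal(5.12, 0.0075, 4000)

# Exponentiate every draw to move to the millisecond scale;
# exp(mu) is the median of the log-normal distribution.
ms_draws = np.exp(mu_draws)

# Summarize the exponentiated draws: mean and 95% credible interval.
mean_ms = ms_draws.mean()
lower, upper = np.quantile(ms_draws, [0.025, 0.975])
print(f"mean = {mean_ms:.0f} ms, 95% CrI = [{lower:.0f}, {upper:.0f}] ms")
```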
- 02:04So this information I would report in the paper if I'm summarizing the results of this analysis, and this tells
- 02:10me pretty much everything I need to know: given the data and given my priors, my predicted or expected reading time is
- 02:18167 milliseconds.
- 02:20And I'm 95% sure, given these particular data, that this range is between 164 and 169 milliseconds. Of course,
- 02:31this credible interval does not mean that these are the true values of this mu parameter.
- 02:36Because these values are conditioned on the data that we happen to get.
- 02:40Maybe we've got skewed data, maybe we've got biased data from a subject who was particularly tired or not representative
- 02:48of their normal behavior on some other day.
- 02:52They might not have slept the previous night for example.
- 02:55So it could be biased data.
- 02:57And so if the data are biased, then your estimates are not going to represent the reality.
- 03:02So when you report these credible intervals
- 03:05and these means, you should be very clear about the fact that they're conditioned on the data that you have.
- 03:12And whether those data reflect reality or not, who knows?
- 03:16So it's tempting to generalize from these data, from these posteriors to reveal the unknown reality out there.
- 03:24But the reality is by definition unknown.
- 03:26We just don't know what the true mu is.
- 03:29We're trying to get at some estimates of these given some priors and given the data that we happen to have.
- 03:36Okay, alright.
- 03:37But we can still look at the posterior predictive distributions to see if the future data produced
- 03:45by this model are reasonable given the data that we actually have.
- 03:49So this is just a sanity check which tells us that yes, the model is producing reasonably distributed data given
- 03:58what we've seen in this particular data set.
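The posterior predictive step itself is easy to sketch: for each posterior draw of (mu, sigma), simulate one full dataset from the log-normal likelihood. A hypothetical Python version, with made-up posterior draws and a made-up sample size standing in for the brms output:

```python
import numpy as np

rng = np.random.default_rng(3)
n_obs = 361      # hypothetical number of button presses in the data
n_draws = 1000

# Hypothetical joint posterior draws of (mu, sigma); with brms these
# would be extracted from the fitted model in R.
mu_draws = rng.normal(5.12, 0.0075, n_draws)
sigma_draws = np.abs(rng.normal(0.13, 0.005, n_draws))

# One simulated dataset per posterior draw: together these form the
# posterior predictive distribution of the log-normal model.
pred = np.array([rng.lognormal(m, s, n_obs)
                 for m, s in zip(mu_draws, sigma_draws)])
print(pred.shape)  # (1000, 361)
```

Each row of `pred` is one plausible future dataset; plotting these against the observed data is the graphical sanity check described above.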
- 04:01One other question that one often asks is, well, I had this normal likelihood earlier and then I switched to the log-normal
- 04:09likelihood.
- 04:10Did I get an improvement in the fit in some way?
- 04:14So you could ask whether one likelihood is better than the other likelihood.
- 04:20So, that's an interesting question to ask too.
- 04:22And you can use the posterior predictive data to check that.
- 04:25Okay, so here is the distribution of the minimum reading times produced by the normal likelihood model that we
- 04:35fit earlier, and this vertical bar is the observed minimum reading time.
- 04:40That's of course a point value in the data.
- 04:42And so what we notice is that the normal model is actually generating minimum reading times that are too small here.
- 04:54The distribution is too far away from the vertical bar that we've got here. In the log-normal model, we see a somewhat
- 05:01better distribution of minimum reading times
- 05:05compared to the observed data.
- 05:06So in that sense, subjectively, just looking at these figures
- 05:10intuitively, without any quantified statistics or anything here,
- 05:13I'm just using a graphical check to decide whether the log-normal model does better than the normal model.
- 05:21And the answer seems to be yes.
- 05:23It's producing minimum reading times that are pretty close to the actual observed minimum reading time.
- 05:29Now one can also check whether the two models, the normal likelihood model and the log-normal likelihood model, whether
- 05:37they reflect the maximum observed reading time in the data.
- 05:41So the maximum reading time observed is about 400 and something milliseconds.
- 05:47But under the normal model, the maximum reading times generated are much too short.
- 05:52So it's kind of missing something important.
- 05:55The normal model is missing something important about the observed data.
- 06:01And interestingly, the log-normal model is also underestimating the maximum reading time.
- 06:07So if these data are representative of some kind of systematic behavior of this subject, then both models are actually
- 06:15failing to capture that.
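This kind of graphical check can be made concrete with a test statistic: compute the minimum (or maximum) of each simulated dataset and compare that distribution to the observed value. A hedged Python sketch, where simulated observed data and moment-matched models stand in for the real data and the real fitted posteriors:

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_draws = 361, 1000

# Hypothetical observed reading times: right-skewed, like real data.
observed = rng.lognormal(5.12, 0.13, n_obs)

def predictive_stat(sampler, stat, n_draws, n_obs):
    """Distribution of a test statistic over simulated datasets."""
    return np.array([stat(sampler(n_obs)) for _ in range(n_draws)])

# Moment-matched stand-ins for the fitted normal and log-normal models.
m, s = observed.mean(), observed.std()
lm, ls = np.log(observed).mean(), np.log(observed).std()

mins_normal = predictive_stat(lambda n: rng.normal(m, s, n), np.min,
                              n_draws, n_obs)
mins_lognormal = predictive_stat(lambda n: rng.lognormal(lm, ls, n), np.min,
                                 n_draws, n_obs)

# The normal model tends to generate minima that are too small; running
# the same check with np.max probes the maximum reading time instead.
print(observed.min(), mins_normal.mean(), mins_lognormal.mean())
```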
- 06:17So often models will be imperfect in this sense.
- 06:20Even the log-normal model does not capture all aspects of the data.
- 06:26Often that's okay. Models are always imperfect.
- 06:29That's why it's called a model.
- 06:31It's not the actual reality.
- 06:32So they will miss some aspects of reality.
- 06:34You just need to know what those aspects are and what the model is missing about the data.
- 06:39So in this particular case, you know, what could be happening
- 06:42is that what we might be looking at in the data is not just one distribution, but a mixture of distributions.
- 06:53So what could be happening is that when we look at the data, we see the skew in the data.
- 07:02What could lie behind that single distribution that we're seeing is a mixture of two distributions that look like one skewed
- 07:10distribution.
- 07:10So there could be a mixture process that's producing this data.
- 07:14As you will see in the later lectures in the textbook, you can actually define a generative process where you assume that
- 07:23there are a few rare slow reading times, which could model this long tail here of very rare but very slow reading
- 07:33times.
- 07:33These could be attentional timeouts that the subject experiences when getting bored pressing the button.
- 07:39You know, they could on rare occasions produce very slow reading times, but most of the data could be coming from a
- 07:45different distribution.
- 07:46So you can actually define these kinds of finite mixture models that can model this kind of data.
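A finite mixture like this is easy to simulate. The Python sketch below (all component parameters are invented for illustration) mixes a dominant fast log-normal component with a rare, slow "attentional timeout" component, producing data that look like a single right-skewed distribution with a long tail:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000

# Hypothetical two-component mixture: most button presses come from a
# fast log-normal component, but a small proportion are slow "timeout"
# responses drawn from a component with a larger mean.
p_slow = 0.1
is_slow = rng.random(n) < p_slow
rt = np.where(is_slow,
              rng.lognormal(5.9, 0.2, n),   # rare, slow component
              rng.lognormal(5.1, 0.1, n))   # dominant, fast component

# The heavy right tail pulls the mean above the median.
print(np.median(rt), rt.mean(), rt.max())
```

Fitting such a model means estimating the mixing proportion and the component parameters jointly, which is exactly what the finite mixture models in the later lectures do.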
- 07:52So this is an elaboration of the linear model.
- 07:55And this also illustrates the great flexibility of the Bayesian approach.
- 07:58You can really build elaborate process models that reflect the underlying generative process and it doesn't have to be a
- 08:06simple generative process of the type that we are studying here.
- 08:09That's one of the beauties of the framework that we're looking at here.
- 08:14All right.
- 08:15So what's happened so far: we've looked at two simple examples of a simple linear model using two different likelihoods, the
- 08:22normal and the log-normal.
- 08:24And what we learned to do was to generate prior predictive and posterior predictive data using the
- 08:30brms package.
- 08:31We can do that or we can do it in R ourselves.
- 08:34And so these two approaches to understanding the model are very important.
- 08:39They will teach us about what the model predicts and what the underlying assumptions of the model are,
- 08:45and whether they produce reasonable data.
- 08:50And so what we tend to say is that these are telling us about the descriptive adequacy of the model.
- 08:57And so we can ask both what happens before we've seen the data.
- 09:01That's the prior predictive distribution.
- 09:03And what happens after we've seen the data.
- 09:05That's the posterior predictive distribution.
- 09:07And this is usually part of the workflow of doing a Bayesian data analysis.
- 09:11You do these prior and posterior predictive checks,
- 09:14you do a sensitivity analysis, and that constitutes a complete analysis of a data set.
- 09:20Usually you may only report the final analysis in the paper, but underlying it is all this investigation that one should
- 09:28do to check whether the model makes any sense at all.
- 09:32Alright.
- 09:33So what's going to happen next is that we are going to now elaborate on this simple linear model in several ways.
- 09:40First we're going to add a predictor.
- 09:42So instead of just having an intercept, we're going to have an intercept and a slope.
- 09:46This then becomes a regression model; you can also have multiple regression models now.
- 09:51So this improves the flexibility of this linear modeling approach and allows us to ask all kinds of questions about complex
- 09:58data sets.
- 10:00Another example I will show you is of using logistic regression.
- 10:04This will be familiar to people doing machine learning, where you've got a 0/1 response,
- 10:10where we're trying to model this kind of Bernoulli process
- 10:15using a linear modeling framework.
- 10:17So we're going to use logistic regression for that.
- 10:19And finally, I'm going to show you just a glimpse of what a linear mixed model or a hierarchical model looks like
- 10:26where you start adding information about individual participants,
- 10:31the variability due to individual participants into the model.
- 10:35This is a very sophisticated framework that allows you to study individual differences, for example, and it allows you to
- 10:43analyze repeated measures data, that is, dependent data.
- 10:46This is a very important framework and it's a bread and butter framework for many fields like psychology and linguistics.
- 10:53So that's coming up next.