This video belongs to the openHPI course Introduction to Bayesian Data Analysis.
- 00:00In the last lecture I gave you an example of the beta-binomial conjugate case, in which I showed you that you can easily derive
- 00:08the posterior by just multiplying the likelihood and the prior.
- 00:12And you can choose different prior specifications for θ.
- 00:15So today in this lecture I'm going to talk about another conjugate case which is the Poisson-Gamma case.
- 00:23Okay.
- 00:23So the example that I want to work with uses some real data that comes from one of my old papers.
- 00:28What we have is from an eye tracking study.
- 00:30So we are looking at reading times. When people are reading sentences on a computer screen, you can record the number of
- 00:37leftward eye movements that you make from a particular word when you're reading a sentence.
- 00:42So that's what the data looks like.
- 00:44You would get some discrete values:
- 00:470, 1, 2, 3, 4.
- 00:49That would tell you the number of regressions that you're making at a particular word in some particular condition and for
- 00:55some particular subject or item and so on.
- 00:57Okay.
- 00:58So how would I analyze this data?
- 01:00Well, first of all, if you take a look at the data, if I look at the summary, you know, I see that the data
- 01:05has a maximum value of 21, minimum of zero, lots of missing data, we can ignore that for now.
- 01:11And although I'm getting a mean of 1.165,
- 01:15the actual numbers here are discrete.
- 01:18Okay.
- 01:18You can't get 1.5 regressive eye movements at a particular word.
- 01:23So, because the outcomes are discrete,
- 01:27we're going to model them using some kind of discrete random variable, and one useful random variable you can use
- 01:35for this particular type of data is the Poisson distribution.
- 01:40So the Poisson distribution is defined as shown in equation 1.
- 01:44It has a parameter λ, which is the rate parameter, the rate at which you make those regressive eye movements.
- 01:51And of course we don't know what this λ is.
- 01:54It's always a question of estimating it from the data, and we have access to the data here.
- 01:58Okay.
- 01:59And x! here, I assume that you know what this is:
- 02:03x! is the factorial,
- 02:06which should be familiar to everyone.
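Equation 1 is not reproduced in the transcript; for reference, the standard Poisson probability mass function being described is:

```latex
p(x \mid \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!},
\qquad x = 0, 1, 2, \dots, \quad \lambda > 0
```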
- 02:09Okay.
- 02:10So how do I go about doing a Bayesian analysis given some data of this type?
- 02:16So we're going to just simulate some data, because I don't want to work with the complexity of the real data right now, just
- 02:23to illustrate the point: suppose I have 10 data points that are coming from some Poisson distribution with some unknown λ
- 02:31parameter.
- 02:32So these are the data points that are generated randomly.
- 02:36This will serve as our data now.
- 02:38Okay, so that's an example.
- 02:43So these are the 10 data points.
- 02:44Again, you can see that they have this discrete distribution.
- 02:48So the plot shows separated points, because there's nothing between zero and one, for example.
- 02:53That's why this is a discrete distribution.
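The simulation itself is not shown in the transcript; a minimal sketch of the same idea in Python (the lecture works in R, and the true rate λ = 2.0 and the seed here are arbitrary illustrative choices):

```python
import numpy as np

# Simulate 10 data points from a Poisson distribution with an
# (in practice unknown) rate parameter; lambda = 2.0 is arbitrary here.
rng = np.random.default_rng(42)
y = rng.poisson(lam=2.0, size=10)

print(y)  # ten small non-negative integer counts
```

Because the Poisson is a discrete distribution, every simulated value is a whole number, just like the regression counts in the eye-tracking data.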
- 02:55Okay, so now let's assume for now that prior research on the topic of leftward eye movements, or maybe expert knowledge,
- 03:06(there are many eye-tracking experts in the world, in psychology and in other areas) suggests that the prior mean of λ,
- 03:13so the rate of regressions, is 3 and the prior variance is 1.5.
- 03:18Okay, so this could be a possible piece of prior knowledge that we have from previous work.
- 03:23So the first step that we will take is that we're going to have to define a prior distribution,
- 03:28a PDF, for the rate parameter λ in our Poisson distribution.
- 03:33Because we're going to carry out that multiplication likelihood times prior.
- 03:37The parameter is now not θ but λ.
- 03:39So, I'm going to have to define a prior for λ.
- 03:43And this prior has to reflect or should ideally reflect our prior belief or knowledge about λ before we've
- 03:51seen any new data.
- 03:53So it turns out that one good choice for a prior for the λ parameter is the Gamma distribution,
- 04:02which I will explain in a minute.
- 04:04The Gamma distribution is a continuous distribution, which is again defined in terms of two parameters, a and b.
- 04:13Okay, so they have different names in R, but in statistics, you will see that it's generally defined in terms of some parameters
- 04:20called a and b.
- 04:22Now,
- 04:23at this point, usually people start getting nervous about prior specification, and people keep asking me, my students always
- 04:29ask me: how do I decide what the prior is going to be?
- 04:32I'm going to talk about that later.
- 04:34Right now, we're going to rely on our prior knowledge from some expert; you know, this prior mean and prior variance
- 04:42tell us what possible values λ can have.
- 04:44And we're going to use that to define the prior for now.
- 04:47Okay.
- 04:48So what does the Gamma PDF look like?
- 04:50Well, this is what it looks like
- 04:51for any random variable x
- 04:54with parameters a and b.
- 04:56This is the definition.
- 04:57Notice that x has to be larger than zero.
- 04:59So the rate cannot be less than zero.
- 05:01So we're gonna be modeling the rate parameter λ using this Gamma.
- 05:06So, instead of x, we will be using λ here,
- 05:08because we're modeling the λ parameter, and it has the right property,
- 05:13this Gamma distribution: the rate cannot be less than zero.
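The definition itself is only on the slide, not in the transcript; the standard Gamma PDF in the rate parameterization (the one consistent with the mean a/b quoted later in this lecture) is:

```latex
\mathrm{Gamma}(x \mid a, b) = \frac{b^{a}\, x^{a-1}\, e^{-b x}}{\Gamma(a)},
\qquad x > 0, \quad a, b > 0
```

Here Γ(a) is the Gamma function; in this parameterization the mean is a/b and the variance is a/b^2.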
- 05:16Okay, so this is the form of the distribution and the details actually don't matter to us right now.
- 05:22Okay.
- 05:23As you will see, it all simplifies and everything works out very nicely in
- 05:28these kinds of analysis.
- 05:29Okay.
- 05:30So in R as I mentioned, a and b have different names.
- 05:34They're called shape and rate, respectively.
- 05:37So you could actually simulate data; for example, this is rgamma.
- 05:41I'm using, remember, the DPQR family of functions for the Gamma distribution.
- 05:46So I've generated 10 data points, which are simulated data from a Gamma distribution where
- 05:55a and b
- 05:55are 3 and 1.
- 05:57And so you can see that this is the type of data that I would get.
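The rgamma call itself is not reproduced in the transcript. A rough equivalent in Python rather than the lecture's R (an illustrative sketch; note that NumPy parameterizes the Gamma by shape and scale, where scale = 1/b in the rate parameterization used here):

```python
import numpy as np

# Draw 10 values from a Gamma distribution with a = 3, b = 1,
# as in the lecture.  NumPy's sampler takes shape and *scale*,
# so we pass scale = 1 / b to match the rate parameterization.
a, b = 3.0, 1.0
rng = np.random.default_rng(1)
lam_draws = rng.gamma(shape=a, scale=1.0 / b, size=10)

print(lam_draws)  # ten positive real numbers
```

With a = 3 and b = 1 the draws scatter around the mean a/b = 3, and they are all strictly positive, which is exactly the property we need for a rate parameter.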
- 06:01So the reason I'm showing you this is that when I'm actually doing data analysis, I actually generate some simulated
- 06:08data from my prior to see whether it reflects my prior belief adequately or not.
- 06:14So just drawing out the prior distributions graphically is a good way to understand what you're actually asserting about
- 06:22plausible values of the parameter.
- 06:26Okay, so this is what it looks like with the (3, 1) parameters.
- 06:29So in a practical setting, if I was actually analyzing data with regressive eye movements,
- 06:34I would look at this and check whether this reflects my prior knowledge about leftward eye movements, you know?
- 06:40So I've been doing eye tracking work for 20 years so I have a reasonable idea of what the range of variability might be in
- 06:47leftward eye movements and experts in eye tracking generally know this too or
- 06:52can tell you about this or you can use prior data to work this out.
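The check being described here, simulating from the prior to see whether it matches your beliefs about regressive eye movements, can be sketched as a prior predictive simulation. This is an illustrative Python version, not the lecture's own R code:

```python
import numpy as np

# Prior predictive sketch: draw lambda from the Gamma(a = 3, b = 1)
# prior, then draw counts from a Poisson with that rate.  If the
# simulated counts look implausible for regressive eye movements,
# the prior needs revising.
rng = np.random.default_rng(0)
a, b = 3.0, 1.0
lam = rng.gamma(shape=a, scale=1.0 / b, size=1000)
y_pred = rng.poisson(lam=lam)

print(y_pred.mean())  # should land near the prior mean a / b = 3
print(y_pred.max())   # how extreme can the counts plausibly get?
```

Plotting a histogram of `y_pred` is the graphical version of this check: it shows directly what counts the prior considers plausible, before any data are seen.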
- 06:55Okay.
- 06:56Anyway, so we are now going to decide on the a
- 06:59and b
- 06:59priors.
- 07:00So we first need to figure out what these parameters
- 07:03a and b are going to be, right?
- 07:05We have to decide on this. Now, we already know something.
- 07:08We know the mean and the variance of λ from prior research: the mean is 3 and the variance is 1.5.
- 07:16What else do we know?
- 07:18Well, I didn't tell you this, but we know this from statistical theory,
- 07:21that in the Gamma distribution, the mean of the Gamma is a/b,
- 07:27and the variance of the Gamma distribution is a/b^2.
- 07:32So this is very cool, because I have two unknowns, a and b, and I have two equations now that I can write out.
- 07:39So all I'm going to do in the next lecture is simply carry out this calculation and figure out what a and
- 07:47b are,
- 07:47given these two equations and given these two values for those two equations.
- 07:53So we're just going to solve for a and b.
- 07:55Simple algebra.
- 07:56And this will give us the parameters on the Gamma prior
- 07:59that we will plug into the Bayes' rule to get the posterior on the λ parameter.
- 08:05Okay, so that's what will happen in the next lecture.
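As a quick numerical preview of the algebra promised for the next lecture (a sketch, not the lecture's own code): from mean = a/b and variance = a/b^2 it follows that b = mean/variance and a = mean^2/variance.

```python
# Solve for the Gamma prior parameters from the stated prior
# mean (3) and variance (1.5):
#   mean     = a / b      =>  a = mean * b
#   variance = a / b**2   =>  b = mean / variance
prior_mean, prior_var = 3.0, 1.5

b = prior_mean / prior_var     # 2.0
a = prior_mean**2 / prior_var  # 6.0

print(a, b)  # 6.0 2.0
# Sanity check: recover the stated mean and variance.
print(a / b, a / b**2)  # 3.0 1.5
```

So the prior that encodes this expert knowledge is Gamma(6, 2), which is what gets plugged into Bayes' rule alongside the Poisson likelihood.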