- 00:00What we've done so far is that we've looked at random variables like the Bernoulli and the binomial, and, as a continuous example,
- 00:09the normal distribution.
- 00:10These are univariate distributions, because there's only one variable involved in the examples that I showed you.
- 00:18But we can have situations where you have more than one variable involved.
- 00:23This can happen in an experiment or in a data collection situation.
- 00:28So an example would be if you're measuring the heights and weights of a sample from a population.
- 00:35Now you have two variables that you're looking at.
- 00:37Each of these two variables is assumed to have a random variable associated with it.
- 00:43That means it has either a probability mass function or probability density function associated with it.
- 00:49So now when we're talking about two or even more than two random variables being considered simultaneously, then we are in
- 00:56the world of bivariate or multivariate distributions.
- 00:59So I'm going to start by talking about the discrete case first.
- 01:02Okay, so if you look here, this data comes from a former student's PhD dissertation; there are two variables
- 01:13involved in this experiment.
- 01:14This is a psycholinguistic experiment.
- 01:16And what we did was we collected accuracy responses, which are 0/1 responses.
- 01:21This of course comes from the Bernoulli.
- 01:21Now, you can assume a Bernoulli likelihood there, and the other response that we're getting from each participant
- 01:30simultaneously, along with the 0/1 response, is a rating response:
- 01:34a Likert rating from 1 to 7, where seven would be perfectly acceptable
- 01:38(these are sentences, you know) and one would be unacceptable.
- 01:44So we have two random variables.
- 01:45And what we want to consider now is the joint distribution of these two random variables.
- 01:53Now we're not talking about only one random variable.
- 01:55If you were talking only about the 0/1 responses,
- 01:58those would be coming from a Bernoulli distribution.
- 02:02But we also have this other random variable X that is simultaneously involved in the experiment.
- 02:07So we have two that we have to consider simultaneously.
- 02:10And we have to also now consider the joint probability mass function of these two random variables.
- 02:17So graphically, you know, you can visualize this joint probability mass function as I've shown you here.
- 02:23This is the actual data from my students work.
- 02:27And so what you see here is on one axis
- 02:31you see the 0/1 responses, on the other axis
- 02:33you see the Likert responses, and on the vertical axis,
- 02:39the axis going upwards, you see the joint probability of these two.
- 02:43So what does this look like numerically?
- 02:45If we look at this figure as a table, we would see something like this.
- 02:49So, what you see in the rows are the
- 02:550/1 responses,
- 02:57with the probabilities of those for each of the Likert responses ranging from 1 to 7.
- 03:02So this we are going to call the joint probability mass function of these two random variables.
- 03:10They are both discrete random variables.
- 03:13That's why I'm saying joint probability mass function.
- 03:15If these had both been continuous random variables, then this would be a joint probability density function.
- 03:25So this is the joint probability mass function of these two random variables.
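Written out formally (the lecture presents this only as a table, so the notation here is an editorial addition using the standard definition), the joint PMF and its normalization condition are:

```latex
p_{X,Y}(x, y) = P(X = x,\, Y = y), \qquad \sum_{x=1}^{7} \sum_{y=0}^{1} p_{X,Y}(x, y) = 1
```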
- 03:29And what I want to show you now is that there are several important things that you can get out of this joint distribution.
- 03:36So the first important thing you can figure out from a joint probability mass function or probability density function is
- 03:45the marginal distribution of each of those two random variables.
- 03:48So notice that there are two random variables,
- 03:51each of those two random variables has its own probability mass function.
- 03:56And you can figure that probability mass function out by calculating the marginal distribution.
- 04:02And the way you compute it is, for example, for the variable X:
- 04:06if I want to figure out the probability mass function, the marginal distribution of the variable X, ignoring the Y values,
- 04:14what I do is I take a look at the joint distribution table that I just showed you earlier and sum over all the Y values
- 04:25that I have here.
- 04:26And once I sum up all the joint probabilities, what I'm doing is I'm marginalizing out all the Y values.
- 04:33So I get the marginal probability mass function for X, and
- 04:36you can do the same thing for Y;
- 04:39this time you marginalize out the X values.
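Here is a minimal sketch of this computation in Python. Only the value 0.018 (at X = 1, Y = 0) and the row total 0.291 match numbers quoted in this lecture; every other entry in the table is invented purely for illustration, not the actual dissertation data.

```python
import numpy as np

# Joint PMF: rows are Y (the 0/1 accuracy response),
# columns are X (the Likert rating, 1 through 7).
# Only 0.018 and the row total 0.291 come from the lecture;
# the remaining entries are illustrative placeholders.
joint = np.array([
    [0.018, 0.022, 0.030, 0.048, 0.055, 0.058, 0.060],  # Y = 0
    [0.020, 0.030, 0.050, 0.090, 0.130, 0.170, 0.219],  # Y = 1
])

p_Y = joint.sum(axis=1)  # marginalize out X -> PMF of Y: [0.291, 0.709]
p_X = joint.sum(axis=0)  # marginalize out Y -> PMF of X (seven values)

assert np.isclose(joint.sum(), 1.0)  # a proper joint PMF sums to one
```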
- 04:42So how does that work in practice?
- 04:44Let me show you how. What we're going to do now is work with this table, where
- 04:49I've added a column here and a row here.
- 04:51This column is showing you the marginal probability of the variable Y.
- 04:55And so the variable Y has only two possible outcomes, zero and one.
- 04:59So what I'm gonna do is I'm gonna take all these probabilities for each of the seven Likert responses for Y equals
- 05:07zero, sum those up,
- 05:09and I get this 0.291 here, and I can do the same thing for Y equal to one.
- 05:14You should notice that this probability mass function is a proper one, because it sums to one.
- 05:21You can confirm that.
- 05:23Similarly, I can figure out the marginal distribution, the probability mass function,
- 05:30for the random variable X,
- 05:32by computing these seven probabilities here.
- 05:37I'm summing up each column here and I'm gonna get this row here.
- 05:40And if I sum up this row of seven probabilities, it should also sum to one.
- 05:44You can check that if you like.
- 05:46So what I have now done is that I took the joint probability mass function for this bivariate distribution and I figured
- 05:53out the marginal distributions of each of those two random variables using this formula that I just showed you.
- 06:00It's pretty straightforward,
- 06:02just addition,
- 06:03nothing more.
- 06:04And you can also visualize these marginal distributions like this.
- 06:08This is your Bernoulli now, and this is your Likert.
- 06:11So everything is pretty straightforward here.
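If you want to reproduce bar plots like these yourself, here is a minimal sketch, continuing with the illustrative `joint`, `p_Y`, and `p_X` objects from the earlier snippet (again, those numbers are placeholders, not the real data):

```python
import matplotlib.pyplot as plt

fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(8, 3))
ax0.bar([0, 1], p_Y)            # the Bernoulli marginal of Y
ax0.set_xticks([0, 1])
ax0.set_xlabel("Y (0/1 accuracy)")
ax0.set_ylabel("probability")
ax1.bar(range(1, 8), p_X)       # the Likert marginal of X
ax1.set_xlabel("X (Likert rating)")
plt.tight_layout()
plt.show()
```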
- 06:14And so the other thing that you can compute with the joint probability mass function is the conditional distribution of X
- 06:25given Y,
- 06:26and the conditional distribution of Y given X.
- 06:30So how does that work now?
- 06:32Perhaps you remember this from school; if you didn't see it there, this might be the first time you're seeing
- 06:37this formula.
- 06:38But it's a pretty straightforward formula.
- 06:40The formula is the definition of conditional probability,
- 06:46and the definition states that the conditional probability of X given Y,
- 06:50where that vertical bar indicates conditioning,
- 06:54so the distribution of X given some particular value of Y, is going to be the joint distribution of X and Y divided
- 07:05by the marginal probability of that particular value of Y that we're talking about.
- 07:11I'll give you an example, of course.
- 07:13And similarly, you can reverse the Xs and Ys and get the conditional distribution of
- 07:17Y given X.
- 07:19So that's the definition of conditional probability that we're using here to compute the conditional distributions
- 07:27so these are probability distributions, probability mass functions:
- 07:32for a particular value of one variable, for example Y, I'm going to figure out the distribution of X.
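In symbols (standard notation for the definition just described, not shown verbatim on the slide):

```latex
p_{X \mid Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}, \qquad p_{Y \mid X}(y \mid x) = \frac{p_{X,Y}(x, y)}{p_X(x)}
```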
- 07:38Okay, so how do I do that?
- 07:39Let's take a look.
- 07:40Okay, so here I have the table again.
- 07:42I've got the joint distribution here and I've got the marginal distributions here: for particular values,
- 07:49for Y equals zero
- 07:50I've got this one here, for Y equal to one
- 07:52I've got this one here, and so on
- 07:54for X as well.
- 07:56So let's now figure out what the probability is of observing X equal to one, given that Y is equal to zero.
- 08:07What's the probability of X being equal to one given that Y is equal to zero?
- 08:12So, how do I do that?
- 08:13Well, I just refer to the rule that I've just shown you: I look up the joint probability of one and zero,
- 08:19so X equal to one and Y equal to zero,
- 08:20here.
- 08:21What's the joint probability of X equal to one and Y equal to zero?
- 08:25That's this one here,
- 08:260.018.
- 08:27So that's what I write here.
- 08:29And in the denominator, I'm going to figure out the probability of Y being equal to zero, because zero is the conditioning
- 08:37value that I'm looking at here.
- 08:39Okay, so what's the probability of Y being equal to zero? For this
- 08:42I have to look up the marginal distribution.
- 08:44So to compute the conditional distribution, I have to know what the marginal distributions are.
- 08:48But I have that here.
- 08:50So what is the marginal probability of Y being equal to zero?
- 08:55Well, I can just look that up.
- 08:56It's this 0.291 here.
- 08:58So I plug that in here.
- 08:59If I compute this, I get 0.062.
- 09:01I hope that's correct.
- 09:02You can check that.
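You can verify the arithmetic in one line; all three numbers here are the ones quoted in the lecture:

```python
print(round(0.018 / 0.291, 3))  # prints 0.062
```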
- 09:04And so what I've just done is that I've figured out the conditional probability of one of the X outcomes given one of
- 09:14the Y outcomes.
- 09:16So what we're basically doing now is filling out this table, and this table will then give us the conditional
- 09:23distribution of X given Y,
- 09:27for each of these possible outcomes.
- 09:29I did this one for you.
- 09:31But I strongly advise you to pause the video and quickly compute all these things.
- 09:35You basically just have to repeat the calculation that I've done here for different values of X and Y.
- 09:42And you will get the conditional distributions here.
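A sketch of that repeated calculation, again continuing with the illustrative `joint`, `p_X`, and `p_Y` from the earlier snippet:

```python
# Divide each joint probability by the appropriate marginal.
p_X_given_Y = joint / p_Y[:, None]  # row y holds p(X = x | Y = y)
p_Y_given_X = joint / p_X[None, :]  # column x holds p(Y = y | X = x)

# Each conditional distribution is itself a proper PMF:
assert np.allclose(p_X_given_Y.sum(axis=1), 1.0)
assert np.allclose(p_Y_given_X.sum(axis=0), 1.0)

print(p_X_given_Y[0, 0])  # p(X = 1 | Y = 0) = 0.018 / 0.291, about 0.062
```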
- 09:44So, that's basically the whole story with the discrete bivariate distribution. Now
- 09:52what we're going to do next is that we're going to take these ideas of the marginal distribution and the conditional distribution
- 10:00which are so easy to compute in the discrete case, because they just involve either simple summation, for the marginal, or
- 10:08simple division, for the conditional distribution.
- 10:11This is easy to understand conceptually. We're going to use these ideas to think about continuous random variables, in
- 10:17the bivariate and, more generally, the multivariate case where you have more than two random variables. That will be the
- 10:24next lecture.