- 00:00What we've done so far is that we've looked at random variables like the Bernoulli and the binomial, and, as a continuous example,
- 00:09the normal distribution.
- 00:10These are univariate distributions, because there's only one variable involved in the examples that I showed you.
- 00:18But we can have situations where you have more than one variable involved.
- 00:23This can happen in an experiment or in a data collection situation.
- 00:28So an example would be if you're measuring the heights and weights of a sample from a population.
- 00:35Now you have two variables that you're looking at.
- 00:37Each of these two variables is assumed to have a random variable associated with it.
- 00:43That means it has either a probability mass function or probability density function associated with it.
- 00:49So now when we're talking about two or even more than two random variables being considered simultaneously, then we are in
- 00:56the world of bivariate or multivariate distributions.
- 00:59So I'm going to start by talking about the discrete case first.
- 01:02Okay, so if you look here, this data comes from a former student's PhD dissertation; there are two variables
- 01:13involved in this experiment.
- 01:14This is a psycholinguistic experiment.
- 01:16And what we did was we collected accuracy responses, which are 0/1 responses.
- 01:21This of course comes from the Bernoulli.
- 01:21Now, you can assume a Bernoulli likelihood there, and the other response that we're getting from each participant
- 01:30simultaneously, along with the 0/1 response, is a rating response:
- 01:34a Likert rating from 1 to 7, where seven would be perfectly acceptable
- 01:38(these are sentences, you know) and one would be unacceptable.
- 01:44So we have two random variables.
- 01:45And what we want to consider now is the joint distribution of these two random variables.
- 01:53Now we're not talking about only one random variable.
- 01:55If you were talking only about the 0/1 responses,
- 01:58those would be coming from a Bernoulli distribution.
- 02:02But we also have this other random variable X that is simultaneously involved in the experiment.
- 02:07So we have two that we have to consider simultaneously.
- 02:10And we have to also now consider the joint probability mass function of these two random variables.
- 02:17So graphically, you know, you can visualize this joint probability mass function as I've shown you here.
- 02:23This is the actual data from my students work.
- 02:27And so what you see here is on one axis
- 02:31you see the 0/1 responses, on the other axis
- 02:33you see the Likert responses, and on the vertical axis,
- 02:39the axis going upwards, you see the joint probability of these two.
- 02:43So what does this look like numerically?
- 02:45If we look at this figure as a table, we would see something like this.
- 02:49So, what you see in the rows are the
- 02:550/1 responses,
- 02:57with the probabilities of those for each of the Likert responses ranging from 1 to 7.
- 03:02So this we are going to call the joint probability mass function of these two random variables.
- 03:10They are both discrete random variables.
- 03:13That's why I'm saying joint probability mass function.
- 03:15If these had both been continuous random variables, then this would be a joint probability density function.
- 03:25So this is the joint probability mass function of these two random variables.
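Written out formally (the lecture presents this only as a table, so the notation here is an editorial addition using the standard definition), the joint PMF and its normalization condition are:

```latex
p_{X,Y}(x, y) = P(X = x,\, Y = y), \qquad \sum_{x=1}^{7} \sum_{y=0}^{1} p_{X,Y}(x, y) = 1
```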
- 03:29And what I want to show you now is that there are several important things that you can get out of this joint distribution.
- 03:36So the first important thing you can figure out from a joint probability mass function or probability density function is
- 03:45the marginal distribution of each of those two random variables.
- 03:48So notice that there are two random variables,
- 03:51each of those two random variables has its own probability mass function.
- 03:56And you can figure that probability mass function out by calculating the marginal distribution.
- 04:02And the way you compute it is, for example, for the variable X:
- 04:06if I want to figure out the probability mass function, the marginal distribution of the variable X, ignoring the Y values,
- 04:14what I do is I take a look at the joint distribution table that I just showed you earlier and sum over all the Y values
- 04:25that I have here.
- 04:26And once I sum up all the joint probabilities, what I'm doing is I'm marginalizing out all the Y values.
- 04:33So I get the marginal probability mass function for X, and
- 04:36you can do the same thing for Y;
- 04:39this time you marginalize out the X values.
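Here is a minimal sketch of this computation in Python. Only the value 0.018 (at X = 1, Y = 0) and the row total 0.291 match numbers quoted in this lecture; every other entry in the table is invented purely for illustration, not the actual dissertation data.

```python
import numpy as np

# Joint PMF: rows are Y (the 0/1 accuracy response),
# columns are X (the Likert rating, 1 through 7).
# Only 0.018 and the row total 0.291 come from the lecture;
# the remaining entries are illustrative placeholders.
joint = np.array([
    [0.018, 0.022, 0.030, 0.048, 0.055, 0.058, 0.060],  # Y = 0
    [0.020, 0.030, 0.050, 0.090, 0.130, 0.170, 0.219],  # Y = 1
])

p_Y = joint.sum(axis=1)  # marginalize out X -> PMF of Y: [0.291, 0.709]
p_X = joint.sum(axis=0)  # marginalize out Y -> PMF of X (seven values)

assert np.isclose(joint.sum(), 1.0)  # a proper joint PMF sums to one
```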
- 04:42So how does that work in practice?
- 04:44Let me show you how. What we're going to do now is work with this table, where
- 04:49I've added a column here and a row here.
- 04:51This column is showing you the marginal probability of the variable Y.
- 04:55And so the variable Y has only two possible outcomes, zero and one.
- 04:59So what I'm gonna do is I'm gonna take all these probabilities for each of the seven Likert responses for Y equals
- 05:07zero, sum those up,
- 05:09and I get this 0.291 here, and I can do the same thing for Y equal to one.
- 05:14You should notice that this probability mass function is a proper one, because it sums to one.
- 05:21You can confirm that.
- 05:23Similarly, I can figure out the marginal distribution, the probability mass function,
- 05:30for the random variable X,
- 05:32by computing these seven probabilities here.
- 05:37I'm summing up each column here and I'm gonna get this row here.
- 05:40And if I sum up this row of seven probabilities, it should also sum to one.
- 05:44You can check that if you like.
- 05:46So what I have now done is that I took the joint probability mass function for this bivariate distribution and I figured
- 05:53out the marginal distributions of each of those two random variables using this formula that I just showed you.
- 06:00It's pretty straightforward,
- 06:02just addition,
- 06:03nothing more.
- 06:04And you can also visualize these marginal distributions like this.
- 06:08This is your Bernoulli now, and this is your Likert.
- 06:11So everything is pretty straightforward here.
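If you want to reproduce bar plots like these yourself, here is a minimal sketch, continuing with the illustrative `joint`, `p_Y`, and `p_X` objects from the earlier snippet (again, those numbers are placeholders, not the real data):

```python
import matplotlib.pyplot as plt

fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(8, 3))
ax0.bar([0, 1], p_Y)            # the Bernoulli marginal of Y
ax0.set_xticks([0, 1])
ax0.set_xlabel("Y (0/1 accuracy)")
ax0.set_ylabel("probability")
ax1.bar(range(1, 8), p_X)       # the Likert marginal of X
ax1.set_xlabel("X (Likert rating)")
plt.tight_layout()
plt.show()
```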
- 06:14And so the other thing that you can compute with the joint probability mass function is the conditional distribution of X
- 06:25given Y,
- 06:26and the conditional distribution of Y given X.
- 06:30So how does that work now?
- 06:32Perhaps you remember this from school; if you didn't see it there, this might be the first time you're seeing
- 06:37this formula.
- 06:38But it's a pretty straightforward formula.
- 06:40The formula is the definition of conditional probability,
- 06:46and the definition states that the conditional probability of X given Y,
- 06:50where that vertical bar indicates conditioning,
- 06:54so the distribution of X given some particular value of Y, is going to be the joint distribution of X and Y divided
- 07:05by the marginal probability of that particular value of Y that we're talking about.
- 07:11I'll give you an example, of course.
- 07:13And similarly, you can reverse the Xs and Ys and get the conditional distribution of
- 07:17Y given X.
- 07:19So that's the definition of conditional probability that we're using here to compute the conditional distributions
- 07:27so these are probability distributions, probability mass functions:
- 07:32for a particular value of one variable, for example Y, I'm going to figure out the distribution of X.
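In symbols (standard notation for the definition just described, not shown verbatim on the slide):

```latex
p_{X \mid Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}, \qquad p_{Y \mid X}(y \mid x) = \frac{p_{X,Y}(x, y)}{p_X(x)}
```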
- 07:38Okay, so how do I do that?
- 07:39Let's take a look.
- 07:40Okay, so here I have the table again.
- 07:42I've got the joint distribution here and I've got the marginal distributions here: for particular values,
- 07:49for Y equals zero
- 07:50I've got this one here, for Y equal to one
- 07:52I've got this one here, and so on
- 07:54for X as well.
- 07:56So let's now figure out what the probability is of observing X equal to one, given that Y is equal to zero.
- 08:07What's the probability of X being equal to one given that Y is equal to zero?
- 08:12So, how do I do that?
- 08:13Well, I just refer to the rule that I've just shown you: I look up the joint probability of one and zero,
- 08:19so X equal to one and Y equal to zero,
- 08:20here.
- 08:21What's the joint probability of X equal to one and Y equal to zero?
- 08:25That's this one here,
- 08:260.018.
- 08:27So that's what I write here.
- 08:29And in the denominator, I'm going to figure out the probability of Y being equal to zero, because zero is the conditioning
- 08:37value that I'm looking at here.
- 08:39Okay, so what's the probability of Y being equal to zero? For this
- 08:42I have to look up the marginal distribution.
- 08:44So to compute the conditional distribution, I have to know what the marginal distributions are.
- 08:48But I have that here.
- 08:50So what is the marginal probability of Y being equal to zero?
- 08:55Well, I can just look that up.
- 08:56It's this 0.291 here.
- 08:58So I plug that in here.
- 08:59If I compute this, I get 0.062.
- 09:01I hope that's correct.
- 09:02You can check that.
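You can verify the arithmetic in one line; all three numbers here are the ones quoted in the lecture:

```python
print(round(0.018 / 0.291, 3))  # prints 0.062
```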
- 09:04And so what I've just done is that I've figured out the conditional probability of one of the X outcomes given one of
- 09:14the Y outcomes.
- 09:16So what we're basically doing now is filling out this table, and this table will then give us the conditional
- 09:23distribution of X given Y,
- 09:27for each of these possible outcomes.
- 09:29I did this one for you.
- 09:31But I strongly advise you to pause the video and quickly compute all these things.
- 09:35You basically just have to repeat the calculation that I've done here for different values of X and Y.
- 09:42And you will get the conditional distributions here.
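A sketch of that repeated calculation, again continuing with the illustrative `joint`, `p_X`, and `p_Y` from the earlier snippet:

```python
# Divide each joint probability by the appropriate marginal.
p_X_given_Y = joint / p_Y[:, None]  # row y holds p(X = x | Y = y)
p_Y_given_X = joint / p_X[None, :]  # column x holds p(Y = y | X = x)

# Each conditional distribution is itself a proper PMF:
assert np.allclose(p_X_given_Y.sum(axis=1), 1.0)
assert np.allclose(p_Y_given_X.sum(axis=0), 1.0)

print(p_X_given_Y[0, 0])  # p(X = 1 | Y = 0) = 0.018 / 0.291, about 0.062
```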
- 09:44So, that's basically the whole story with the discrete bivariate distribution. Now
- 09:52what we're going to do next is that we're going to take these ideas of the marginal distribution and the conditional distribution
- 10:00which are so easy to compute in the discrete case, because they just involve either simple summation, for the marginal, or
- 10:08simple division, for the conditional distribution.
- 10:11This is easy to understand conceptually. We're going to use these ideas to think about continuous random variables, in
- 10:17the bivariate and, more generally, the multivariate case where you have more than two random variables. That will be the
- 10:24next lecture.