6 Joint Distributions

Learning Outcomes

At the end of this chapter you should be able to:

  1. explain the concept of a joint distribution;
  2. work with joint probability mass functions;
  3. compute expectations, variances and covariances and know their properties;
  4. determine if two jointly distributed random variables are independent;
  5. determine the mean and variance of a sum of random variables;
  6. compute the correlation coefficient between two jointly distributed random variables, and know its properties.

 

 

6.1 Introduction

Frequently two or more variables are related and may change together.
Examples are:

  1. Blood pressure and blood sugar level
  2. Prey and predator numbers
  3. Percentage forest cover and rainfall

Notation

We use the notation p_{XY}(x,y) to denote the joint probability mass function of random variables X and Y, where

    \[p_{XY}(x,y) = P(X=x,Y=y).\]

The joint pmf can be tabulated; a table is the usual way of presenting the joint pmf of a pair of discrete random variables. The example below illustrates the ideas.

Joint probability mass function

We consider two random variables that change together. Their joint pmf can be presented as a table, given below.

p_{XY}(x,y)    x = -1   x = 0   x = 1   p_Y(y)
y = 0           0.10     0.05    0.20    0.35
y = 1           0.05     0.10    0.10    0.25
y = 2           0.15     0.05    0.20    0.40
p_X(x)          0.30     0.20    0.50    1

The table gives joint probabilities. Thus for example,

    \[P(X=0,Y=1) = 0.1 = p_{XY}(0,1), P(X=1,Y=2)=0.2=p_{XY}(1,2).\]

Note that p_{XY}(x,y) is a probability mass function, so it satisfies:

    \[0 \le p_{_{XY}}(x,y) \le 1\]

for each value of x and each value of y, and

    \[\sum_{{\rm all\ }x,y} p_{_{XY}}(x,y) = 1.\]

The marginal pmfs of X and Y are obtained by appropriately summing the columns of the joint table (giving the pmf of X) and the rows (giving the pmf of Y). This follows from the theorem of total probability. Thus for example,

    \begin{align*} p_X(0) &= P(X=0,Y=0)+P(X=0,Y=1)+P(X=0,Y=2) \\ &= \text{sum of second column} = 0.05 + 0.1 + 0.05 = 0.2. \end{align*}
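
For readers who want to check such tables numerically, here is a minimal Python sketch (our own illustration; the array layout and variable names are assumptions, not part of the text) that recovers the marginal pmfs of the table above by summing its columns and rows.

    # A minimal sketch (not from the text): marginal pmfs from a joint pmf table.
    # Rows correspond to y = 0, 1, 2 and columns to x = -1, 0, 1, as in the table above.
    import numpy as np

    p_xy = np.array([
        [0.10, 0.05, 0.20],   # y = 0
        [0.05, 0.10, 0.10],   # y = 1
        [0.15, 0.05, 0.20],   # y = 2
    ])

    p_x = p_xy.sum(axis=0)    # sum down each column -> pmf of X
    p_y = p_xy.sum(axis=1)    # sum along each row   -> pmf of Y

    print(p_x.round(4))           # [0.3 0.2 0.5]
    print(p_y.round(4))           # [0.35 0.25 0.4]
    print(round(p_xy.sum(), 4))   # 1.0, total probability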

Example 6.1

A parent and offspring always share exactly one allele identical by descent (IBD), while two siblings can share 0, 1 or 2 alleles IBD. The table below gives the joint distribution of IBD status for two siblings.

                  Sibling 1 (X)
Sibling 2 (Y)    x = 0    x = 1    x = 2    p_Y(y)
y = 0            0.125    0.25     0.0625   0.4375
y = 1            0        0.25     0.125    0.375
y = 2            0        0.125    0.0625   0.1875
p_X(x)           0.125    0.625    0.25     1

(a) What is the probability that

(i) Sibling 1 has more alleles than sibling 2?

(ii) the total number of alleles for the two siblings is more than 2?

(b) What is the expected total number of alleles for the siblings?

Solution

(a) (i) The event X>Y corresponds to the cells with x greater than y, namely p_{XY}(1,0) = 0.25, p_{XY}(2,0) = 0.0625 and p_{XY}(2,1) = 0.125, so P(X>Y) = 0.25 + 0.0625 + 0.125 = 0.4375.

(ii) The total is T = X+Y. The three cells that correspond to T>2 are p_{XY}(1,2), p_{XY}(2,1) and p_{XY}(2,2). Then

    \[P( T > 2) = 0.125 + 0.125 + 0.0625 = 0.3125\]
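
For a machine check of these cell sums, here is a short Python sketch (ours, not the author's; names are illustrative) that encodes the sibling table and adds up exactly the cells that satisfy each event.

    # Minimal sketch: event probabilities as sums over cells of the joint pmf table.
    import numpy as np

    x_vals = np.array([0, 1, 2])            # columns: alleles shared, sibling 1 (X)
    y_vals = np.array([0, 1, 2])            # rows:    alleles shared, sibling 2 (Y)
    p_xy = np.array([
        [0.125, 0.25,  0.0625],             # y = 0
        [0.0,   0.25,  0.125 ],             # y = 1
        [0.0,   0.125, 0.0625],             # y = 2
    ])

    X, Y = np.meshgrid(x_vals, y_vals)      # X[i, j] = x_vals[j], Y[i, j] = y_vals[i]

    print(p_xy[X > Y].sum())                # P(X > Y)     = 0.4375
    print(p_xy[X + Y > 2].sum())            # P(X + Y > 2) = 0.3125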

(b) We need E(X+Y) = E(X) + E(Y). First we obtain E(X), by multiplying each value of X (the column headings of the table) by the corresponding marginal probability (bottom row of the table) and summing the results.

    \[E(X) = 0 \times 0.125 + 1 \times 0.625 + 2 \times 0.25 = 1.125.\]

Similarly,

    \[E(Y) = 0 \times 0.4375 + 1 \times 0.375 + 2 \times 0.1875 = 0.75.\]

Then

    \[E(X+Y) = E(X) + E(Y)  = 1.125 + 0.75 = 1.875.\]

Note

In general, if h(x,y) is a function of x and y, we can compute E\left[h\left(X,Y\right)\right] by

    \[E\left[h\left(X,Y\right)\right] = \sum_x \sum_y h(x,y)\ p_{_{XY}}(x,y).\]

Thus in the above example,

    \begin{align*} E\left(X+Y\right) &= \sum_x \sum_y (x+y)\ p_{_{XY}}(x,y)\\ &= (0+0)\times 0.125 + (0+1) \times 0.25 + (0+2) \times 0.0625 + \ldots\\ &=1.875 \end{align*}

as previously.
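
The double sum is also easy to evaluate by machine. The sketch below (our own illustration, with assumed variable names) computes E(X+Y) and E(XY) for Example 6.1; the second value is the answer to the exercise that follows.

    # Minimal sketch: E[h(X, Y)] as the double sum of h(x, y) * p(x, y) over all cells.
    import numpy as np

    x_vals = np.array([0, 1, 2])            # columns
    y_vals = np.array([0, 1, 2])            # rows
    p_xy = np.array([
        [0.125, 0.25,  0.0625],
        [0.0,   0.25,  0.125 ],
        [0.0,   0.125, 0.0625],
    ])

    X, Y = np.meshgrid(x_vals, y_vals)

    def expectation(h):
        """E[h(X, Y)] computed against the joint pmf."""
        return (h(X, Y) * p_xy).sum()

    print(expectation(lambda x, y: x + y))  # E(X + Y) = 1.875
    print(expectation(lambda x, y: x * y))  # E(XY)    = 1.0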

Exercise

For Example 6.1, calculate {\rm E}(XY).

[Ans: 1]

6.2 Independent random variables

Two random variables X and Y are independent if

    \[\underbrace{p_{_{XY}}(x,y)}_{\text{Joint pmf}} = \underbrace{p_{_X}(x)\ p_{_Y}(y)}_{\text{Marginal pmfs}}\]

for each value of x and y.

[Compare with P(A\cap B) = P(A)\ P(B) for independent events.]

Thus two random variables are independent if their joint pmf can be obtained by multiplying the marginal pmfs.

Example 6.2

(a)

p_{XY}(x,y)    y = -1   y = 0   y = 1   p_X(x)
x = -2          0.06     0.04    0.10    0.20
x =  0          0.12     0.08    0.20    0.40
x =  1          0.12     0.08    0.20    0.40
p_Y(y)          0.30     0.20    0.50    1

Each entry in the table is the product of the corresponding row and column totals. Thus the random variables X and Y are independent, since p_{XY}(x,y) = p_X(x)\ p_Y(y) for each x and each y. In addition, any events defined in terms of X alone and Y alone are also independent. Thus

    \[P(X\le 0, Y \le 0) = 0.3 = P(X\le 0)\ P(Y\le 0) = 0.6\times 0.5.\]

(b)

p_{UV}(u,v)    u = -1   u = 0   u = 1   p_V(v)
v = -1          0.10     0.20    0.10    0.40
v =  0          0.10     0.05    0       0.15
v =  2          0.20     0.15    0.10    0.45
p_U(u)          0.40     0.40    0.20    1

Random variables U and V are not independent, since for example,

    \[P(U=1,V=0) = p_{UV}(1,0) = 0 \ne p_U(1)\ p_V(0) = 0.2 \times 0.15.\]

(c) In (a) above, show that

    \[P(X\ge 0, Y \le 0) = P(X\ge 0)\ P(Y \le 0).\]

Solution

    \begin{align*} P(X\ge 0, Y \le 0) &= 0.12 + 0.08 + 0.12 + 0.08 = 0.4.\\ P(X\ge 0) &= 0.8\\ P(Y \le 0) &= 0.5\\ P(X\ge 0)\times P(Y \le 0) &= 0.8 \times 0.5 = 0.4 =  P(X\ge 0, Y \le 0). \end{align*}

In general if X and Y are independent then for any sets A and B,

    \[P(X \in A, Y \in B) = P(X \in A)\ P(Y \in B).\]

[The symbol \in is read as “is in”.]
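
The defining product condition can be checked cell by cell. The sketch below (our code; the function name is an assumption) applies such a check to the tables of Example 6.2: it reports True for table (a) and False for table (b).

    # Minimal sketch: X and Y are independent iff every cell of the joint pmf
    # equals the product of the corresponding row and column marginals.
    import numpy as np

    def is_independent(p_joint):
        p_rows = p_joint.sum(axis=1, keepdims=True)   # marginal of the row variable
        p_cols = p_joint.sum(axis=0, keepdims=True)   # marginal of the column variable
        return bool(np.allclose(p_joint, p_rows * p_cols))

    # Example 6.2 (a): rows x = -2, 0, 1; columns y = -1, 0, 1.
    p_a = np.array([
        [0.06, 0.04, 0.10],
        [0.12, 0.08, 0.20],
        [0.12, 0.08, 0.20],
    ])

    # Example 6.2 (b): rows v = -1, 0, 2; columns u = -1, 0, 1.
    p_b = np.array([
        [0.10, 0.20, 0.10],
        [0.10, 0.05, 0.00],
        [0.20, 0.15, 0.10],
    ])

    print(is_independent(p_a))   # True
    print(is_independent(p_b))   # False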

6.3 Covariance

The covariance between random variables X and Y, denoted {\rm Cov}(X,Y), is defined as

    \[{\rm Cov}(X,Y) = {\rm E}\left[\left(X-{\rm E}\left(X\right)\right)\left(Y-{\rm E}\left(Y\right)\right)\right] = {\rm E}\left[\left(X-\mu_X\right)\left(Y-\mu_Y\right)\right].\]

Notes

  1. The covariance can be calculated from the joint distribution of X and Y.
  2. {\rm Cov}(X,Y) is a measure of the relationship between X and Y. If {\rm Cov}(X,Y) > 0 then Y tends to increase as X increases; similarly, if {\rm Cov}(X,Y) < 0 then Y tends to decrease as X increases.

Theorem

    \[{\rm Cov}(X,Y) = {\rm E}(XY) - {\rm E}(X){\rm E}(Y) = {\rm E}(XY) - \mu_X \ \mu_Y.\]

Proof

    \begin{align*} {\rm Cov}(X,Y) &= {\rm E}\left[(X-\mu_X)(Y-\mu_Y)\right]\\ &= {\rm E}\left[X(Y-\mu_Y) - \mu_X(Y-\mu_Y)\right]\\ &= {\rm E}(XY-\mu_Y\ X) - \mu_X\underbrace{{\rm E}(Y-\mu_Y)}_{=0}\\ &= {\rm E}(XY) - \mu_Y{\rm E}(X)\\ &= {\rm E}(XY) - \mu_X \mu_Y. \end{align*}

This form is simpler for calculating covariances.

Example 6.3

p_{XY}(x,y)    y = -5   y = 0   y = 3   p_X(x)
x = -4          0.20     0.10    0.10    0.40
x =  0          0.05     0.10    0.05    0.20
x =  5          0.10     0.20    c
p_Y(y)          0.35     0.40            1

Find

(i) the values of c

(ii) P(X = 0\mid Y = 0)

(iii) P(Y=0\mid X=0)

(iv) Cov(X,Y)

Solution

(i) Since the probabilities in the table sum to 1, we get c = 0.1. The updated table is given below.

p_{XY}(x,y)    y = -5   y = 0   y = 3   p_X(x)
x = -4          0.20     0.10    0.10    0.40
x =  0          0.05     0.10    0.05    0.20
x =  5          0.10     0.20    0.10    0.40
p_Y(y)          0.35     0.40    0.25    1

(ii)

    \[P(X=0\mid Y=0) = \frac{P(X=0, Y=0)}{P(Y=0)}= \frac{0.1}{0.4} = 0.25.\]

(iii)

    \[P(Y=0\mid X=0) = \frac{P(Y=0, X=0)}{P(X=0)}= \frac{0.1}{0.2} = 0.5.\]

(iv) In the calculation of {\rm E}(XY), the column corresponding to Y=0 and the row corresponding to X=0 contribute nothing, since each of their terms is multiplied by zero. This leaves only four nonzero terms in {\rm E}(XY).

    \begin{align*} {\rm E}(X) &= -4\times 0.4 + 0\times 0.2 + 5\times 0.4 = 0.4.\\ {\rm E}(Y) &= -5\times 0.35 + 0\times 0.4 + 3\times 0.25 = -1.\\ {\rm E}(XY) &=(-4)(-5)\times 0.2 + (-4)(3) \times 0.1 + (5)(-5) \times 0.1 + (5)(3) \times 0.1 = 1.8.\\ {\rm Cov}(X,Y) &= {\rm E}(XY)  - {\rm E}(X){\rm E}(Y)  = 1.8 - 0.4 \times (-1) = 2.2. \end{align*}
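
The shortcut formula is straightforward to automate. The following Python sketch (ours; the variable names are assumed, not from the text) recomputes these quantities from the completed table and confirms the covariance of 2.2.

    # Minimal sketch: Cov(X, Y) = E(XY) - E(X)E(Y) computed from the joint pmf table.
    import numpy as np

    x_vals = np.array([-4, 0, 5])           # rows
    y_vals = np.array([-5, 0, 3])           # columns
    p_xy = np.array([
        [0.20, 0.10, 0.10],                 # x = -4
        [0.05, 0.10, 0.05],                 # x =  0
        [0.10, 0.20, 0.10],                 # x =  5
    ])

    p_x = p_xy.sum(axis=1)                  # marginal pmf of X
    p_y = p_xy.sum(axis=0)                  # marginal pmf of Y

    E_X = (x_vals * p_x).sum()              #  0.4
    E_Y = (y_vals * p_y).sum()              # -1.0
    E_XY = (np.outer(x_vals, y_vals) * p_xy).sum()   # 1.8

    print(round(E_XY - E_X * E_Y, 4))       # Cov(X, Y) = 2.2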

Theorem

If X and Y are independent random variables then

    \[{\rm E}(XY) = {\rm E}(X)\ {\rm E}(Y)\]

Proof

We use the fact that if X and Y are independent random variables then

    \[p_{XY}(x,y) = p_X(x)\ p_Y(y).\]

    \begin{align*} {\rm E}(XY) &= \sum_x\sum_y xy p_{XY}(x,y)\\ &= \sum_x\sum_y xy p_X(x)\ p_Y(y)\\ &= \left(\sum_x x p_X(x)\right) \left(\sum_y y  p_Y(y)\right)\\ &= {\rm E}(X) {\rm E}(Y). \end{align*}

Note that in the third line we have factorised the sum into a sum involving only x and another involving only y.

Corollary

If X and Y are independent random variables then

    \[{\rm Cov}(X,Y) = 0.\]

Proof

    \begin{align*} {\rm Cov}(X,Y) &= {\rm E}(XY) - {\rm E}(X){\rm E}(Y)\\ &= \underbrace{{\rm E}(X){\rm E}(Y)}_{\text{By independence}} - {\rm E}(X){\rm E}(Y)\\ &= 0. \end{align*}

Example 6.4

As in Example 6.2(a).

p_{XY}(x,y)    y = -1   y = 0   y = 1   p_X(x)
x = -2          0.06     0.04    0.10    0.20
x =  0          0.12     0.08    0.20    0.40
x =  1          0.12     0.08    0.20    0.40
p_Y(y)          0.30     0.20    0.50    1

From Example 6.2 (a), X and Y are independent. Verify that

    \[{\rm E}(X) = 0 \quad\text{and}\quad {\rm E}(Y) = 0.2,\]

and hence find {\rm Cov}(X,Y).

Solution

From the marginal pmfs, {\rm E}(X) = -2\times 0.2 + 0\times 0.4 + 1\times 0.4 = 0 and {\rm E}(Y) = -1\times 0.3 + 0\times 0.2 + 1\times 0.5 = 0.2. Since X and Y are independent, {\rm E}(XY) = {\rm E}(X){\rm E}(Y)= 0\times 0.2 = 0, so {\rm Cov}(X,Y) = {\rm E}(XY)-{\rm E}(X){\rm E}(Y) = 0, as the corollary guarantees.

Exercise

In the above example calculate {\rm E}(XY) directly and show that it is 0.

IMPORTANT NOTE

IF X and Y are independent THEN Cov(X,Y)=0. If Cov(X,Y)=0 then X and Y are not necessarily independent.

Example 6.5

p_{XY}(x,y)    y = -1   y = 0   y = 1   p_X(x)
x = 0           0.10     0       0.10    0.20
x = 1           0.10     0.10    0.10    0.30
x = 2           0.15     0.20    0.15    0.50
p_Y(y)          0.35     0.30    0.35    1

It can easily be verified that

    \[{\rm E}(X) = 1.3, {\rm E}(Y) = 0, {\rm E}(XY) = 0,\]

so {\rm Cov}(X,Y) = 0. However, X and Y ARE NOT independent, for example,

    \[p_{XY}(0,0) = 0 \ne p_X(0)\ p_Y(0) = 0.2 \times 0.3 = 0.06.\]
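
A quick numerical check (our own sketch, with assumed names) confirms both facts at once: the covariance computed from the table is zero, yet the cell-by-cell product test of Section 6.2 fails.

    # Minimal sketch: zero covariance does not imply independence (Example 6.5).
    import numpy as np

    x_vals = np.array([0, 1, 2])            # rows
    y_vals = np.array([-1, 0, 1])           # columns
    p_xy = np.array([
        [0.10, 0.00, 0.10],
        [0.10, 0.10, 0.10],
        [0.15, 0.20, 0.15],
    ])

    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

    E_X = (x_vals * p_x).sum()
    E_Y = (y_vals * p_y).sum()
    E_XY = (np.outer(x_vals, y_vals) * p_xy).sum()

    print(round(E_XY - E_X * E_Y, 10))                   # Cov(X, Y) = 0.0
    print(bool(np.allclose(p_xy, np.outer(p_x, p_y))))   # False: not independent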

Properties of Covariance

C1. Cov(X,Y) = Cov(Y,X) (Symmetry)

C2. Cov(X,X) = {\rm Var}(X).

C3. Cov(aX,bY)= ab\ {\rm Cov}(X,Y).

C4. Cov(X+Y,Z)={\rm Cov}(X,Z) + {\rm Cov}(Y,Z).

Proof

C1.

    \[{\rm Cov}(Y,X) = {\rm E}(YX) - {\rm E}(Y){\rm E}(X) = {\rm Cov}(X,Y).\]

C2.

    \[{\rm Cov}(X,X) = {\rm E}(X.X) - {\rm E}(X){\rm E}(X) = {\rm E}\left(X^2\right) - \left[{\rm E}(X)\right]^2 = {\rm Var}(X)\]

C3.

    \[{\rm Cov}(aX,bY) = {\rm E}(aX\cdot bY) - {\rm E}(aX){\rm E}(bY) = ab\,{\rm E}(XY) - ab\,{\rm E}(X){\rm E}(Y) = ab\,{\rm Cov}(X,Y).\]

C4.

    \begin{align*} {\rm Cov}(X+Y,Z) &= {\rm E}\left[(X+Y)Z\right] - {\rm E}(X+Y){\rm E}(Z)\\ &= {\rm E}(XZ+YZ) - \left[{\rm E}(X)+{\rm E}(Y)\right]{\rm E}(Z)\\ &= {\rm E}(XZ) - {\rm E}(X){\rm E}(Z) + {\rm E}(YZ) - {\rm E}(Y){\rm E}(Z)\\ &= {\rm Cov}(X,Z) + {\rm Cov}(Y,Z). \end{align*}

The last result in C4 can be extended in the obvious way. For example,

    \begin{align*} {\rm Cov}(2X+3Y,X+2Z) &= {\rm Cov}(2X,X) + {\rm Cov}(2X,2Z) +{\rm Cov}(3Y,X) + {\rm Cov}(3Y,2Z)\\ &= 2{\rm Var}(X) +4{\rm Cov}(X,Z) + 3{\rm Cov}(Y,X) + 6{\rm Cov}(Y,Z). \end{align*}

One can also represent this calculation in tabular form, as below.

       2X                         3Y
X      Cov(X,2X) = 2 Var(X)       Cov(X,3Y) = 3 Cov(X,Y)
2Z     Cov(2Z,2X) = 4 Cov(X,Z)    Cov(2Z,3Y) = 6 Cov(Y,Z)

Example 6.6

Simplify {\rm Cov}(2X+3Y,X-2Y).

Solution

    \begin{align*} {\rm Cov}(2X+3Y, X-2Y) &= {\rm Cov}(2X, X) + {\rm Cov}(2X,-2Y) + {\rm Cov}(3Y,X) + {\rm Cov}(3Y,-2Y)\\ &= 2 {\rm Var}(X) - 4 {\rm Cov}(X,Y) + 3{\rm Cov}(X,Y) - 6{\rm Var}(Y)\\ &= 2 {\rm Var}(X) - {\rm Cov}(X,Y) - 6{\rm Var}(Y). \end{align*}
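
Identities like this can be sanity-checked by simulation. The sketch below (our own illustration, using an arbitrary simulated pair of dependent variables) compares the sample covariance of 2X+3Y and X-2Y with the right-hand side of the identity; the two values agree up to simulation error.

    # Minimal sketch: Monte Carlo check of
    # Cov(2X + 3Y, X - 2Y) = 2 Var(X) - Cov(X,Y) - 6 Var(Y).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # An arbitrary dependent pair: X standard normal, Y built from X plus noise.
    x = rng.standard_normal(n)
    y = 0.5 * x + rng.standard_normal(n)

    lhs = np.cov(2 * x + 3 * y, x - 2 * y)[0, 1]
    rhs = 2 * np.var(x, ddof=1) - np.cov(x, y)[0, 1] - 6 * np.var(y, ddof=1)

    print(lhs, rhs)   # both close to -6 for this construction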

6.4 Sums of Random Variables ◊

 

MAIN RESULTS

  1.     \[{\rm Var}(X\pm Y) = {\rm Var}(X) + {\rm Var}(Y) \pm 2\ {\rm Cov}(X,Y)\]

  2.     \[{\rm Var}(aX + bY) = a^2\ {\rm Var}(X) + b^2\ {\rm Var}(Y) + 2ab\ {\rm Cov}(X,Y)\]

  3. If X and Y are independent then

        \[{\rm Var}(X\pm Y) = {\rm Var}(X) + {\rm Var}(Y)\]

Proof

  1.     \begin{align*} {\rm Var}(X+Y) &= {\rm Cov}(X+Y,X+Y)\\ &= {\rm Cov}(X,X) + {\rm Cov}(X,Y) + {\rm Cov}(Y,X) + {\rm Cov}(Y,Y)\\ &= {\rm Var}(X) + {\rm Var}(Y) + 2{\rm Cov}(X,Y). \end{align*}

    Similarly, {\rm Var}(X-Y) = {\rm Var}(X) + {\rm Var}(Y) - 2{\rm Cov}(X,Y).

  2. Exercise. Similar to 1.
  3. If X and Y are independent then {\rm Cov}(X,Y) = 0, so from 1 above it follows that

        \[{\rm Var}(X \pm Y) = {\rm Var}(X) + {\rm Var}(Y).\]

Extension of 3.

If X_1, X_2, \ldots, X_n are independent random variables then

    \[{\rm Var}(X_1 +X_2 + \ldots + X_n) = {\rm Var}(X_1) + {\rm Var}(X_2) + \ldots + {\rm Var}(X_n).\]
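
This additivity is easy to illustrate by simulation. In the sketch below (ours, with an arbitrary choice of independent distributions) the sample variance of the sum is compared with the sum of the individual sample variances.

    # Minimal sketch: for independent X1, X2, X3,
    # Var(X1 + X2 + X3) = Var(X1) + Var(X2) + Var(X3).
    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000

    x1 = rng.binomial(10, 0.3, size=n)      # Var = 10 * 0.3 * 0.7 = 2.1
    x2 = rng.poisson(4.0, size=n)           # Var = 4
    x3 = rng.uniform(0.0, 6.0, size=n)      # Var = 6^2 / 12 = 3

    print(np.var(x1 + x2 + x3, ddof=1))     # close to 2.1 + 4 + 3 = 9.1
    print(np.var(x1, ddof=1) + np.var(x2, ddof=1) + np.var(x3, ddof=1))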

Example 6.7

The following table gives the joint pmf of the numbers (in hundreds) of single cones and double cones sold in an ice cream parlour per day; the random variable S denotes the number of single cones sold per day and D the number of double cones.

(a) Calculate the variance of

(i) the total number of cones sold per day, and (ii) the difference in the numbers of cones sold per day.

(b) The net profit on a single cone is $2 while that on a double cone is $3. What are the mean and variance of the total net profit in a day?

p_{SD}(s,d)    s = 4   s = 5   s = 6   p_D(d)
d = 3           0.15    0.05    0       0.20
d = 4           0.05    0.25    0.10    0.40
d = 5           0       0.05    0.35    0.40
p_S(s)          0.20    0.35    0.45    1

Solution

The marginal distributions of S and D are given in the table. Calculations give

    \[{\rm E}(S) = 5.25, \quad {\rm E}(D) = 4.2, \quad {\rm E}(S^2) = 28.15, \quad {\rm E}(D^2) = 18.2, \quad {\rm Var}(S) = 0.5875, \quad {\rm Var}(D) = 0.56,\]

    \[{\rm E}(SD) = 22.5, {\rm Cov}(S,D) = 0.45.\]

(a)(i) Let T denote the total number of cones sold per day, so T=S+D. Then

    \begin{align*} {\rm Var}(T) = {\rm Var}(S+D) &= {\rm Var}(S) +{\rm Var}(D) + 2{\rm Cov}(S,D)\\ & = 0.5875 + 0.56 + 2(0.45) = 2.0475, \end{align*}

that is, 20,475. Note that since the numbers of cones are in hundreds, the variance needs to be multiplied by 100^2.

(ii) Let Z denote the difference in the number of cones sold per day, so Z = S-D. Then

    \begin{align*} {\rm Var}(Z) = {\rm Var}(S-D)&= {\rm Var}(S) + {\rm Var}(D) - 2{\rm Cov}(S,D)\\ &= 0.5875 + 0.56 - 2(0.45) = 0.2475, \end{align*}

that is, 2,475.

(b) The net profit per day is P=2S+3D. Then

    \begin{align*} {\rm E}(P) &= {\rm E}(2S+3D) = 2{\rm E}(S) + 3{\rm E}(D) = 2\times 5.25 + 3\times 4.2 = 23.1\\ {\rm Var}(P) &= {\rm Var}(2S+3D) = 4{\rm Var}(S) + 9{\rm Var}(D) + (2)(2)(3){\rm Cov}(S,D)\\ &= (4)(0.5875) + (9)(0.56) + (12)(0.45)\\ &= 12.79. \end{align*}

Thus the mean profit is $2,310 per day, with a variance of 127,900 (again multiplying by 100^2, since the profit P is in hundreds of dollars), or a standard deviation of about $357.63.
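
All of these figures can be recomputed directly from the joint table. The sketch below (our own code; variable names are assumptions) reproduces the moments, the covariance and the three variances, which is a useful check on the hand calculations.

    # Minimal sketch: moments and variances for Example 6.7, computed from the
    # joint pmf table (counts are in hundreds of cones).
    import numpy as np

    d_vals = np.array([3, 4, 5])            # rows: double cones D
    s_vals = np.array([4, 5, 6])            # columns: single cones S
    p_ds = np.array([
        [0.15, 0.05, 0.00],
        [0.05, 0.25, 0.10],
        [0.00, 0.05, 0.35],
    ])

    p_d, p_s = p_ds.sum(axis=1), p_ds.sum(axis=0)

    E_S, E_D = (s_vals * p_s).sum(), (d_vals * p_d).sum()         # 5.25, 4.2
    Var_S = (s_vals**2 * p_s).sum() - E_S**2                      # 0.5875
    Var_D = (d_vals**2 * p_d).sum() - E_D**2                      # 0.56
    Cov_SD = (np.outer(d_vals, s_vals) * p_ds).sum() - E_S * E_D  # 0.45

    print(round(Var_S + Var_D + 2 * Cov_SD, 4))           # Var(S + D)   = 2.0475
    print(round(Var_S + Var_D - 2 * Cov_SD, 4))           # Var(S - D)   = 0.2475
    print(round(2 * E_S + 3 * E_D, 4))                    # E(2S + 3D)   = 23.1
    print(round(4 * Var_S + 9 * Var_D + 12 * Cov_SD, 4))  # Var(2S + 3D) = 12.79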

6.5 Correlation coefficient

We saw in Section 6.3 that covariance is a measure of the relationship between two random variables. The problem with covariance is that it is scale dependent, since

    \[{\rm Cov}(aX,bY) = ab{\rm Cov}(X,Y)\]

Thus, if X is height in metres and Y is weight in kg, while U is height in cm and V is weight in g, then

    \[{\rm Cov}(U,V) = {\rm Cov}(100X,1000Y) = 100,000{\rm Cov}(X,Y).\]

But we are still measuring a relationship between the same quantities!

For this reason, the correlation coefficient is preferred as a measure of the relationship between variables. We define the correlation coefficient as

    \[{\rm Corr}(X,Y) = \frac{{\rm Cov}(X,Y)}{\sigma_X\ \sigma_Y}\]

It can be shown that \mid{\rm Corr}(X,Y)\mid \ \le\ 1.
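
As a worked illustration (ours, not from the text), the sketch below standardises the covariance of 2.2 found in Example 6.3; the resulting correlation of about 0.17 is a fairly weak positive relationship, and it necessarily lies between -1 and 1.

    # Minimal sketch: Corr(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y) for Example 6.3.
    import numpy as np

    x_vals = np.array([-4, 0, 5])           # rows
    y_vals = np.array([-5, 0, 3])           # columns
    p_xy = np.array([
        [0.20, 0.10, 0.10],
        [0.05, 0.10, 0.05],
        [0.10, 0.20, 0.10],
    ])

    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

    E_X, E_Y = (x_vals * p_x).sum(), (y_vals * p_y).sum()
    Var_X = (x_vals**2 * p_x).sum() - E_X**2                      # 16.24
    Var_Y = (y_vals**2 * p_y).sum() - E_Y**2                      # 10.0
    Cov_XY = (np.outer(x_vals, y_vals) * p_xy).sum() - E_X * E_Y  # 2.2

    print(round(Cov_XY / np.sqrt(Var_X * Var_Y), 4))   # about 0.1726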

Exercise

  1. Prove that correlation is scale independent, that is,

        \[\mid{\rm Corr}(aX+b,cY+d)\mid \ = \ \mid{\rm Corr}(X,Y)\mid.\]

  2. For Example 6.7, calculate the correlation coefficient between the number of double and single cones sold per day.

 
