Conditional probabilities

Simon James; Chris Rawson; illustrated by Erin Cheffers, Deakin University


For two events to be called independent, the occurrence of one must have no impact on the probability of the other.

Similarly, a conditional event is one whose probability depends on another event also happening. Sometimes these relationships are quite straightforward (for example, the chance of someone’s eyes being brown bears no relationship to their score on an IQ test). At other times, working out whether or not two events are independent can be a little trickier.


The result discussed in the video is highly counter-intuitive, but ultimately it comes down to the rarity of the disease and the framing of our thinking. In particular, the question we needed to ask ourselves was: given that I tested positive, what is the probability that I actually have this very, very, very rare condition? This is a key reason why, when we take a screening test, we are asked whether there is a family history of the illness being screened for, along with other questions whose answers might make it more likely that a positive result really does mean we have the illness. It is important to avoid making patients needlessly anxious when there is a high probability of a false positive.
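The reasoning can be made concrete with a short calculation using Bayes’ rule. The sketch below uses purely illustrative numbers that are not taken from the video: a prevalence of 1 in 10,000, a test that detects 99% of true cases, and a 1% false positive rate. The function name `prob_disease_given_positive` and its parameters are hypothetical, chosen only for this example.

```python
from fractions import Fraction


def prob_disease_given_positive(prevalence, sensitivity, false_positive_rate):
    """Pr(disease | positive test), computed with Bayes' rule."""
    # Share of the whole population that is ill AND tests positive
    true_positives = prevalence * sensitivity
    # Share of the whole population that is healthy AND tests positive
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Illustrative assumptions only; these figures do not come from the source.
sensitivity = Fraction(99, 100)
false_positive_rate = Fraction(1, 100)

# General population: assumed prevalence of 1 in 10,000
rare = prob_disease_given_positive(Fraction(1, 10_000), sensitivity, false_positive_rate)

# Higher-risk group (e.g. family history): assumed prevalence of 1 in 100
higher_risk = prob_disease_given_positive(Fraction(1, 100), sensitivity, false_positive_rate)

print(rare)         # 1/102
print(higher_risk)  # 1/2
```

Under these assumed numbers, a positive result in the general population corresponds to a probability of just under 1% of having the condition, while the same result in the higher-risk group corresponds to 50%. This is why background questions such as family history matter when interpreting a positive test.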

Causation, correlation and coincidence

One of the persistently tricky things about using probability to investigate and analyse real world phenomena is pinning down the precise relationship between two events or variables. This often boils down to a question of whether one thing causes another to happen or whether they just seem to go together (often because they have the same cause).

Applying direct heat to ice will cause it to melt. However, there is only a correlation between a child’s height and their performance on a standardised maths test intended for 12-year-olds. The child’s age influences both, and that is why they are correlated. Occasionally, of course, two things appear to be correlated but bear no real relationship to one another.

In real world situations, it can be difficult to figure out what is a cause and what is a correlation. Take the example of life expectancy. We know from many years of research that a person’s genetics, income, level of education, ‘health literacy’, job security, ethnic background and place of birth are all factors that affect how long most of us have on this earth. None of these factors individually determines our life expectancy, and clearly many of them are interrelated. It is also very important to keep in mind that there will always be individual variation: some people buck the trend and die much sooner or live much longer than expected. Many researchers treat these factors as a kind of group (in the context of medicine, you might have heard the term ‘risk factors’).

At some point we may need to acknowledge that the question ‘does this group of risk factors cause an increase or decline in life expectancy, or are they simply correlated?’ is difficult to answer with certainty. It should be noted, though, that some statistical inference methods and experimental methodologies are aimed at doing just this.

Conditional probability

To determine conditional probabilities, it’s usually best to think in terms of two-way tables. For example, what is the probability that a 6 has been rolled, given that the sum of two dice is 7 or greater? This is very different to asking what the probability of the dice summing to 7 or greater is, given that a 6 is rolled. We can organise the different outcomes into a table like this.

| | 6 rolled | 6 not rolled |
|---|---|---|
| Sum greater than or equal to 7 | (1,6), (2,6), (3,6), (4,6), (5,6), (6,6), (6,1), (6,2), (6,3), (6,4), (6,5) | (2,5), (3,5), (4,5), (5,5), (3,4), (4,4), (5,4), (4,3), (5,3), (5,2) |
| Sum less than 7 | (none) | (1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (2,2), (2,3), (2,4), (3,1), (3,2), (3,3), (4,1), (4,2), (5,1) |

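There are 21 outcomes where the sum is greater than or equal to 7, and out of these, 11 involve a 6. So the probability of a 6 being rolled, given that we know the sum is 7 or greater, is [latex]11/21[/latex], or about 0.52. By contrast, all 11 outcomes that involve a 6 have a sum of 7 or greater, so the probability of the sum being 7 or greater, given that a 6 is rolled, is 1.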
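The counting in the two-way table can be checked by enumerating all 36 equally likely outcomes of two fair dice. The following is a minimal sketch; the helper names `has_six`, `sum_at_least_7` and `conditional_probability` are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice
outcomes = list(product(range(1, 7), repeat=2))


def has_six(roll):
    """True if at least one of the two dice shows a 6."""
    return 6 in roll


def sum_at_least_7(roll):
    """True if the two dice sum to 7 or greater."""
    return sum(roll) >= 7


def conditional_probability(event, given):
    """Pr(event | given), found by counting equally likely outcomes."""
    given_outcomes = [roll for roll in outcomes if given(roll)]
    both = [roll for roll in given_outcomes if event(roll)]
    return Fraction(len(both), len(given_outcomes))


p_six_given_sum = conditional_probability(has_six, sum_at_least_7)
p_sum_given_six = conditional_probability(sum_at_least_7, has_six)

print(p_six_given_sum)  # 11/21
print(p_sum_given_six)  # 1
```

Swapping the roles of the event and the condition changes the answer, which is the point of distinguishing the two questions.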

On the other hand, for independent events, one event should have no bearing on the other. For example, the probability of rolling a 6 on the second die, given that the first roll was 6, is [latex]1/6[/latex], and the probability of rolling a 6 on the second die, given that the first roll was something other than 6, is also [latex]1/6[/latex]. For this reason we sometimes describe independence in terms of a conditional probability being unaffected, especially when looking at events relating to real world phenomena or statistical data (e.g. the probability of someone having type O blood is independent of whether they are male or female, so we would say that blood type and gender are independent).
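The two-dice example can be verified in the same counting style. This is a small sketch, with the hypothetical helper `prob_second_is_six` restricting attention to the outcomes where the first die meets some condition.

```python
from fractions import Fraction
from itertools import product

# Each outcome is a pair (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))


def prob_second_is_six(first_die_condition):
    """Pr(second die is 6 | first die satisfies the condition)."""
    matching = [(a, b) for a, b in outcomes if first_die_condition(a)]
    sixes = [(a, b) for a, b in matching if b == 6]
    return Fraction(len(sixes), len(matching))


given_first_six = prob_second_is_six(lambda a: a == 6)
given_first_not_six = prob_second_is_six(lambda a: a != 6)
unconditional = prob_second_is_six(lambda a: True)

print(given_first_six, given_first_not_six, unconditional)  # 1/6 1/6 1/6
```

All three values agree, so knowing the first roll tells us nothing about the second.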

Independence based on conditional probability

If we denote the conditional probability of an event B, given that an event A has already occurred, by [latex]Pr(B|A)[/latex] (read as ‘the probability of B given A’), then the events A and B are considered independent if [latex]Pr(B|A) = Pr(B)[/latex].

If we have calculated our probabilities using experiments or from data, we might treat two events as independent if this equation holds approximately.
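As a rough illustration of checking the condition approximately, the sketch below simulates rolls of two fair dice and compares the estimated [latex]Pr(B|A)[/latex] with the estimated [latex]Pr(B)[/latex], where A is ‘first die shows 6’ and B is ‘second die shows 6’. The function name, the number of rolls, the seed and the tolerance of 0.02 are all assumptions made for this example; in practice a formal statistical test would take the sample size into account.

```python
import random


def estimate_probabilities(num_rolls=100_000, seed=0):
    """Estimate Pr(B) and Pr(B | A) from simulated rolls of two dice.

    A: the first die shows 6.  B: the second die shows 6.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    count_a = 0
    count_b = 0
    count_a_and_b = 0
    for _ in range(num_rolls):
        first = rng.randint(1, 6)
        second = rng.randint(1, 6)
        if first == 6:
            count_a += 1
        if second == 6:
            count_b += 1
        if first == 6 and second == 6:
            count_a_and_b += 1
    p_b = count_b / num_rolls
    p_b_given_a = count_a_and_b / count_a
    return p_b, p_b_given_a


p_b, p_b_given_a = estimate_probabilities()

tolerance = 0.02  # assumed threshold for "approximately equal"
looks_independent = abs(p_b_given_a - p_b) < tolerance

print(p_b, p_b_given_a, looks_independent)
```

Both estimates should land close to [latex]1/6[/latex], though they will rarely match exactly because of sampling variation.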


License

Creative Commons Attribution-NonCommercial 4.0 International License
