Video: An introduction to probability

Video links

This video is found in the pages

An introduction to probability

Transcript of video

Welcome to our introductory lecture on probability.

To introduce the concepts of probability, imagine that you flip two fair coins. By fair coin, I mean a coin that has an equal probability of landing on either Heads or Tails.

Imagine also that you can distinguish between the two coins. Say one coin is a quarter and the other is a nickel. We'll refer to the coins as coin 1 and coin 2.

We'll call flipping the pair of coins an experiment. When we perform the coin flipping experiment there are four possible outcomes. Both coins could turn up heads. The first coin could be heads and the second tails. Or, vice versa, we could get tails and then heads. Lastly, we could get two tails.

These are the only possible outcomes in our experiment. (We won't allow for the possibility of a coin landing on its side.) We call this space of all possible outcomes the sample space. Our sample space for the two coin flipping experiment has four possible outcomes. We expect that all four outcomes are equally likely.

We define an event as a subset of outcomes. We have simple events that consist of just one outcome. For example, we'll let A be the event that both coins turn up heads.

An event could contain multiple outcomes. We could let the event B be the event that both coins landed on the same side. This event has two outcomes: either both heads or both tails.

Another event, let's call it C, is the event that the second coin landed on heads. This event also has two outcomes, one for each of the two possibilities of coin 1.

We assign a probability to each event. The probability of event A, which we denote P(A), is one quarter. It contains just one outcome. Since we had four equally likely outcomes, each outcome occurs with probability one quarter.

Event B is composed of two outcomes, so it is twice as likely as event A. The probability of event B is one half. Similarly, P(C), the probability of event C, is also one half, as it contains half of the equally likely outcomes.
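
The counting argument above can be sketched in a few lines of Python. This is only an illustration under the video's assumption of four equally likely outcomes; the names `sample_space`, `prob`, and `probability` are made up for this sketch and are not part of the video.

```python
from fractions import Fraction
from itertools import product

# Each outcome is a pair (coin 1, coin 2), with "H" for heads and "T" for tails.
sample_space = list(product("HT", repeat=2))

# Fair coins: all four outcomes are equally likely.
prob = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}

def probability(event):
    """Add up the probabilities of the outcomes that make up an event."""
    return sum(prob[outcome] for outcome in event)

A = {("H", "H")}                                  # both coins heads
B = {o for o in sample_space if o[0] == o[1]}     # both coins on the same side
C = {o for o in sample_space if o[1] == "H"}      # second coin heads

print(probability(A), probability(B), probability(C))
```

Using exact fractions rather than floating-point numbers keeps the sums exact, so comparing a probability with one quarter or one half is safe.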

One way to represent the probabilities of all outcomes is through a contingency table. Here, we'll use the columns to represent coin 1 and the rows to represent coin 2. Since we are modeling the four outcomes as being equally likely, we put one quarter in each of the four cells.

We can gain more insight by adding totals to the contingency table. Here we see that indeed both coins are fair. The first column is the event that the first coin is heads, and its total probability is one half. The second column, or the probability for the first coin landing on tails, is one half. Similarly, the row totals give the probabilities of heads or tails for the second coin. Importantly, if you sum up the row totals or sum up the column totals, you'd better get 1. The overall total probability of any one of the outcomes occurring must be 1. If you don't get 1 here, then something went wrong.
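
As a rough sketch of how those totals are computed, here is the fair-coin table as a nested list, with rows for coin 2 and columns for coin 1 as in the video. The helper name `with_totals` is hypothetical.

```python
from fractions import Fraction

def with_totals(table):
    """Return (row_totals, column_totals, grand_total) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)

quarter = Fraction(1, 4)
fair = [[quarter, quarter],   # row 1: coin 2 is heads
        [quarter, quarter]]   # row 2: coin 2 is tails

rows, cols, total = with_totals(fair)
print(rows, cols, total)
```

Each row total and each column total comes out to one half, and the grand total is 1.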

We can use the contingency table to represent different types of experiments, rather than flipping two fair coins. Let's imagine that I have some trick coins so that the four outcomes are no longer equally likely. Instead, I can specify different probabilities for the four outcomes of the sample space.

Here is the contingency table for some coins where getting two heads is highly unlikely. Rather than happening with probability one quarter, getting two heads happens with probability 0.04 or only one 25th of the time. On the other hand, these coins show up with two tails more than half of the time, with probability 0.64. Getting either heads then tails or tails then heads is also less likely than we'd expect, as each of those outcomes occurs with probability 0.16.

But, I have more tricks up my sleeve. I have another pair of coins whose probabilities are shown in contingency table B. Here getting two heads or two tails happens more rarely than you'd expect.

And, contingency table C shows the results for another pair of trick coins, where getting two heads or two tails happens more frequently than you'd expect.

Lastly, contingency table D shows a fourth pair of trick coins. Here, getting two heads or two tails is only slightly less likely than you'd expect.

Oops, wait. I think I made a mistake. It seems like I completely goofed up on one of the contingency tables. One of them doesn't make any sense. Which one is it?

Pause the video, if you need to, to determine the faulty contingency table before I tell you the answer.

To find the bad contingency table, we can calculate the totals. Everything seems to add up correctly for the first three tables. But, in table D, the total probability is 1.2. The total probability that one of the outcomes of the sample space occurs has to be 1. We can toss out table D as completely bogus.
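
The check described here is easy to automate. In the sketch below, `table_a` uses the four probabilities quoted for table A. The transcript does not give the entries of table D, so `hypothetical_bad` is an invented table that happens to sum to 1.2, not the one shown in the video. The function name `is_valid_table` is also an assumption of this sketch.

```python
from fractions import Fraction

def is_valid_table(table):
    """A table is valid if no entry is negative and all entries sum to 1."""
    cells = [p for row in table for p in row]
    return all(p >= 0 for p in cells) and sum(cells) == 1

# Table A from the video: rows are coin 2, columns are coin 1.
table_a = [[Fraction("0.04"), Fraction("0.16")],
           [Fraction("0.16"), Fraction("0.64")]]

# Invented example (NOT the video's table D): four cells of 0.3 sum to 1.2.
hypothetical_bad = [[Fraction("0.3"), Fraction("0.3")],
                    [Fraction("0.3"), Fraction("0.3")]]

print(is_valid_table(table_a))
print(is_valid_table(hypothetical_bad))
```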

Let's look more closely at the remaining three contingency tables. I mentioned that the original pair of coins were fair coins, as they were equally likely to come up heads or tails. Which one of these contingency tables represents unfair coins, with unequal probabilities of heads or tails?

Clearly, table A represents unfair coins, as tails are more likely than heads. In particular, by looking at the column sums, we see that the first coin comes up heads only 20% of the time and lands on tails 80% of the time. The row sums show the same behavior for the second coin. It's clear these coins are weighted so that they are four times more likely to land on tails than heads. These coins are definitely unfair.
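
To see where the 20% and 80% come from, here is a small sketch that sums the columns and rows of table A, using the four probabilities quoted earlier. The variable names are illustrative only.

```python
from fractions import Fraction

# Table A: rows are coin 2 (heads, tails), columns are coin 1 (heads, tails).
table_a = [[Fraction("0.04"), Fraction("0.16")],   # coin 2 is heads
           [Fraction("0.16"), Fraction("0.64")]]   # coin 2 is tails

# Column sums describe coin 1; row sums describe coin 2.
p_coin1_heads, p_coin1_tails = (sum(col) for col in zip(*table_a))
p_coin2_heads, p_coin2_tails = (sum(row) for row in table_a)

print(float(p_coin1_heads), float(p_coin1_tails))
print(float(p_coin2_heads), float(p_coin2_tails))

# How many times more likely is tails than heads for coin 1?
print(p_coin1_tails / p_coin1_heads)
```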

What about cases B and C? The row and column sums are identical to the totals we got for our standard pair of coins. Each row and column sums up to one half. The column sums demonstrate that coin 1 is equally likely to be heads or tails. The row sums demonstrate that coin 2 is also equally likely to be heads or tails. But, clearly the coins in both case B and case C are behaving strangely. In case B, the coins don't seem to like to land on the same side. In case C, the coins seem to prefer to land on the same side.

Let's look more closely at case B. Let's define the four events that correspond to each coin landing on heads or tails.

H_1 is the event that coin 1 lands on heads. It is composed of the two outcomes from the first column: heads followed by heads, and heads followed by tails. The total probability of event H_1 is one half.

T_1 is the event that coin 1 lands on tails and is composed of the second column's outcomes: tails followed by heads, and tails followed by tails. The probability that the first coin lands on tails is one half.

We also define two events for the second coin, which correspond to the rows of the contingency table. H_2, the event that coin 2 lands on heads, is the first row. The probability of H_2 is one half. T_2, the event that coin 2 lands on tails, is the second row. Its probability is also one half.

Here's a puzzle for you to figure out. I flip the coins and don't show you the result. As we've discussed, the probability that the second coin landed on heads is 50%. The coins are fair so that, a priori, P(H_2) is one half.

But now let's imagine that I give you some additional information. I show you the result of the first coin and you see that the first coin landed on heads. In other words, you know that the event H_1 has occurred.

If the coin flipping experiment is described by contingency table B, how does the additional information that coin 1 is heads change your estimate of the probability that coin 2 is heads? In math terms, we ask what is the probability of the event H_2 conditioned on the event H_1?

Why don't you pause the video and see if you can figure it out?

The probability of H_2 conditioned on H_1 is 0.2 or 20%. Before I gave you any extra information, you thought the probability of H_2 was 50%. But, once I tell you that the first coin landed on heads, the probability that the second coin is heads drops from 0.5 to 0.2.

How did we determine that probability? The contingency table gives you the probabilities for all four outcomes. But, when I tell you that the first coin was heads, the universe of possible outcomes shrinks. You can now neglect the second column and focus solely on the first column corresponding to outcomes where the first coin was heads. Since you know an outcome in the first column must have occurred, the probabilities in the first column must now sum to 1. We rescale the first column by dividing by its total, 0.5, to produce a new contingency table conditioned on the observation that the first coin was heads. Now we see that the second coin has a probability 0.2 of being heads and a probability of 0.8 of being tails. In particular, now that we have conditioned on H_1, the probability of H_2 has dropped to 0.2.
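
Here is a minimal sketch of that rescaling step. The transcript never reads out the entries of table B, so the values below are an assumption: they are the ones implied by the stated facts that every row and column sums to one half and that the probability of H_2 conditioned on H_1 is 0.2. The function name `condition_on_coin1` is hypothetical.

```python
from fractions import Fraction

# Joint probabilities keyed by (coin 1, coin 2).
# ASSUMED entries for table B, inferred from the totals and the 0.2 result.
table_b = {
    ("H", "H"): Fraction("0.1"),
    ("H", "T"): Fraction("0.4"),
    ("T", "H"): Fraction("0.4"),
    ("T", "T"): Fraction("0.1"),
}

def condition_on_coin1(table, side):
    """Keep the outcomes where coin 1 shows `side`, then rescale them to sum to 1."""
    kept = {o: p for o, p in table.items() if o[0] == side}
    total = sum(kept.values())          # P(coin 1 shows `side`), here 0.5
    return {o: p / total for o, p in kept.items()}

given_h1 = condition_on_coin1(table_b, "H")
p_h2_given_h1 = given_h1[("H", "H")]
p_t2_given_h1 = given_h1[("H", "T")]
print(float(p_h2_given_h1), float(p_t2_given_h1))
```

Dividing by the column total is the same as computing P(H_1 and H_2) / P(H_1), which here is 0.1 / 0.5.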

These coins are somehow being influenced by each other. Maybe they are attached by an invisible spring? But, whatever the mechanism, contingency table B shows that the results of the two coin flips are not independent of each other.
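
One common way to test for independence is to check whether each cell of the table equals the product of its row total and its column total. The sketch below applies that test to the fair-coin table and to table B. As before, the entries of table B are assumed values inferred from the transcript, and `is_independent` is a name invented for this sketch.

```python
from fractions import Fraction

def is_independent(table):
    """True if every cell equals (its row total) x (its column total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return all(
        table[i][j] == row_totals[i] * col_totals[j]
        for i in range(len(table))
        for j in range(len(table[0]))
    )

quarter = Fraction(1, 4)
fair = [[quarter, quarter],
        [quarter, quarter]]

# ASSUMED entries for table B (rows: coin 2, columns: coin 1).
table_b = [[Fraction("0.1"), Fraction("0.4")],
           [Fraction("0.4"), Fraction("0.1")]]

print(is_independent(fair))
print(is_independent(table_b))
```

For table B, the two-heads cell is 0.1 while the product of its totals is 0.5 times 0.5, or 0.25, so the test fails.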