Video: Probabilistic inference and Bayes Theorem
Transcript of video
In this lecture, I'd like to introduce the idea of probabilistic inference and how we can use Bayes' Theorem to make some simple probabilistic inferences.
But first, what do we mean by probabilistic inference?
The idea is that we have some observations, but those observations don't lead to a certain prediction of the outcomes we're interested in. Maybe the observations are incomplete or have errors. Maybe the process leading to the outcomes has sources of randomness. Maybe we don't completely understand the relationship between the observations and the outcome.
In these cases, we likely cannot make perfect predictions about the resulting outcomes. In probabilistic inference, our goal is to formulate our predictions by assigning probabilities that estimate the likelihood of the possible outcomes.
Here is one example of probabilistic inference: given exposure to a certain level of a toxin, what is the probability of developing a disease?
And from that, we might want to infer what is an acceptable level of an environmental toxin? Specifically, what level of the toxin keeps the probability of the disease to an acceptable level?
A related example is in testing for a disease, as tests aren't perfect and sometimes give incorrect results. In light of the possibility of incorrect results, how do we interpret the results of a test that comes out positive for the disease?
Another example of probabilistic inference is measuring the seemingly noisy activity of a rat's brain as it navigates a maze and attempting to infer the rat's location.
A basic tool for probabilistic inference is Bayes' theorem. But, before I introduce the theorem, here's a probability problem about the flu and vaccinations.
Let's imagine that if one did not get the flu vaccine, the probability of getting the flu is 70% or 0.7.
Let's say that 40% of the population got the flu vaccine, and for those who did get the vaccine, the probability of getting the flu drops to 20% or 0.2.
In this scenario, if you meet a person who got the flu, what's the probability that this person had the flu vaccine?
Is it (A) 0.4, which is the probability that any person you meet was vaccinated?
It would seem that the fact that the person got the flu should indicate that maybe they didn't get the vaccine. It makes sense that the probability should be lower than the 40% you'd assume without knowing they had the flu. But, by how much should we lower our estimate?
Is it (B) 0.32, which is just slightly lower than the original 0.4? Or should we reduce our estimate all the way down to (C) 0.16 or even (D) 0.08?
I would imagine at this point, you can eliminate A as a possibility, but you probably don't know which of B, C or D is the right answer.
To make some progress, we'll cast the problem into the language of events and their associated probabilities.
We'll let F be the event that the person got the flu and H be the event that the person stayed healthy. We'll let V be the event that the person got the flu vaccine and U be the event that they did not.
We are looking for a particular conditional probability. Which one of these is the probability that the person who got the flu was vaccinated?
Is it (A) P(V|F), (B) P(F|V), (C) P(V) or (D) P(F)?
We're looking for (A) P(V|F), the probability of being vaccinated, given the fact that one got the flu.
But, P(V|F) isn't one of the conditional probabilities we were given in the statement of the problem. Instead which probabilities are we given in the problem?
Some other probabilities involving the events are P(V), P(U), P(H), P(F), P(F|V), P(F|U), P(H|V), and P(H|U).
We were given three probabilities in the statement of the problem. Which three of these probabilities were specified and what are their values? Pause the video to figure this out before I give you the answer.
In the statement of the problem, we were given that if you did not get the flu vaccine, the probability of getting the flu is 0.7, i.e., P(F|U) = 0.7.
We were also told that 40% of the population got the flu vaccine, so P(V) = 0.4.
Lastly, the probability of getting the flu if vaccinated is 0.2, so P(F|V) = 0.2.
Our goal is to figure out how to use these values to calculate P(V|F), or the probability of being vaccinated given the fact that one has the flu. The solution is Bayes' Theorem. But rather than stating the theorem, let's just figure out the solution (and the theorem) by filling out a contingency table.
In this contingency table, the columns indicate whether or not a person had the flu and the rows indicate whether or not they had the flu vaccine. The four inner squares are the four possible outcomes. Each of these represent the probability of one of the combinations.
Before putting numbers in the contingency table, let's label each entry with the probability it represents.
In writing these probabilities, we use a comma for AND, so the upper left corner is P(F,V), the probability that someone received the flu vaccine and got the flu. The order doesn't matter, so we could also write it as P(V,F).
The three other probabilities for the main part of the contingency table are P(H,V), the probability that a person received the vaccine and stayed healthy, P(F,U) the probability that a person did not receive the vaccine and got the flu, and P(H,U), the probability that a person did not receive the vaccine and stayed healthy.
The row sums are the total probability that a person received the vaccine P(V) or did not receive the vaccine P(U). The column sums are the total probability that someone got the flu P(F) or stayed healthy P(H).
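As a check on this structure, here is a minimal Python sketch of the contingency table, filled in with the numbers given in the problem statement (variable names like `p_F_given_V` are my own labels, not from the lecture):

```python
# Probabilities given in the problem statement.
p_V = 0.4          # P(V): probability of being vaccinated
p_U = 1 - p_V      # P(U): probability of being unvaccinated
p_F_given_V = 0.2  # P(F|V): flu probability if vaccinated
p_F_given_U = 0.7  # P(F|U): flu probability if unvaccinated

# The four inner squares of the table are the joint probabilities.
table = {
    ("F", "V"): p_F_given_V * p_V,        # P(F,V)
    ("H", "V"): (1 - p_F_given_V) * p_V,  # P(H,V)
    ("F", "U"): p_F_given_U * p_U,        # P(F,U)
    ("H", "U"): (1 - p_F_given_U) * p_U,  # P(H,U)
}

# Column sums recover P(F) and P(H); the whole table sums to 1.
p_F = table[("F", "V")] + table[("F", "U")]
p_H = table[("H", "V")] + table[("H", "U")]
```

The row sums likewise recover P(V) and P(U), just as described above.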
The probabilities we need to deal with, though, are conditional probabilities, such as P(F|V) or P(V|F). We can calculate conditional probabilities by restricting the contingency table to just those entries that include an event. For example, the contingency table conditioned on the event V that one received the vaccine ignores all outcomes that don't involve V.
We have only the first row of the contingency table. When conditioning on V, we know that an outcome in that first row must have happened, so the total probability of the row must be 1. We divide by the row total P(V) to form the contingency table conditioned on V.
In fact, this is how we define the conditional probabilities. They are the values once we divide by P(V). The probability P(F | V), or the probability of getting the flu conditioned on receiving the vaccine, is P(F,V) divided by P(V). Similarly, P(H | V), or the probability of staying healthy conditioned on receiving the vaccine, is the probability of staying healthy and receiving the vaccine, P(H,V), divided by the probability of receiving the vaccine, P(V).
We rewrite the conditional contingency table with that notation, and simplify the total to be 1.
Our desired probability, though, is conditioned on F, the event of having the flu. We can create a contingency table conditioned on F by restricting to the first column. If we know we have the flu, then we want the total of the first column to be 1. We divide by P(F) to form the probabilities conditioned on F. In particular, the probability that a person was vaccinated, conditioned on the fact that they got the flu, or P(V | F), is P(F,V) divided by P(F), or the probability of getting vaccinated and getting the flu, divided by the probability of getting the flu. We rewrite the conditional contingency table with that notation.
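This restrict-and-renormalize step can be sketched in Python, using the joint probabilities that this problem produces (the values are the ones computed later in the lecture, and the dictionary keys are my own encoding of the table cells):

```python
# Joint probabilities P(outcome, vaccination) for this problem:
# outcomes are F (flu) or H (healthy); vaccination is V or U.
table = {
    ("F", "V"): 0.08, ("H", "V"): 0.32,
    ("F", "U"): 0.42, ("H", "U"): 0.18,
}

# Condition on F: keep only the flu column, then divide by its
# total P(F) so the restricted table sums to 1.
p_F = sum(p for (outcome, _), p in table.items() if outcome == "F")
conditioned_on_F = {
    vac: p / p_F for (outcome, vac), p in table.items() if outcome == "F"
}
# conditioned_on_F["V"] is P(V|F); conditioned_on_F["U"] is P(U|F).
```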
With these definitions of conditional probabilities, we're ready to connect P(F|V) with P(V|F). Let's start with the definition of P(F|V), which is P(F,V) divided by P(V). If we multiply both sides of that equation by P(V), it becomes P(F|V) times P(V) = P(F,V).
Similarly, we can take the definition of P(V|F) and multiply through by P(F). We see that P(V|F) times P(F) is also equal to P(F,V). We have two expressions equal to the same thing, so we set them equal to each other.
The result is called Bayes' Theorem. Usually we write Bayes' Theorem solved for one of the conditional probabilities, so I'll divide through by P(F). Bayes' Theorem states that the probability of V conditioned on F is the same thing as the probability of F conditioned on V, multiplied by the probability of V and divided by the probability of F.
I've written Bayes' Theorem so that it is solved for the probability we're trying to calculate.
We're given P(F|V) and P(V), and we can use Bayes' Theorem to calculate P(V|F).
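Written as a one-line Python function (a sketch; the argument names are mine), Bayes' Theorem is:

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b
```

With A = V and B = F, calling `bayes(0.2, 0.4, p_F)` gives P(V|F) once P(F) is in hand, which is exactly the missing piece the lecture turns to next.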
OK, we're missing one piece, which is P(F), the total probability that one gets the flu. You can get this value by summing up the first column of the original contingency table. The formula is slightly messy, but you don't need to memorize it if you can remember that it is just a column total from the contingency table.
The sum of the first column is P(F,V) + P(F,U), i.e., including both possibilities of being vaccinated or unvaccinated while getting the flu. We already have a formula for P(F,V) and the formula for P(F,U) is identical.
We're given values for all the pieces in this formula, so now it's just a matter of plugging in the numbers.
Well, we're not exactly given P(U), the probability of being unvaccinated, but that's just 1 minus the probability P(V) of being vaccinated.
It's probably more enlightening to fill numbers into the contingency table rather than just plugging numbers in the formula.
First of all, we know that P(V) is 0.4, and from that we calculate that P(U) is 0.6.
Then, to fill in the first column, we multiply the given conditional probabilities by the probability of V or U: P(F,V) = P(F|V) P(V) = 0.2 × 0.4 = 0.08, and P(F,U) = P(F|U) P(U) = 0.7 × 0.6 = 0.42.
We sum the first column to get that P(F), the probability of getting the flu, is 0.5 or 50%.
Now, we have all the numbers for Bayes' Theorem. We do the calculation to determine that P(V|F) is 0.16. The probability that someone with the flu received the flu vaccination is 0.16 or 16%.
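The whole calculation fits in a few lines of Python (a sketch; the variable names are my own):

```python
# Given in the problem statement:
p_V = 0.4          # P(V): fraction of the population vaccinated
p_F_given_V = 0.2  # P(F|V): flu probability if vaccinated
p_F_given_U = 0.7  # P(F|U): flu probability if unvaccinated

p_U = 1 - p_V                                # P(U) = 0.6
p_F = p_F_given_V * p_V + p_F_given_U * p_U  # total probability: 0.5
p_V_given_F = p_F_given_V * p_V / p_F        # Bayes' Theorem: 0.16
```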
In this way, Bayes' Theorem helps you make probabilistic inference. Starting with the given probabilities of getting the flu depending on whether or not one is vaccinated, we determined how likely someone who has the flu was actually vaccinated.