-
Imagine that we are cross-pollinating two plants, and we are interested in a particular gene that has two variants, or alleles, in these plants. Let's denote the two gene variants by A and a. As is true for most genes, each individual plant has two copies of this gene. If both copies of the gene in a particular plant are allele A, we say the genotype of the plant is AA. If, on the other hand, the plant has two copies of allele a, we say its genotype is aa. The last possibility is that a plant is genotype Aa, which means the plant has one copy of allele A and one copy of allele a.
Suppose we take two plants of genotype Aa. We'll pollinate one of these plants with the other, creating offspring that have one allele from each parent plant. In this context, an “experiment” is a pollination resulting in a single offspring having one allele from each parent.
In probability terminology, the sample space is the set of all possible outcomes of an experiment. In this case, the outcome will be the genotype of the offspring. List the genotypes that make up the sample space for our experiment:
. (Separate the genotypes by commas.)
An event is an outcome or a set of outcomes. For example, the event that the offspring also has genotype Aa consists of a single outcome (and hence can be called a simple event). On the other hand, the event that the offspring has at least one copy of the A allele includes two possible outcomes: genotype AA and genotype Aa.
To form a probability model, we assign a probability to each event that indicates how likely that event is to occur. There are some intuitive rules that a probability model must obey. For instance, consider the event that the outcome is one of the outcomes in the sample space. Is it possible for this event not to happen? (Recall the definition of sample space.)
In order to write some of these rules, we need some terminology from set theory.
-
Let's consider several different events in our cross-pollination example. Say event A is the offspring having at least one copy of allele A, event B is the offspring having at least one copy of allele a, event C is the offspring having genotype aa, and event D is the offspring having genotype AA. What are the specific genotypes (or outcomes) in each of these events?
event A:
event B:
event C:
event D:
-
Observe that A and B have a common genotype,
. This reflects the set-theoretic operation of intersection: the intersection of A and B is everything that is in both A and B. In other words, the intersection of A and B is the set of outcomes for which both events A and B occur. Symbolically, this is written as A $\cap$ B and read "A intersect B".
Are there any outcomes in A $\cap$ C?
In this case, the sets A and C are called disjoint, and the intersection is called the empty set, or the null set, which is written with the symbol $\emptyset$. In probability language, we say two events are mutually exclusive if their intersection is the null set, because the occurrence of one event excludes the possibility of the other event.
-
Suppose we want to describe the event of having two copies of the same allele. There are two ways we can think of this. We can think of it as the combination of events C and D, where any outcome in C or D works, or we can think of it as the opposite of having one of each allele. Let's first look at the combination of events C and D. The union of C and D is the set of everything that is in C or in D. In the language of probability, the union of C and D is the event containing all outcomes in C or in D. Symbolically, we write the union of C and D as C $\cup$ D and read it "C union D". What are the genotypes in C $\cup$ D?
What is A $\cup$ B?
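One way to check your answers for the intersections and unions above is to model each event as a Python set. This is only a minimal sketch: the variable names (`sample_space`, `event_A`, and so on) are illustrative choices, not part of the exercise, and the genotype strings assume the heterozygote is always written "Aa".

```python
# Genotypes are written as strings; "Aa" stands for the heterozygote
# regardless of which parent contributed which allele (an assumption
# of this sketch).
sample_space = {"AA", "Aa", "aa"}

# Events are subsets of the sample space, built from their verbal definitions.
event_A = {g for g in sample_space if "A" in g}  # at least one copy of allele A
event_B = {g for g in sample_space if "a" in g}  # at least one copy of allele a
event_C = {"aa"}                                 # genotype aa
event_D = {"AA"}                                 # genotype AA

# "&" is set intersection, "|" is set union.
print("A intersect B:", event_A & event_B)
print("A intersect C:", event_A & event_C)
print("A and C mutually exclusive?", event_A.isdisjoint(event_C))
print("C union D:", event_C | event_D)
print("A union B equals the sample space?", (event_A | event_B) == sample_space)
```

Running the block prints each set, so you can compare the output with what you wrote down by hand.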
-
The idea of taking the "opposite" of an event is called the complement. The complement of a set is everything outside of the set. In the case of probability, the complement of an event is the event containing every other outcome in the sample space. We write the complement of A as A$^c$, which is read "A complement". What is A$^c$?
What is B$^c$?
What is C$^c$?
-
Let's look briefly at how these operations interact. In the following, write everything in terms of the events X, Y, the null set, and the sample space. These identities apply to all events, and you may find it helpful to think them through with the specific events given above. Write null for the null set and S for the sample space.
(X$^c$)$^c =$
X $\cap$ X$^c =$
X $\cup$ X$^c =$
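If you would like to test your answers to these identities, the following sketch evaluates all three expressions for every possible event X in the cross-pollination sample space. The helper names `all_events` and `complement` are hypothetical, introduced only for this illustration. Checking a three-outcome sample space does not prove the identities in general, but it is a quick way to spot a wrong guess.

```python
from itertools import combinations

S = frozenset({"AA", "Aa", "aa"})  # sample space
null = frozenset()                 # the null (empty) set

def all_events(space):
    """Return every subset of the sample space, i.e. every possible event."""
    items = sorted(space)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(event, space=S):
    """Everything in the sample space that is not in the event."""
    return space - event

events = all_events(S)

# For each event X, show X, (X^c)^c, X intersect X^c, and X union X^c.
for X in events:
    Xc = complement(X)
    print(sorted(X), sorted(complement(Xc)), sorted(X & Xc), sorted(X | Xc))
```

Each printed row corresponds to one event X. Look for what the last three columns have in common across all rows.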
-
A probability model assigns a probability $P(E)$ to each event $E$. Let's look at the requirements for a probability model to make sense.
First, as we considered before, the event S, where the outcome is in the sample space, has to happen. Therefore
$1$. $P(S)=1$, where S is the sample space.
Second, probabilities have to be between $0$ and $1$, regardless of the event.
$2$. $0\leq P(A) \leq 1$ for any event A.
Third, if two events are mutually exclusive, the probability of their union is the sum of their probabilities.
$3$. If $A \cap B = \emptyset$, then $P(A \cup B) =$
$+$
.
From the first and third conditions, we can come up with a useful relationship between the probability of A and the probability of A$^c$. Replace $B$ with $A^c$ in the third condition to find $P(A^c)$ in terms of $P(A)$. (Remember, from the previous part, that $A \cup A^c = S$.)
$P(A^c) =$
-
If there are only a finite number of possible outcomes, assigning probabilities to events is relatively straightforward. We first assign a probability to each simple event, with only the requirement that the sum of all these probabilities must be $1$ so that the sample space has probability $1$. Once we know the probabilities of the simple events, the probability of any other (non-simple) event can be written as the
of the probabilities of the simple events it contains.
Let's use this to come up with probabilities for the events in our cross-pollination model. Suppose that each parent is equally likely to pass on either allele to the offspring. There are four equally likely situations: both parents pass on the A allele, the ovule has the A allele and the pollen has the a allele, the ovule has the a allele and the pollen has the A allele, and both parents pass on the a allele. Two of these are indistinguishable in the offspring, resulting in only three possible genotypes. What are the probabilities of each of these genotypes?
$P(AA) = $
$P(Aa) = $
$P(aa) = $
Notice that the probabilities of these three simple events sum to one and that these three simple events make up the entire sample space.
We can just add up the probabilities of these simple events to determine the probabilities of the events A, B, C, and D from part b.
$P(A) =$
$P(B) =$
$P(C) =$
$P(D) =$
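The counting argument above can also be carried out by enumeration, which is a handy way to check your answers. The sketch below assumes, as in the text, that the four (ovule, pollen) allele combinations are equally likely. The names `genotype`, `genotype_prob`, and `prob` are hypothetical helpers for this illustration, and `Fraction` keeps the probabilities exact.

```python
from fractions import Fraction
from itertools import product

alleles = ["A", "a"]

# The four equally likely (ovule, pollen) allele combinations.
combos = list(product(alleles, repeat=2))

def genotype(ovule, pollen):
    """Order the two alleles so that ('a', 'A') and ('A', 'a') both give 'Aa'."""
    return "".join(sorted((ovule, pollen)))

# Probability of each simple event: add 1/4 for every combination
# that produces the genotype.
genotype_prob = {}
for ovule, pollen in combos:
    g = genotype(ovule, pollen)
    genotype_prob[g] = genotype_prob.get(g, Fraction(0)) + Fraction(1, len(combos))

def prob(event):
    """Probability of an event: the sum over the simple events it contains."""
    return sum((genotype_prob[g] for g in event), Fraction(0))

sample_space = set(genotype_prob)
named_events = {
    "A": {g for g in sample_space if "A" in g},  # at least one copy of allele A
    "B": {g for g in sample_space if "a" in g},  # at least one copy of allele a
    "C": {"aa"},
    "D": {"AA"},
}

for g in sorted(genotype_prob):
    print(f"P({g}) = {genotype_prob[g]}")
for name, event in named_events.items():
    print(f"P({name}) = {prob(event)}")

# Sanity checks against the rules for a probability model.
print("P(S) =", prob(sample_space))
print("P(A^c) =", prob(sample_space - named_events["A"]))
```

The last two lines let you compare the output with the first condition and with the relationship you found between $P(A^c)$ and $P(A)$.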