Introduction to Probability Theory
Definition (Event Space \(\mathcal{F}\))
The collection \(\mathcal{F}\) of subsets of the sample space \(\Omega\) is called an event space if

1. \(\mathcal{F}\) is non-empty,
2. if \(A \in \mathcal{F}\) then \(\Omega \setminus A \in \mathcal{F}\),
3. if \(A_1, A_2, \dots \in \mathcal{F}\) then \(\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}\);

i.e. \(\mathcal{F}\) is closed under the operations of taking complements and countable unions.
Remark (Intuitive Understanding of Event Space)
We may call a non-empty collection \(\mathcal{F}\) of subsets of \(\Omega\) an event space as long as, for any event \(A \in \mathcal{F}\):

1. "\(A\) does not happen" is also an event in \(\mathcal{F}\), by axiom (2).
2. "\(A\) or any other event \(B \in \mathcal{F}\) happens" is also an event in \(\mathcal{F}\), by axiom (3).
Note (Properties of Event Space \(\mathcal{F}\))
(a) An event space \(\mathcal{F}\) must contain the empty set \(\varnothing\) and the whole set \(\Omega\).
Proof
By (1), there exists some \(A \in \mathcal{F}\). By (2), \(A^c \in \mathcal{F}\). We set \(A_1 = A\), \(A_i = A^c\) for \(i \geq 2\) in (3), and deduce that \(\mathcal{F}\) contains the union \(\Omega = A \cup A^c\). By (2) again, the complement \(\Omega \setminus \Omega = \varnothing\) lies in \(\mathcal{F}\) also.
(b) An event space is closed under the operation of finite unions.
Proof
Let \(A_1, A_2, \dots, A_m \in \mathcal{F}\), and set \(A_i = \varnothing\) for \(i > m\). Then \(A := \bigcup_{i=1}^{m} A_i\) satisfies \(A = \bigcup_{i=1}^{\infty} A_i\), so that \(A \in \mathcal{F}\) by (3).
(c) An event space is also closed under the operations of taking finite or countable intersections.
Proof
By De Morgan's laws, \(\bigcap_{i} A_i = \left( \bigcup_{i} A_i^c \right)^c\). Each \(A_i^c\) lies in \(\mathcal{F}\) by (2), their union lies in \(\mathcal{F}\) by (3) (or by (b) in the finite case), and the complement of that union lies in \(\mathcal{F}\) by (2) again. □
Example (Examples of Event Space \(\mathcal{F}\))
Example 1
\(\Omega\) is any non-empty set and \(\mathcal{F}\) is the power set of \(\Omega\). △
Example 2
\(\Omega\) is any non-empty set and \(\mathcal{F} = \{\varnothing, A, \Omega \setminus A, \Omega\}\), where \(A\) is a given non-trivial subset of \(\Omega\). △
Example 3
\(\Omega = \{1, 2, 3, 4, 5, 6\}\) and \(\mathcal{F}\) is the collection
\(\{\varnothing, \{1, 2\}, \{3, 4\}, \{5, 6\}, \{1, 2, 3, 4\}, \{3, 4, 5, 6\}, \{1, 2, 5, 6\}, \Omega\}\) of subsets of \(\Omega\). △
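For finite collections such as Examples 2 and 3, the axioms can be checked mechanically. Below is a small sketch, not from the notes: the helper `is_event_space` is our own name, and the representation of events as Python `frozenset`s is an assumption for illustration.

```python
from itertools import combinations

def is_event_space(omega, F):
    """Check the event-space axioms on a finite collection F of subsets of omega."""
    F = {frozenset(s) for s in F}
    if not F:                              # (1) F must be non-empty
        return False
    for A in F:                            # (2) closed under complements
        if omega - A not in F:
            return False
    for A, B in combinations(F, 2):        # (3) closed under (finite) unions
        if A | B not in F:
            return False
    return True

omega = frozenset({1, 2, 3, 4, 5, 6})
F3 = [set(), {1, 2}, {3, 4}, {5, 6},
      {1, 2, 3, 4}, {3, 4, 5, 6}, {1, 2, 5, 6}, omega]
print(is_event_space(omega, F3))  # True: Example 3 is an event space
```

For a finite collection, closure under pairwise unions already implies closure under countable unions, since only finitely many distinct unions can arise.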
Definition (Probability Measure)
A mapping \(\mathbb{P} : \mathcal{F} \rightarrow \mathbb{R}\) is called a probability measure on \((\Omega, \mathcal{F})\) if
(a) \(\mathbb{P}(A) \geq 0\) for \(A \in \mathcal{F}\),
(b) \(\mathbb{P}(\Omega) = 1\) and \(\mathbb{P}(\varnothing) = 0\),
(c) if \(A_1, A_2, \dots\) are disjoint events in \(\mathcal{F}\) (in that \(A_i \cap A_j = \varnothing\) whenever \(i \neq j\)) then

\[
\mathbb{P}\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} \mathbb{P}(A_i). \tag{4}
\]
Note (Why \(\mathbb{P}(\varnothing) = 0\) ?)
Define the disjoint events \(A_1 = \Omega\), \(A_i = \varnothing\) for \(i \geq 2\). By condition (c),

\[
1 = \mathbb{P}(\Omega) = \mathbb{P}\left( \bigcup_{i=1}^{\infty} A_i \right) = \mathbb{P}(\Omega) + \sum_{i=2}^{\infty} \mathbb{P}(\varnothing) = 1 + \sum_{i=2}^{\infty} \mathbb{P}(\varnothing),
\]

so \(\sum_{i=2}^{\infty} \mathbb{P}(\varnothing) = 0\). Since \(\mathbb{P}(\varnothing) \geq 0\) by (a), this forces \(\mathbb{P}(\varnothing) = 0\).
Definition (Probability Space)
A probability space is a triple \((\Omega, \mathcal{F}, \mathbb{P})\) of objects such that
(a) \(\Omega\) is a non-empty set,
(b) \(\mathcal{F}\) is an event space of subsets of \(\Omega\),
(c) \(\mathbb{P}\) is a probability measure on \((\Omega, \mathcal{F})\).
Important Note (Properties of Probability Space)
Property 1 If \(A, B \in \mathcal{F}\), then \(A \setminus B \in \mathcal{F}\).
Proof
The complement of \(A \setminus B\) equals \((\Omega \setminus A) \cup B\), which is the union of events and is therefore an event. Hence \(A \setminus B\) is an event. □
Property 2 If \(A, B \in \mathcal{F}\), then \(\mathbb{P}(A \cup B) + \mathbb{P}(A \cap B) = \mathbb{P}(A) + \mathbb{P}(B)\).
Proof
The set \(A\) is the union of the disjoint sets \(A \setminus B\) and \(A \cap B\), and hence

\[
\mathbb{P}(A) = \mathbb{P}(A \setminus B) + \mathbb{P}(A \cap B).
\]

A similar remark holds for the set \(B\), giving that

\[
\mathbb{P}(A) + \mathbb{P}(B) = \mathbb{P}(A \setminus B) + \mathbb{P}(B \setminus A) + 2\,\mathbb{P}(A \cap B) = \mathbb{P}(A \cup B) + \mathbb{P}(A \cap B),
\]

since \(A \cup B\) is the disjoint union of \(A \setminus B\), \(B \setminus A\), and \(A \cap B\). □
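Property 2 can be sanity-checked numerically on a small uniform space. A sketch, not from the notes; the fair-die example and the helper `P` are our own choices for illustration.

```python
from fractions import Fraction

# Uniform probability measure on a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}
def P(E):
    return Fraction(len(E), len(omega))

A, B = {1, 2, 3}, {3, 4}
# Property 2: P(A ∪ B) + P(A ∩ B) = P(A) + P(B)
assert P(A | B) + P(A & B) == P(A) + P(B)
# The disjoint decomposition used in the proof: A = (A \ B) ∪ (A ∩ B)
assert P(A) == P(A - B) + P(A & B)
print(P(A | B))  # 2/3
```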
Property 3 If \(A_1, A_2, \dots \in \mathcal{F}\), then \(\bigcap_{i=1}^{\infty} A_i \in \mathcal{F}\).
Proof
The complement of \(\bigcap_{i=1}^{\infty} A_i\) equals \(\bigcup_{i=1}^{\infty} (\Omega \setminus A_i)\), which is the union of the complements of events and is therefore an event. Hence the intersection of the \(A_i\) is an event also, as before. □
Property 4 If \(A, B \in \mathcal{F}\) and \(A \subseteq B\) then \(\mathbb{P}(A) \leq \mathbb{P}(B)\).
Proof
We have that \(\mathbb{P}(B) = \mathbb{P}(A) + \mathbb{P}(B \setminus A) \geq \mathbb{P}(A)\). □
Definition (Conditional Probability)
If \(A, B \in \mathcal{F}\) and \(\mathbb{P}(B) > 0\), the conditional probability of \(A\) given \(B\) is denoted by \(\mathbb{P}(A \mid B)\) and defined by

\[
\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}. \tag{5}
\]
Remark
Formula (5) is a definition rather than a theorem. One intuition behind it is that, once \(B\) is known to occur, \(B\) becomes the effective sample space and probabilities are rescaled by \(\mathbb{P}(B)\); in particular \(\mathbb{P}(A \mid A) = 1\). (Draw a Venn diagram.)
Theorem
If \(B \in \mathcal{F}\) and \(\mathbb{P}(B) > 0\) then \((\Omega, \mathcal{F}, \mathbb{Q})\) is a probability space where \(\mathbb{Q} : \mathcal{F} \rightarrow \mathbb{R}\) is defined by \(\mathbb{Q}(A) = \mathbb{P}(A \mid B)\).
Proof
We need only check that \(\mathbb{Q}\) is a probability measure on \((\Omega, \mathcal{F})\). Certainly \(\mathbb{Q}(A) \geq 0\) for \(A \in \mathcal{F}\) and

\[
\mathbb{Q}(\Omega) = \mathbb{P}(\Omega \mid B) = \frac{\mathbb{P}(\Omega \cap B)}{\mathbb{P}(B)} = \frac{\mathbb{P}(B)}{\mathbb{P}(B)} = 1,
\]

and it remains to check that \(\mathbb{Q}\) satisfies (4). Suppose that \(A_1, A_2, \dots\) are disjoint events in \(\mathcal{F}\). Then the sets \(A_i \cap B\) are also disjoint, and

\[
\mathbb{Q}\left( \bigcup_{i=1}^{\infty} A_i \right) = \frac{\mathbb{P}\left( \bigcup_{i=1}^{\infty} (A_i \cap B) \right)}{\mathbb{P}(B)} = \sum_{i=1}^{\infty} \frac{\mathbb{P}(A_i \cap B)}{\mathbb{P}(B)} = \sum_{i=1}^{\infty} \mathbb{Q}(A_i).
\]
Therefore, \(\mathbb{Q}\) is a probability measure on \((\Omega, \mathcal{F})\). □
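The conclusion of the theorem can be illustrated concretely. A sketch, not from the notes, assuming a fair six-sided die and conditioning on the event "the roll is even"; the names `P`, `Q`, and `B` mirror the theorem's notation.

```python
from fractions import Fraction

# Fair six-sided die: uniform measure on subsets of omega.
omega = {1, 2, 3, 4, 5, 6}
def P(A):
    return Fraction(len(A), len(omega))

B = {2, 4, 6}                          # condition on "the roll is even"
def Q(A):                              # Q(A) = P(A | B) = P(A ∩ B) / P(B)
    return P(A & B) / P(B)

assert Q(omega) == 1                   # Q(Ω) = 1
assert Q(set()) == 0                   # Q(∅) = 0
A1, A2 = {2}, {4, 6}                   # disjoint events
assert Q(A1 | A2) == Q(A1) + Q(A2)     # additivity
print(Q({2}))  # 1/3: each even face is equally likely given B
```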
Definition (Naive Definition of Independent Events)
We call two events \(A\) and \(B\) independent if the occurrence of one of them does not affect the probability that the other occurs. More formally: if \(\mathbb{P}(A), \mathbb{P}(B) > 0\), then

\[
\mathbb{P}(A \mid B) = \mathbb{P}(A) \quad \text{and} \quad \mathbb{P}(B \mid A) = \mathbb{P}(B).
\]
Definition (Generalized Definition of Two Independent Events)
Writing \(\mathbb{P}(A \mid B) = \mathbb{P}(A \cap B)/\mathbb{P}(B)\), we have:
Events \(A\) and \(B\) of a probability space \((\Omega, \mathcal{F}, \mathbb{P})\) are called independent if

\[
\mathbb{P}(A \cap B) = \mathbb{P}(A)\,\mathbb{P}(B),
\]
and dependent otherwise.
This definition is slightly more general since it allows the events \(A\) and \(B\) to have zero probability.
It is easily generalized to more than two events as follows:
Definition (Definition of Independent Events)
A family \(\mathcal{A} = (A_i \mid i \in I)\) of events is called independent if, for all finite subsets \(J\) of \(I\),

\[
\mathbb{P}\left( \bigcap_{i \in J} A_i \right) = \prod_{i \in J} \mathbb{P}(A_i).
\]
Definition (Pairwise Independent)
The family \(\mathcal{A}\) is called pairwise independent if the defining product condition holds whenever \(|J| = 2\).
There are families of events which are pairwise independent but not independent.
Example (Pairwise Independent but Dependent)
Suppose that we throw a fair four-sided die (you may think of this as a square die thrown in a two-dimensional universe). We may take \(\Omega = \{1, 2, 3, 4\}\), where each \(\omega \in \Omega\) is equally likely to occur. The events \(A = \{1, 2\}\), \(B = \{1, 3\}\), \(C = \{1, 4\}\) are pairwise independent but not independent: each pairwise intersection equals \(\{1\}\), so \(\mathbb{P}(A \cap B) = \frac{1}{4} = \mathbb{P}(A)\,\mathbb{P}(B)\) (and similarly for the other pairs), but \(\mathbb{P}(A \cap B \cap C) = \frac{1}{4} \neq \frac{1}{8} = \mathbb{P}(A)\,\mathbb{P}(B)\,\mathbb{P}(C)\).
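The four-sided-die example can be verified exhaustively. A sketch, not from the notes, representing events as subsets of \(\Omega\) with the uniform measure:

```python
from fractions import Fraction
from itertools import combinations

# Fair four-sided die: each outcome in Ω = {1, 2, 3, 4} has probability 1/4.
omega = {1, 2, 3, 4}
def P(E):
    return Fraction(len(E), len(omega))

A, B, C = {1, 2}, {1, 3}, {1, 4}

# Pairwise independent: P(X ∩ Y) = P(X) P(Y) for every pair.
for X, Y in combinations([A, B, C], 2):
    assert P(X & Y) == P(X) * P(Y)

# But not independent: the triple product condition fails.
print(P(A & B & C), P(A) * P(B) * P(C))  # 1/4 vs 1/8
```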
Theorem (Partition Theorem)
If \(\{B_1, B_2, \dots \}\) is a partition of \(\Omega\) with \(\mathbb{P}(B_i) > 0\) for each \(i\), then, for every event \(A \in \mathcal{F}\),

\[
\mathbb{P}(A) = \sum_{i} \mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i).
\]
Proof
We have that

\[
\mathbb{P}(A) = \mathbb{P}\left( \bigcup_{i} (A \cap B_i) \right) = \sum_{i} \mathbb{P}(A \cap B_i) = \sum_{i} \mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i),
\]

where the second equality holds because the sets \(A \cap B_i\) are disjoint. □
Remark (Intuition of Partition Theorem)
The partition theorem says that if we have a collection of events that partitions \(\Omega\), we can use it to cut an arbitrary event \(A\) into the disjoint pieces \(A \cap B_i\), whose union is exactly \(A\); summing the probabilities of the pieces then recovers \(\mathbb{P}(A)\).
Exercise (Application of Partition Theorem)
Tomorrow there will be either rain or snow but not both; the probability of rain is \(\frac{2}{5}\) and the probability of snow is \(\frac{3}{5}\). If it rains, the probability that I will be late for my lecture is \(\frac{1}{5}\), while the corresponding probability in the event of snow is \(\frac{3}{5}\). What is the probability that I will be late?
Solution
Let \(A\) be the event that I am late and \(B\) be the event that it rains. The pair \(B, B^c\) is a partition of the sample space (since exactly one of them must occur). By the partition theorem,

\[
\mathbb{P}(A) = \mathbb{P}(A \mid B)\,\mathbb{P}(B) + \mathbb{P}(A \mid B^c)\,\mathbb{P}(B^c) = \frac{1}{5} \cdot \frac{2}{5} + \frac{3}{5} \cdot \frac{3}{5} = \frac{11}{25}.
\]
Theorem (Bayes' Theorem)
Let \(\{B_1, B_2, \dots\}\) be a partition of the sample space \(\Omega\) such that \(\mathbb{P}(B_i) > 0\) for each \(i\). For any event \(A\) with \(\mathbb{P}(A) > 0\),

\[
\mathbb{P}(B_j \mid A) = \frac{\mathbb{P}(A \mid B_j)\,\mathbb{P}(B_j)}{\sum_{i} \mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i)}.
\]
Proof
By the definition of conditional probability,

\[
\mathbb{P}(B_j \mid A) = \frac{\mathbb{P}(A \cap B_j)}{\mathbb{P}(A)} = \frac{\mathbb{P}(A \mid B_j)\,\mathbb{P}(B_j)}{\sum_{i} \mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i)},
\]

where the denominator has been expanded using the partition theorem. □
Exercise (False Positives)
A rare but potentially fatal disease has an incidence of 1 in \(10^5\) in the population at large. There is a diagnostic test, but it is imperfect. If you have the disease, the test is positive with probability \(\frac{9}{10}\); if you do not, the test is positive with probability \(\frac{1}{20}\). Your test result is positive. What is the probability that you have the disease?
Solution
Write \(D\) for the event that you have the disease, and \(P\) for the event that the test is positive. By Bayes' theorem,

\[
\mathbb{P}(D \mid P) = \frac{\mathbb{P}(P \mid D)\,\mathbb{P}(D)}{\mathbb{P}(P \mid D)\,\mathbb{P}(D) + \mathbb{P}(P \mid D^c)\,\mathbb{P}(D^c)} = \frac{\frac{9}{10} \cdot 10^{-5}}{\frac{9}{10} \cdot 10^{-5} + \frac{1}{20}\left(1 - 10^{-5}\right)} \approx 1.8 \times 10^{-4}.
\]

The disease is so rare that, even after a positive test, having it remains very unlikely: the small pool of true positives is swamped by false positives from the healthy population.
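The computation above can be reproduced exactly; a sketch, not from the notes, with our own variable names for the given rates:

```python
from fractions import Fraction

p_D = Fraction(1, 10**5)             # incidence of the disease
p_pos_given_D = Fraction(9, 10)      # P(positive | diseased)
p_pos_given_not_D = Fraction(1, 20)  # P(positive | healthy), the false-positive rate

# Denominator via the partition theorem over {D, D^c}
p_pos = p_pos_given_D * p_D + p_pos_given_not_D * (1 - p_D)
p_D_given_pos = p_pos_given_D * p_D / p_pos   # Bayes' theorem

print(float(p_D_given_pos))  # ≈ 0.00018: still very unlikely despite the positive test
```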