Introduction to Probability

Sample Spaces

Any action with a random outcome is called a random experiment.

The sample space for this action is the set of all possible outcomes, denoted $S$.

A sample space can be discrete (e.g. flipping a coin) or continuous (e.g. measuring weight, with sample space $(0, \infty)$).

For any events $A$ and $B$ in $S$:

  • The union $A \cup B$ is the set of all outcomes from $S$ in either $A$ or $B$;
  • The intersection $A \cap B$ is the set of all outcomes in both $A$ and $B$;
  • The complement of $A$ (sometimes denoted $\overline{A}$ or $-A$) is the set of all outcomes in $S$ that are not in $A$.

If $A$ and $B$ have no common outcomes, they are mutually exclusive: $A \cap B = \varnothing$, the empty set.

Counting Techniques

Addition Rule

Consider a job that can be done with two independent machines: the first machine OR the second. The first machine can do the job in $m_1$ ways and the second in $m_2$ ways. Then the job can be done in $m_1 + m_2$ ways.
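The rule can be checked by listing the options. A minimal sketch, with hypothetical machine settings invented for illustration:

```python
# Hypothetical setup: machine 1 has 3 configurations, machine 2 has 4,
# and a given job runs on exactly one machine.
ways_machine_1 = ["low", "medium", "high"]
ways_machine_2 = ["a", "b", "c", "d"]

# Addition rule: the job can be set up in m1 + m2 = 3 + 4 = 7 ways.
total_ways = len(ways_machine_1) + len(ways_machine_2)
print(total_ways)  # 7
```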

Multiplication Rule

A job that can be done in a $k$-stage procedure can be modeled as having $k$ bags, with $m_1$ items in the first bag, ..., $m_k$ items in the $k$-th bag. A $k$-stage process is a process for which:

  • there are $m_1$ possibilities at stage 1;
  • regardless of the outcome of stage 1, there are $m_2$ possibilities at stage 2;
  • regardless of the previous outcomes, there are $m_k$ choices at stage $k$.

Then there are $m_1 \cdot m_2 \cdots m_k$ total ways the process can turn out.
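The multiplication rule can be verified by enumerating a small $k$-stage example. A sketch, with a hypothetical 3-stage "outfit" choice invented for illustration:

```python
from itertools import product

# Hypothetical 3-stage process: choose a shirt (2 ways), pants (3 ways),
# shoes (2 ways).
shirts = ["s1", "s2"]
pants = ["p1", "p2", "p3"]
shoes = ["e1", "e2"]

# Multiplication rule: m1 * m2 * m3 = 2 * 3 * 2 = 12 total outcomes.
outcomes = list(product(shirts, pants, shoes))
print(len(outcomes))  # 12
```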

Ordered and Unordered Samples

Ordered Samples

Two different scenarios: with replacement or without replacement.

Sampling With Replacement (order important):

  • With $n$ possible outcomes and $r$ repetitions, there are $n^r$ ordered samples.
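For instance, rolling a fair die twice (each roll can repeat any value, and order matters) gives $6^2 = 36$ ordered samples, which enumeration confirms:

```python
from itertools import product

# Sampling with replacement, order important: n = 6 faces, r = 2 rolls.
n, r = 6, 2
samples = list(product(range(1, n + 1), repeat=r))
print(len(samples))  # 36 = 6**2
```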

Sampling Without Replacement (order important):

  • This results in ${}_nP_r = \frac{n!}{(n-r)!}$ ordered samples.
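The factorial formula agrees with direct enumeration (and with `math.perm` in the standard library). A small check, with $n = 5$ and $r = 3$ chosen arbitrarily:

```python
from itertools import permutations
from math import factorial, perm

# Choosing r = 3 of n = 5 distinct items, order important, no repeats.
n, r = 5, 3
by_formula = factorial(n) // factorial(n - r)
by_enumeration = len(list(permutations(range(n), r)))
print(by_formula, by_enumeration, perm(n, r))  # all equal 60
```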

Unordered Samples

In this case we use ${}_nC_r = \frac{n!}{(n-r)!\,r!} = \binom{n}{r}$.
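Again the formula can be cross-checked against enumeration (and `math.comb`), using the same arbitrary $n = 5$, $r = 3$:

```python
from itertools import combinations
from math import comb, factorial

# Choosing r = 3 of n = 5 distinct items when order does not matter.
n, r = 5, 3
by_formula = factorial(n) // (factorial(n - r) * factorial(r))
by_enumeration = len(list(combinations(range(n), r)))
print(by_formula, by_enumeration, comb(n, r))  # all equal 10
```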

Probability of an Event

For a random experiment with exactly $N$ possible **mutually exclusive, equally likely** outcomes, we can assign a probability to an event $A$ by counting the number of outcomes that correspond to $A$. If the count is $a$, then $P(A) = \frac{a}{N}$.

The probability of each individual outcome is thus $1/N$.
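This counting definition is easy to apply directly. A sketch using a fair six-sided die as the example (exact arithmetic via `fractions` avoids floating-point noise):

```python
from fractions import Fraction

# A fair six-sided die: N = 6 equally likely outcomes.
S = range(1, 7)
A = [x for x in S if x % 2 == 0]  # event "the roll is even", so a = 3
p_A = Fraction(len(A), len(S))    # P(A) = a / N
print(p_A)  # 1/2
```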

Axioms of Probability

  1. For any event $A$, $0 \leq P(A) \leq 1$.
  2. For the complete sample space $S$, $P(S) = 1$.
  3. For the empty event $\varnothing$, $P(\varnothing) = 0$.
  4. For two mutually exclusive events $A$ and $B$, the probability that $A$ or $B$ occurs is $P(A \cup B) = P(A) + P(B)$.

Since $S = A \cup \overline{A}$, and since $A$ and $\overline{A}$ are mutually exclusive, $1 = P(S) = P(A \cup \overline{A}) = P(A) + P(\overline{A}) \Rightarrow P(\overline{A}) = 1 - P(A)$.
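The complement rule can be confirmed by counting outcomes in a small sample space, again using a fair die as the example:

```python
from fractions import Fraction

# Fair die: S has 6 equally likely outcomes; A = "the roll is even".
S = set(range(1, 7))
A = {2, 4, 6}

def P(E):
    return Fraction(len(E), len(S))

complement_A = S - A  # {1, 3, 5}
print(P(complement_A) == 1 - P(A))  # True
```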

General Addition Rule

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

When $A$ and $B$ are mutually exclusive, $P(A \cap B) = P(\varnothing) = 0$ and $P(A \cup B) = P(A) + P(B)$.

If there are more than two events, the rule expands as follows:

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
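The two-event rule can be checked by brute-force enumeration. A sketch with a hypothetical sample space $S = \{1, \ldots, 12\}$ chosen so that $A$ and $B$ overlap:

```python
from fractions import Fraction

# Hypothetical example: 12 equally likely outcomes,
# A = multiples of 2, B = multiples of 3 (they share 6 and 12).
S = set(range(1, 13))
A = {x for x in S if x % 2 == 0}
B = {x for x in S if x % 3 == 0}

def P(E):
    return Fraction(len(E), len(S))

lhs = P(A | B)                      # P(A ∪ B)
rhs = P(A) + P(B) - P(A & B)        # general addition rule
print(lhs == rhs)  # True
```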

Conditional Probabilities and Independent Events

Any two events $A$ and $B$ satisfying $P(A \cap B) = P(A) \times P(B)$ are said to be independent; this is a purely mathematical definition, but it agrees with the intuitive notion of independence in simple examples.
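One such simple example: two fair coin flips, where the outcome of the first flip tells us nothing about the second. A sketch checking the product definition by enumeration:

```python
from fractions import Fraction

# Two fair coin flips: A = "first flip is heads", B = "second flip is heads".
S = {(a, b) for a in "HT" for b in "HT"}
A = {s for s in S if s[0] == "H"}
B = {s for s in S if s[1] == "H"}

def P(E):
    return Fraction(len(E), len(S))

print(P(A & B) == P(A) * P(B))  # True: 1/4 == 1/2 * 1/2
```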

When events are not independent, we say that they are dependent or conditional.

Conditional Probability

The conditional probability of an event $B$, given that another event $A$ has occurred, is $P(B|A) = \frac{P(A \cap B)}{P(A)}$.
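A worked example on one roll of a fair die, computing $P(B|A)$ straight from the definition:

```python
from fractions import Fraction

# One roll of a fair die: A = "roll is even", B = "roll is greater than 3".
S = set(range(1, 7))
A = {x for x in S if x % 2 == 0}   # {2, 4, 6}
B = {x for x in S if x > 3}        # {4, 5, 6}

def P(E):
    return Fraction(len(E), len(S))

# P(B|A) = P(A ∩ B) / P(A) = (2/6) / (3/6) = 2/3.
p_B_given_A = P(A & B) / P(A)
print(p_B_given_A)  # 2/3
```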

Law of Total Probability

If $A_1, \ldots, A_k$ are mutually exclusive and exhaustive (i.e. $A_i \cap A_j = \varnothing$ for all $i \neq j$ and $A_1 \cup \cdots \cup A_k = S$), then for any event $B$:

$P(B) = \sum_{j=1}^{k} P(B|A_j)P(A_j) = P(B|A_1)P(A_1) + \cdots + P(B|A_k)P(A_k)$.
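A numeric sketch with $k = 2$, using a hypothetical two-supplier scenario (the 60/40 split and defect rates are invented for illustration):

```python
# Hypothetical setup: suppliers A1, A2 provide 60% and 40% of all parts;
# their defect rates P(B|Aj), with B = "part is defective", are 2% and 5%.
P_A = [0.6, 0.4]
P_B_given_A = [0.02, 0.05]

# Law of total probability: P(B) = sum_j P(B|Aj) * P(Aj).
P_B = sum(pb * pa for pb, pa in zip(P_B_given_A, P_A))
print(round(P_B, 4))  # 0.032
```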

Bayes' Theorem

$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$

Terminology

  • $P(\text{hypothesis})$ is the probability of the hypothesis being true prior to the experiment (called the prior);
  • $P(\text{hypothesis} \mid \text{data})$ is the probability of the hypothesis being true once the experimental data is taken into account (called the posterior);
  • $P(\text{data} \mid \text{hypothesis})$ is the probability of the experimental data being observed assuming that the hypothesis is true (called the likelihood).
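The prior/likelihood/posterior terminology can be seen in a classic diagnostic-test calculation. All numbers here are hypothetical, chosen for illustration; $P(\text{data})$ in the denominator comes from the law of total probability:

```python
# Hypothetical numbers: a disease with 1% prevalence, a test with
# 95% sensitivity and a 10% false-positive rate.
prior = 0.01            # P(hypothesis): patient has the disease
likelihood = 0.95       # P(data | hypothesis): positive test if diseased
false_positive = 0.10   # P(data | not hypothesis)

# Law of total probability for the denominator, then Bayes' theorem.
p_data = likelihood * prior + false_positive * (1 - prior)
posterior = likelihood * prior / p_data  # P(hypothesis | data)
print(round(posterior, 3))  # 0.088
```

Note how a positive result from a fairly accurate test still leaves the posterior below 9%, because the prior is so small.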