Continuous Random Variables
Discrete data have a finite or countably infinite number of possible outcomes. Continuous data come from a continuous interval of possible outcomes; in particular, a continuous random variable has uncountably many possible values.
The cumulative distribution function $F_X(x) = P(X \leq x)$ is defined in the same way for discrete and continuous random variables.
Probability Density Function (p.d.f.)
The probability density function of a continuous random variable $X$ is an integrable function $f_X: X(S) \to \mathbb{R}$ such that, for an interval $A = (a, b)$,
$$P(A) = P(a < X < b) = \int_{a}^{b} f_X(x)\,dx$$
Expectation
$$\mathrm{E}[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$$
Note that the expectation exists only if this integral converges; it is enough that $\int_{-\infty}^{\infty} |x| f_X(x)\,dx < \infty$.
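As a quick numerical check, the sketch below (assuming SciPy is available, and using a made-up p.d.f. $f_X(x) = 2x$ on $[0, 1]$) recovers a probability and the expectation by integration:

```python
from scipy.integrate import quad

# Made-up p.d.f. for illustration: f_X(x) = 2x on [0, 1], 0 elsewhere.
def f_X(x):
    return 2 * x if 0 <= x <= 1 else 0.0

# A valid p.d.f. integrates to 1 over its support.
total, _ = quad(f_X, 0, 1)
print(total)  # ~1.0

# P(0.25 < X < 0.75) is the integral of f_X over (0.25, 0.75).
p, _ = quad(f_X, 0.25, 0.75)
print(p)  # ~0.5

# E[X] is the integral of x * f_X(x).
mean, _ = quad(lambda x: x * f_X(x), 0, 1)
print(mean)  # ~2/3
```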
Mean and Variance
As in the discrete case, if $X, Y$ are continuous random variables and $a, b \in \mathbb{R}$, then
$$\mathrm{E}[aY + bX] = a\,\mathrm{E}[Y] + b\,\mathrm{E}[X]$$
$$\mathrm{Var}[a + bX] = b^2\,\mathrm{Var}[X]$$
$$\mathrm{SD}[a + bX] = |b|\,\mathrm{SD}[X]$$
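These identities can be sanity-checked by simulation. The sketch below uses illustrative choices ($X$ uniform, $Y$ exponential, arbitrary constants $a, b$):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 3.0, -2.0

# Illustrative distributions: X uniform on [0, 1], Y exponential with mean 1.
X = rng.uniform(0, 1, size=1_000_000)
Y = rng.exponential(1.0, size=1_000_000)

# E[aY + bX] = a E[Y] + b E[X]
print(np.mean(a * Y + b * X), a * np.mean(Y) + b * np.mean(X))

# Var[a + bX] = b^2 Var[X]  and  SD[a + bX] = |b| SD[X]
print(np.var(a + b * X), b**2 * np.var(X))
print(np.std(a + b * X), abs(b) * np.std(X))
```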
Standard Normal Distribution
The standard normal probability density function is
$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$$
and the corresponding cumulative distribution function is denoted by
$$\Phi(z) = P(Z \leq z) = \int_{-\infty}^{z} \phi(t)\,dt$$
A random variable $Z$ with this c.d.f. is said to have a standard normal distribution, and we write $Z \sim N(0, 1)$.
Expectation and Variance
$$\mathrm{E}[Z] = \int_{-\infty}^{\infty} z\,\phi(z)\,dz = \int_{-\infty}^{\infty} \frac{z}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz = 0$$
$$\mathrm{Var}[Z] = \int_{-\infty}^{\infty} z^2 \phi(z)\,dz = 1$$
$$\mathrm{SD}[Z] = \sqrt{\mathrm{Var}[Z]} = \sqrt{1} = 1$$
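These integrals can be verified numerically (a sketch assuming SciPy):

```python
import numpy as np
from scipy.integrate import quad

# Standard normal p.d.f.
phi = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)

# E[Z]: the integrand z * phi(z) is odd, so the integral vanishes.
ez, _ = quad(lambda z: z * phi(z), -np.inf, np.inf)

# Var[Z] = integral of z^2 * phi(z), since E[Z] = 0.
vz, _ = quad(lambda z: z**2 * phi(z), -np.inf, np.inf)

print(ez, vz)  # ~0.0, ~1.0
```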
General Normal Random Variable
If $X \sim N(\mu, \sigma^2)$, then the standardized variable
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1),$$
so the c.d.f. of $X$ is given by
$$F_X(x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$$
and the p.d.f. is given by
$$f_X(x) = \frac{1}{\sigma}\,\phi\!\left(\frac{x - \mu}{\sigma}\right)$$
Any random variable $X$ with this c.d.f./p.d.f. must satisfy $\mathrm{E}[X] = \mu$, $\mathrm{Var}[X] = \sigma^2$, and $\mathrm{SD}[X] = \sigma$.
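As an illustration, the sketch below (with hypothetical values of $\mu$ and $\sigma$) evaluates $F_X(x)$ both by standardizing and directly; the two agree:

```python
from scipy.stats import norm

mu, sigma = 100.0, 15.0  # hypothetical parameters
x = 130.0

# F_X(x) = Phi((x - mu) / sigma): standardize, then use the standard normal c.d.f.
z = (x - mu) / sigma
print(norm.cdf(z))                        # via Phi of the z-score
print(norm.cdf(x, loc=mu, scale=sigma))   # direct evaluation
```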
Exponential Random Variable
Assume that cars arrive according to a Poisson process with rate $\lambda$, i.e. the number of cars arriving within a fixed unit time period is a Poisson random variable with parameter $\lambda$. Over a period of time $x$, the number of arrivals $N$ follows a Poisson distribution with parameter $\lambda x$. Let $X$ be the waiting time to the first car arrival. Then
$$P(X > x) = 1 - P(X \leq x) = P(N = 0) = e^{-\lambda x}$$
We say that $X$ follows an exponential distribution $\mathrm{Exp}(\lambda)$, and
$$F_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 - e^{-\lambda x} & \text{for } x \geq 0 \end{cases}$$
$$f_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ \lambda e^{-\lambda x} & \text{for } x \geq 0 \end{cases}$$
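A quick simulation check of the survival function $P(X > x) = e^{-\lambda x}$, with an illustrative rate $\lambda$ (note that NumPy parametrizes the exponential by its mean $1/\lambda$):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, x = 2.0, 0.8  # illustrative rate and time

# Simulate first-arrival waiting times X ~ Exp(lam).
X = rng.exponential(1 / lam, size=1_000_000)

# Empirical P(X > x) vs. the closed-form survival function exp(-lam * x).
print(np.mean(X > x), np.exp(-lam * x))
```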
Properties
If $X \sim \mathrm{Exp}(\lambda)$, then:
$$\mu = \mathrm{E}[X] = 1/\lambda$$
$$\sigma^2 = \mathrm{Var}[X] = 1/\lambda^2$$
Memoryless Property:
$$P(X > s + t \mid X > t) = P(X > s)$$
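The memoryless property can likewise be checked by simulation, with illustrative values of $\lambda$, $s$, and $t$:

```python
import numpy as np

rng = np.random.default_rng(2)
lam, s, t = 1.5, 0.5, 1.0  # illustrative values
X = rng.exponential(1 / lam, size=2_000_000)

# Conditional P(X > s + t | X > t) vs. unconditional P(X > s).
lhs = np.mean(X > s + t) / np.mean(X > t)
rhs = np.mean(X > s)
print(lhs, rhs)  # both ~ exp(-lam * s)
```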
Gamma Random Variable
Let $Y$ be the waiting time until the $k$-th arrival in a Poisson process with rate $\lambda$ (equivalently, the sum of $k$ independent $\mathrm{Exp}(\lambda)$ waiting times). Then $Y$ follows a gamma distribution with p.d.f.
$$f_Y(y) = \begin{cases} 0 & \text{for } y < 0 \\ \dfrac{y^{k-1}}{(k-1)!}\,\lambda^k e^{-\lambda y} & \text{for } y \geq 0 \end{cases}$$
with
$$\mu = \mathrm{E}[Y] = \frac{k}{\lambda} \quad \text{and} \quad \sigma^2 = \mathrm{Var}[Y] = \frac{k}{\lambda^2}$$
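Since $Y$ is a sum of $k$ independent exponentials, its mean and variance can be checked by simulating such sums (illustrative $\lambda$ and $k$):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, k = 2.0, 4  # illustrative rate and (integer) shape

# Y = sum of k independent Exp(lam) inter-arrival times.
Y = rng.exponential(1 / lam, size=(1_000_000, k)).sum(axis=1)

print(Y.mean(), k / lam)     # E[Y] = k / lam
print(Y.var(), k / lam**2)   # Var[Y] = k / lam^2
```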
Joint Distributions
A joint probability density function $f(x, y)$ satisfies:
$$f(x, y) \geq 0 \quad \text{for all } x, y$$
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$$
$$P(A) = \iint_A f(x, y)\,dx\,dy, \quad \text{where } A \subseteq \mathbb{R}^2$$
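The sketch below (assuming SciPy, with a made-up joint p.d.f. $f(x, y) = x + y$ on the unit square) checks the normalization condition and evaluates $P(A)$ for a rectangular region $A$:

```python
from scipy.integrate import dblquad

# Made-up joint p.d.f.: f(x, y) = x + y on [0,1] x [0,1], 0 elsewhere.
f = lambda y, x: x + y  # dblquad integrates over y first, then x

# Normalization: the integral over the whole support is 1.
total, _ = dblquad(f, 0, 1, 0, 1)
print(total)  # ~1.0

# P(A) for A = {x < 0.5, y < 0.5}: integrate f over that rectangle.
pA, _ = dblquad(f, 0, 0.5, 0, 0.5)
print(pA)  # ~0.125
```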
Normal Approximation of the Binomial Distribution
If $X \sim B(n, p)$ then we may interpret $X$ as a sum of independent and identically distributed random variables
$$X = I_1 + I_2 + \dots + I_n, \quad \text{where each } I_i \sim B(1, p).$$
Thus, according to the Central Limit Theorem (more on this later), for large $n$, we have
$$\frac{X - np}{\sqrt{np(1-p)}} \approx N(0, 1)$$
Equivalently, for large $n$, if $X \sim B(n, p)$ then $X \approx N(np, np(1-p))$.
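For concreteness, the sketch below (with illustrative $n$, $p$, and $x$) compares the exact binomial c.d.f. with this plain normal approximation:

```python
import numpy as np
from scipy.stats import binom, norm

n, p = 100, 0.3  # illustrative parameters
x = 35

# Exact P(X <= x) vs. the plain normal approximation via the z-score.
exact = binom.cdf(x, n, p)
approx = norm.cdf((x - n * p) / np.sqrt(n * p * (1 - p)))
print(exact, approx)
```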
Normal Approximation with Continuity Correction
Let $X \sim B(n, p)$. Recall that $\mathrm{E}[X] = np$ and $\mathrm{Var}[X] = np(1-p)$. If $n$ is large, we may approximate $X$ by a normal random variable in the following way:
$$P(X \leq x) = P(X < x + 0.5) = P\!\left(Z < \frac{x - np + 0.5}{\sqrt{np(1-p)}}\right)$$
and
$$P(X \geq x) = P(X > x - 0.5) = P\!\left(Z > \frac{x - np - 0.5}{\sqrt{np(1-p)}}\right)$$
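Extending the previous sketch (same illustrative $n$, $p$, $x$), the continuity-corrected value is typically closer to the exact binomial probability than the uncorrected one:

```python
import numpy as np
from scipy.stats import binom, norm

n, p = 100, 0.3  # same illustrative parameters as above
x = 35
sd = np.sqrt(n * p * (1 - p))

exact = binom.cdf(x, n, p)                    # exact P(X <= x)
plain = norm.cdf((x - n * p) / sd)            # no correction
corrected = norm.cdf((x - n * p + 0.5) / sd)  # with continuity correction
print(exact, plain, corrected)  # corrected is closer to exact
```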