Continuous Distributions

Continuous Random Variables

Discrete data are data with a finite or countably infinite number of possible outcomes.

Continuous data are data that come from a continuous interval of possible outcomes. This means that continuous data have uncountably many possible outcomes.

The definition of the cumulative distribution function, $F_X(x) = P(X \leq x)$, is the same for discrete and continuous random variables.

Probability Density Function (p.d.f.)

The probability density function of a continuous random variable $X$ is an integrable function $f_X: X(S) \to \mathbb{R}$ such that, for any event $A = (a, b)$,

$P(A) = P(a < X < b) = \int_{a}^{b} f_X(x)\,dx$
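For instance, such a probability can be computed by numerically integrating the density. A minimal sketch in Python, using a hypothetical p.d.f. $f_X(x) = 2x$ on $[0, 1]$ chosen purely for illustration:

```python
from scipy import integrate

# Hypothetical p.d.f. for illustration: f_X(x) = 2x on [0, 1], 0 elsewhere
def f_X(x):
    return 2 * x if 0 <= x <= 1 else 0.0

# P(0.25 < X < 0.5) = integral of f_X from 0.25 to 0.5
prob, _ = integrate.quad(f_X, 0.25, 0.5)
print(prob)  # 0.1875, i.e. 0.5^2 - 0.25^2
```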

Expectation

$\mathrm{E}[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$

Note that the expectation exists only if the integral converges, i.e. if $\int_{-\infty}^{\infty} |x| f_X(x)\,dx < \infty$.

Mean and Variance

As in the discrete case, if $X, Y$ are continuous random variables and $a, b \in \mathbb{R}$,

$\mathrm{E}[aY + bX] = a\mathrm{E}[Y] + b\mathrm{E}[X]$
$\mathrm{Var}[a + bX] = b^2\,\mathrm{Var}[X]$
$\mathrm{SD}[a + bX] = |b|\,\mathrm{SD}[X]$
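These identities can be sanity-checked by simulation. A minimal sketch, where the choices of $X$, $Y$, $a$, and $b$ are arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, -3.0

# Arbitrary continuous random variables for illustration
X = rng.exponential(scale=1.0, size=1_000_000)      # E[X] = 1, Var[X] = 1
Y = rng.normal(loc=5.0, scale=2.0, size=1_000_000)  # E[Y] = 5

# E[aY + bX] = a E[Y] + b E[X] = 2*5 - 3*1 = 7
print(np.mean(a * Y + b * X))  # ~7.0

# Var[a + bX] = b^2 Var[X] = 9 * 1 = 9
print(np.var(a + b * X))       # ~9.0
```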

Standard Normal Distribution

The standard normal probability density function is $\phi(z) = \frac{1}{\sqrt{2\pi}}e^{-z^2/2}$,

and the corresponding cumulative distribution function is denoted by $\Phi(z) = P(Z \leq z) = \int_{-\infty}^{z} \phi(t)\,dt$.

A random variable $Z$ with this c.d.f. is said to have a standard normal distribution, and we write $Z \sim N(0, 1)$.

Expectation and Variance

$\mathrm{E}[Z] = \int_{-\infty}^{\infty} z\phi(z)\,dz = \int_{-\infty}^{\infty} \frac{z}{\sqrt{2\pi}}e^{-z^2/2}\,dz = 0$ (the integrand is an odd function)
$\mathrm{Var}[Z] = \int_{-\infty}^{\infty} z^2\phi(z)\,dz = 1$
$\mathrm{SD}[Z] = \sqrt{\mathrm{Var}[Z]} = \sqrt{1} = 1$
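Both integrals can be verified numerically. A short sketch using scipy's quadrature routine:

```python
import numpy as np
from scipy import integrate

# Standard normal p.d.f.
phi = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)

# E[Z] and Var[Z] as improper integrals over the whole real line
E_Z, _ = integrate.quad(lambda z: z * phi(z), -np.inf, np.inf)
Var_Z, _ = integrate.quad(lambda z: z**2 * phi(z), -np.inf, np.inf)
print(E_Z, Var_Z)  # ~0.0 and ~1.0
```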

General Normal Random Variable

A general normal random variable $X$ with parameters $\mu$ and $\sigma > 0$ is one whose standardization is standard normal: $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$.

Its c.d.f. is given by $F_X(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$

Its p.d.f. is given by $f_X(x) = \frac{1}{\sigma}\,\phi\left(\frac{x - \mu}{\sigma}\right)$

Any random variable $X$ with this c.d.f./p.d.f. must satisfy $\mathrm{E}[X] = \mu$, $\mathrm{Var}[X] = \sigma^2$, and $\mathrm{SD}[X] = \sigma$; we write $X \sim N(\mu, \sigma^2)$.
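In practice, probabilities for a general normal variable are computed by standardizing. A minimal sketch, where the values of $\mu$, $\sigma$, and $x$ are arbitrary:

```python
from scipy.stats import norm

mu, sigma = 10.0, 2.0  # hypothetical parameters
x = 13.0

# F_X(x) = Phi((x - mu) / sigma): standardize, then use the standard normal c.d.f.
print(norm.cdf((x - mu) / sigma))        # via standardization
print(norm.cdf(x, loc=mu, scale=sigma))  # same value, letting scipy handle it
```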

Exponential Random Variable

Assume that cars arrive according to a Poisson process with rate λ, i.e. the number of cars arriving within a fixed unit time period is a Poisson random variable with parameter λ. Over a period of time $x$, the number of arrivals $N$ then follows a Poisson distribution with parameter $\lambda x$. Let $X$ be the waiting time to the first car arrival. Then

$P(X > x) = 1 - P(X \leq x) = P(N = 0) = e^{-\lambda x}$

We say that $X$ follows an exponential distribution $\mathrm{Exp}(\lambda)$, and

$F_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 - e^{-\lambda x} & \text{for } x \geq 0 \end{cases}$
$f_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ \lambda e^{-\lambda x} & \text{for } x \geq 0 \end{cases}$
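Note that scipy parameterizes the exponential distribution by its scale $1/\lambda$ rather than by the rate $\lambda$. A short sketch with an arbitrary rate, comparing the library c.d.f. against the closed form:

```python
import math
from scipy.stats import expon

lam = 0.5  # hypothetical rate parameter
x = 3.0

# scipy parameterizes Exp(lambda) via scale = 1/lambda
print(expon.cdf(x, scale=1/lam))  # F_X(x)
print(1 - math.exp(-lam * x))     # closed form 1 - e^{-lambda x}, same value
```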

Properties

If $X \sim \mathrm{Exp}(\lambda)$, then:

  • $\mu = \mathrm{E}[X] = 1/\lambda$
  • $\sigma^2 = \mathrm{Var}[X] = 1/\lambda^2$

Memoryless Property:

$P(X > s + t \mid X > t) = P(X > s)$
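The memoryless property can be checked empirically: conditioning on having already waited past $t$ and then waiting a further $s$ gives the same probability as waiting $s$ from scratch. A minimal simulation sketch, where the values of $\lambda$, $s$, and $t$ are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, s, t = 1.0, 0.5, 2.0
X = rng.exponential(scale=1/lam, size=1_000_000)

# Conditional probability P(X > s + t | X > t)
cond = np.mean(X[X > t] > s + t)
print(cond, np.mean(X > s))  # both ~ e^{-lam*s} ~ 0.6065
```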

Gamma Random Variable

Let $Y$ be the waiting time until the $k$-th car arrival; then $Y$ follows a Gamma distribution with parameters $k$ and $\lambda$, whose p.d.f. is

$f_Y(y) = \begin{cases} 0 & \text{for } y < 0 \\ \frac{y^{k-1}}{(k-1)!}\lambda^k e^{-\lambda y} & \text{for } y \geq 0 \end{cases}$

$\mu = \mathrm{E}[Y] = \frac{k}{\lambda}$ and $\sigma^2 = \mathrm{Var}[Y] = \frac{k}{\lambda^2}$
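Since the waiting time to the $k$-th arrival is a sum of $k$ i.i.d. $\mathrm{Exp}(\lambda)$ inter-arrival times, both moments can be checked by simulation. A short sketch with arbitrary $k$ and $\lambda$:

```python
import numpy as np

rng = np.random.default_rng(2)
k, lam = 3, 2.0

# Waiting time to the k-th arrival = sum of k i.i.d. Exp(lambda) inter-arrival times
Y = rng.exponential(scale=1/lam, size=(1_000_000, k)).sum(axis=1)
print(Y.mean(), Y.var())  # ~ k/lam = 1.5 and ~ k/lam^2 = 0.75
```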

Joint Distributions

A joint probability density function of continuous random variables $X$ and $Y$ is a function $f(x, y)$ satisfying:

  1. $f(x, y) \geq 0$ for all $x, y$
  2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
  3. $P(A) = \iint_A f(x, y)\,dx\,dy$, where $A \subseteq \mathbb{R}^2$
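Properties 2 and 3 can be verified numerically with a double integral. A minimal sketch, assuming the hypothetical joint p.d.f. $f(x, y) = e^{-(x+y)}$ for $x, y \geq 0$ (two independent $\mathrm{Exp}(1)$ coordinates):

```python
import math
from scipy import integrate

# Hypothetical joint p.d.f.; dblquad expects the arguments in the order (y, x)
f = lambda y, x: math.exp(-(x + y))

# Property 2: total probability is 1
total, _ = integrate.dblquad(f, 0, math.inf, 0, math.inf)
print(total)  # ~1.0

# Property 3 with A = [0, 1] x [0, 1]: P(X < 1, Y < 1)
prob, _ = integrate.dblquad(f, 0, 1, 0, 1)
print(prob)   # (1 - e^{-1})^2 ~ 0.3996
```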

Normal Approximation of the Binomial Distribution

If $X \sim B(n, p)$, then we may interpret $X$ as a sum of independent and identically distributed random variables:

$X = I_1 + I_2 + \cdots + I_n$, where each $I_i \sim B(1, p)$.

Thus, according to the Central Limit Theorem (more on this later), for large nn, we have

$\frac{X - np}{\sqrt{np(1-p)}} \approx N(0, 1)$

That is, for large $n$, if $X \sim B(n, p)$ then $X \approx N(np, np(1 - p))$.
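A quick simulation illustrates the claim: standardized binomial samples have mean about 0 and standard deviation about 1. A minimal sketch with arbitrary $n$ and $p$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 0.3

# Simulate X ~ B(n, p) many times and standardize
X = rng.binomial(n, p, size=1_000_000)
Z = (X - n * p) / np.sqrt(n * p * (1 - p))
print(Z.mean(), Z.std())  # ~0 and ~1, consistent with Z ≈ N(0, 1)
```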

Normal Approximation with Continuity Correction

Let $X \sim B(n, p)$. Recall that $\mathrm{E}[X] = np$ and $\mathrm{Var}[X] = np(1 - p)$. If $n$ is large, we may approximate $X$ by a normal random variable in the following way:

$P(X \leq x) = P(X < x + 0.5) \approx P\left(Z < \frac{x - np + 0.5}{\sqrt{np(1-p)}}\right)$

and

$P(X \geq x) = P(X > x - 0.5) \approx P\left(Z > \frac{x - np - 0.5}{\sqrt{np(1-p)}}\right)$
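The effect of the correction can be seen by comparing both approximations against the exact binomial c.d.f. A short sketch, where the values of $n$, $p$, and $x$ are arbitrary:

```python
import math
from scipy.stats import binom, norm

n, p = 100, 0.3
mu, sd = n * p, math.sqrt(n * p * (1 - p))
x = 35

exact = binom.cdf(x, n, p)                # exact P(X <= x)
no_cc = norm.cdf((x - mu) / sd)           # without continuity correction
with_cc = norm.cdf((x - mu + 0.5) / sd)   # with continuity correction
print(exact, no_cc, with_cc)  # the corrected value is closer to the exact one
```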