Common Probability Distributions

Visual guide to common probability distributions with their properties and use cases.


Normal (Gaussian) Distribution

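Symmetric bell curve - appears widely in nature and in measurement error, largely because sums of many small independent effects tend toward it (central limit theorem).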

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$

  • Mean: $\mu$
  • Variance: $\sigma^2$
  • Symmetric around mean
  • 68-95-99.7 rule: about 68%, 95%, and 99.7% of the probability lies within 1σ, 2σ, and 3σ of the mean
from scipy import stats
import numpy as np

# Generate samples (NumPy calls the mean `loc` and the standard deviation `scale`)
samples = np.random.normal(loc=0, scale=1, size=1000)

# PDF evaluated on a grid of x values
x = np.linspace(-4, 4, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)

# CDF on the same grid
cdf = stats.norm.cdf(x, loc=0, scale=1)
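
The 68-95-99.7 rule listed above can be checked directly from the CDF. The following is a minimal sketch; the helper name `normal_coverage` is illustrative and not part of SciPy.

```python
from scipy import stats

def normal_coverage(k, mu=0.0, sigma=1.0):
    """Probability that a Normal(mu, sigma) value lies within k standard deviations of the mean."""
    dist = stats.norm(loc=mu, scale=sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Coverage for 1, 2, and 3 standard deviations
coverage = {k: normal_coverage(k) for k in (1, 2, 3)}
for k, prob in coverage.items():
    print(f"within {k} sigma: {prob:.4f}")
```

The result does not depend on the particular `mu` and `sigma`, since the interval is expressed in units of the standard deviation.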

Binomial Distribution

Discrete - number of successes in $n$ independent trials, each with success probability $p$.

$$ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} $$

  • Parameters: $n$ (trials), $p$ (success probability)
  • Mean: $np$
  • Variance: $np(1-p)$
import numpy as np

# Binomial: n=10 trials, p=0.5 success probability
samples = np.random.binomial(n=10, p=0.5, size=1000)
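
As a quick check of the PMF formula and the moments above, the sketch below evaluates the formula by hand and compares it with `scipy.stats.binom`. The helper `binomial_pmf` is an illustrative name, and the parameter values simply reuse the example's `n=10`, `p=0.5`.

```python
from math import comb

import numpy as np
from scipy import stats

n, p = 10, 0.5
k = np.arange(n + 1)  # all possible success counts 0..n

def binomial_pmf(k, n, p):
    """P(X = k) computed directly from the closed-form expression."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

pmf_manual = np.array([binomial_pmf(int(ki), n, p) for ki in k])
pmf_scipy = stats.binom.pmf(k, n, p)

# Mean and variance reported by SciPy; these should equal n*p and n*p*(1-p)
binom_mean, binom_var = (float(m) for m in stats.binom.stats(n, p, moments="mv"))

print(np.allclose(pmf_manual, pmf_scipy), binom_mean, binom_var)
```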

Poisson Distribution

Discrete - number of events in a fixed interval when events occur independently at a constant average rate (often used for rare events).

$$ P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} $$

  • Parameter: $\lambda$ (average rate)
  • Mean: $\lambda$
  • Variance: $\lambda$
  • Models: arrivals, defects, calls per hour
import numpy as np

# Poisson: lambda=3 average events per interval
samples = np.random.poisson(lam=3, size=1000)
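
The "rare events" reading can be made concrete: a Binomial with many trials and a small success probability $p = \lambda/n$ is closely approximated by a Poisson with rate $\lambda$. The sketch below compares the two PMFs; `n_trials = 10_000` is an arbitrary illustrative choice.

```python
import numpy as np
from scipy import stats

lam = 3
k = np.arange(0, 15)

# Poisson PMF with rate lam
poisson_pmf = stats.poisson.pmf(k, mu=lam)

# Binomial with many trials and small per-trial probability lam / n_trials
n_trials = 10_000
binom_pmf = stats.binom.pmf(k, n_trials, lam / n_trials)

# Largest pointwise difference between the two PMFs
max_gap = float(np.max(np.abs(poisson_pmf - binom_pmf)))

# Mean and variance of the Poisson are both equal to lam
pois_mean, pois_var = (float(m) for m in stats.poisson.stats(lam, moments="mv"))

print(f"max PMF gap: {max_gap:.2e}, mean={pois_mean}, var={pois_var}")
```

Increasing `n_trials` while holding `lam` fixed shrinks the gap further.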

Exponential Distribution

Continuous - time between events (memoryless property).

$$ f(x) = \lambda e^{-\lambda x}, \quad x \geq 0 $$

  • Parameter: $\lambda$ (rate)
  • Mean: $1/\lambda$
  • Variance: $1/\lambda^2$
  • Memoryless: $P(X > s+t | X > s) = P(X > t)$
import numpy as np

# Exponential: lambda=0.5 (NumPy takes scale = 1/lambda, not the rate)
samples = np.random.exponential(scale=1/0.5, size=1000)
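
The memoryless property can be verified numerically with the survival function $P(X > x)$, available in SciPy as `sf`. This is a small sketch; the values of `s` and `t` are arbitrary.

```python
from scipy import stats

rate = 0.5
dist = stats.expon(scale=1 / rate)  # SciPy parameterizes by scale = 1/lambda

s, t = 2.0, 3.0

# P(X > s + t | X > s) = P(X > s + t) / P(X > s)
conditional = dist.sf(s + t) / dist.sf(s)

# P(X > t)
unconditional = dist.sf(t)

print(f"conditional={conditional:.6f}, unconditional={unconditional:.6f}")
```

Both quantities equal $e^{-\lambda t}$, so having already waited `s` units does not change the distribution of the remaining wait.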

Gamma Distribution

Continuous - for integer $\alpha$, the sum of $\alpha$ independent exponential random variables with rate $\beta$, i.e. the waiting time until the $\alpha$-th event.

$$ f(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}, \quad x \geq 0 $$

  • Parameters: $\alpha$ (shape), $\beta$ (rate)
  • Mean: $\alpha/\beta$
  • Variance: $\alpha/\beta^2$
  • Special case: $\alpha=1$ gives Exponential
import numpy as np
from scipy import stats

# Gamma distribution (NumPy and SciPy both take scale = 1/beta, not the rate)
alpha, beta = 2, 1
samples = np.random.gamma(alpha, 1/beta, size=1000)

# Or using scipy
x = np.linspace(0, 10, 100)
pdf = stats.gamma.pdf(x, a=alpha, scale=1/beta)
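
Two of the properties above lend themselves to a quick numerical check: the sum of $\alpha$ independent exponentials behaves like a Gamma, and $\alpha = 1$ reduces to the Exponential. The sketch below assumes integer `alpha`; the seed and sample size are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta = 2, 1.0

# Sum alpha independent Exponential(beta) draws per row
exp_draws = rng.exponential(scale=1 / beta, size=(100_000, alpha))
sums = exp_draws.sum(axis=1)

# Sample moments should be close to alpha/beta and alpha/beta**2
sample_mean = float(sums.mean())
sample_var = float(sums.var())

# Special case: Gamma with alpha=1 has the same PDF as Exponential(beta)
x = np.linspace(0.1, 10, 50)
gamma_pdf_alpha1 = stats.gamma.pdf(x, a=1, scale=1 / beta)
expon_pdf = stats.expon.pdf(x, scale=1 / beta)

print(f"mean={sample_mean:.3f}, var={sample_var:.3f}")
print("alpha=1 matches exponential:", np.allclose(gamma_pdf_alpha1, expon_pdf))
```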

Pareto Distribution

Heavy-tailed - power law distribution (80/20 rule, wealth distribution).

$$ f(x) = \frac{\alpha x_m^\alpha}{x^{\alpha+1}}, \quad x \geq x_m $$

  • Parameters: $\alpha$ (shape), $x_m$ (scale/minimum)
  • Mean: $\frac{\alpha x_m}{\alpha - 1}$ for $\alpha > 1$
  • Heavy tail: $P(X > x) \propto x^{-\alpha}$

Note: A heavy tail means that extreme values are far more likely than under a normal distribution, whose tail probabilities decay much faster than any power law.

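import numpy as np
from scipy import stats

# Pareto distribution
alpha, xm = 2, 1
# np.random.pareto draws from the Lomax (Pareto II) form starting at 0,
# so shift by 1 and scale by xm to get the classical Pareto on [xm, inf)
samples = (np.random.pareto(alpha, size=1000) + 1) * xm

# Or using scipy
x = np.linspace(xm, 10, 100)
pdf = stats.pareto.pdf(x, alpha, scale=xm)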
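
To illustrate the heavy tail, the sketch below compares the Pareto survival function $P(X > x)$ with the closed form $(x_m/x)^\alpha$ and with the tail of a standard normal. The standard normal is only a convenient reference for scale, not a fitted match, and the thresholds are arbitrary.

```python
from scipy import stats

alpha, xm = 2, 1
pareto = stats.pareto(alpha, scale=xm)
normal = stats.norm(loc=0, scale=1)  # reference distribution, chosen for illustration

tail_table = {}
for x in (5, 10, 20):
    tail_table[x] = {
        "pareto_sf": float(pareto.sf(x)),    # P(X > x) from SciPy
        "power_law": (xm / x) ** alpha,      # closed form (x_m / x)^alpha
        "normal_sf": float(normal.sf(x)),    # standard normal tail at the same x
    }

for x, row in tail_table.items():
    print(x, row)
```

At `x = 10` the Pareto tail probability is 0.01, while the standard normal tail is many orders of magnitude smaller.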

Skewed Distributions

Log-Normal Distribution

Right-skewed - if $Y \sim \mathcal{N}(\mu, \sigma^2)$, then $X = e^Y$ follows a log-normal distribution.

$$ f(x) = \frac{1}{x\sigma\sqrt{2\pi}} e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}, \quad x > 0 $$

import numpy as np

# Log-normal distribution (mu and sigma describe the underlying normal, not X itself)
mu, sigma = 0, 1
samples = np.random.lognormal(mu, sigma, size=1000)
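
The relationship to the normal distribution can be checked by taking logs of the samples. The sketch below also compares the closed-form mean $e^{\mu + \sigma^2/2}$ with SciPy's value, assuming SciPy's parameterization `lognorm(s=sigma, scale=exp(mu))`. The seed and sample size are arbitrary.

```python
import numpy as np
from scipy import stats

mu, sigma = 0.0, 1.0
rng = np.random.default_rng(0)

# Draw log-normal samples and map them back to the underlying normal
samples = rng.lognormal(mean=mu, sigma=sigma, size=200_000)
log_samples = np.log(samples)

log_mean = float(log_samples.mean())  # should be close to mu
log_std = float(log_samples.std())    # should be close to sigma

# Mean of X itself: closed form versus SciPy
theoretical_mean = float(np.exp(mu + sigma**2 / 2))
scipy_mean = float(stats.lognorm(s=sigma, scale=np.exp(mu)).mean())

print(f"log mean={log_mean:.3f}, log std={log_std:.3f}")
print(f"E[X]: formula={theoretical_mean:.4f}, scipy={scipy_mean:.4f}")
```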

Distribution Comparison

| Distribution | Type | Parameters | Mean | Use Case |
|---|---|---|---|---|
| Normal | Continuous | $\mu, \sigma$ | $\mu$ | Natural phenomena, errors |
| Binomial | Discrete | $n, p$ | $np$ | Success/failure trials |
| Poisson | Discrete | $\lambda$ | $\lambda$ | Rare events, arrivals |
| Exponential | Continuous | $\lambda$ | $1/\lambda$ | Time between events |
| Gamma | Continuous | $\alpha, \beta$ | $\alpha/\beta$ | Waiting times |
| Pareto | Continuous | $\alpha, x_m$ | $\frac{\alpha x_m}{\alpha-1}$ | Power laws, wealth |
| Log-Normal | Continuous | $\mu, \sigma$ | $e^{\mu + \sigma^2/2}$ | Multiplicative processes |
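
The Mean column can be cross-checked against SciPy's frozen distributions. The sketch below reuses the parameter values from the examples in this guide; the dictionary layout and variable names are illustrative.

```python
from math import exp, isclose

from scipy import stats

rate = 0.5           # exponential rate (lambda)
alpha, beta = 2, 1   # gamma shape and rate
p_alpha, xm = 2, 1   # pareto shape and minimum
mu, sigma = 0, 1     # parameters of the underlying normal

# name -> (frozen SciPy distribution, mean from the closed-form expression)
table_means = {
    "Normal": (stats.norm(loc=0, scale=1), 0.0),
    "Binomial": (stats.binom(10, 0.5), 10 * 0.5),
    "Poisson": (stats.poisson(3), 3.0),
    "Exponential": (stats.expon(scale=1 / rate), 1 / rate),
    "Gamma": (stats.gamma(a=alpha, scale=1 / beta), alpha / beta),
    "Pareto": (stats.pareto(p_alpha, scale=xm), p_alpha * xm / (p_alpha - 1)),
    "Log-Normal": (stats.lognorm(s=sigma, scale=exp(mu)), exp(mu + sigma**2 / 2)),
}

# Compare SciPy's mean with the formula for each row
mean_matches = {
    name: isclose(float(dist.mean()), formula, abs_tol=1e-9)
    for name, (dist, formula) in table_means.items()
}

for name, ok in mean_matches.items():
    print(f"{name:12s} formula matches scipy: {ok}")
```

Note that SciPy uses scale parameters where the formulas use rates, so $\lambda$ and $\beta$ enter as `scale=1/rate` and `scale=1/beta`.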
