What Are Random Variables and How Do We Visualize Their Distributions?

How can we model the outcome of a random process?
What’s the difference between discrete and continuous probability?
And how do we visualize all of that?

This post answers these questions — with intuitive charts and real-world examples.


📚 This post is part of the "Intro to Statistics" series

🔙 Previously: Understanding Independence and Bayes’ Rule

🔜 Next: Summary Statistics of Probability Distributions


🎲 What Is a Random Variable?

A random variable assigns a number to each outcome of a random phenomenon.

Its value changes from one trial or observation to the next, like the result of a die roll, the temperature in your city, or a person’s height.
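
One way to picture this is to treat a random variable as a rule that turns each outcome into a number. The sketch below uses two coin flips as an assumed example; `sample_space` and `number_of_heads` are illustrative names, not part of any library.

```python
from itertools import product

# Sample space of two coin flips: every outcome is a tuple such as ("H", "T").
sample_space = list(product("HT", repeat=2))

def number_of_heads(outcome):
    """Random variable X: maps an outcome to a number (the count of heads)."""
    return outcome.count("H")

# The value X takes for each outcome in the sample space.
values = {outcome: number_of_heads(outcome) for outcome in sample_space}
for outcome, x in values.items():
    print(outcome, "->", x)
```

The outcomes themselves are not numbers; the random variable is what makes them countable and plottable.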


🧱 Types of Random Variables

| Type | Description | Examples |
| --- | --- | --- |
| Discrete | Takes a countable number of values | # of calls/day, die roll result |
| Continuous | Takes any value within an interval (infinite possibilities) | Height, temperature, weight |

📊 How Do We Work With Random Variables?

We use probability distributions to describe how likely each outcome is.

A probability distribution can be expressed as:

  • A table
  • A graph
  • An equation

Depending on the variable type, we use:

| Type | Distribution Function |
| --- | --- |
| Discrete | Probability Mass Function (PMF) |
| Continuous | Probability Density Function (PDF) |

🔍 Visual: PMF (Discrete Distribution)

[Figure: PMF plot]

Each bar shows the probability of an exact outcome.
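
To make the idea of a PMF concrete, here is a minimal sketch that builds one by enumeration. The example (the total of two fair six-sided dice) is an assumption chosen for illustration, and the text bars only stand in for a real chart.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# PMF: probability of each exact total, kept as exact fractions.
pmf = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, p in pmf.items():
    # One '#' per favourable outcome gives a rough text bar chart.
    print(f"P(X = {total:2d}) = {str(p):>5}  " + "#" * counts[total])

# A valid PMF sums to exactly 1.
assert sum(pmf.values()) == 1
```

Using `Fraction` keeps the probabilities exact, which makes the "sums to 1" check reliable.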


🔍 Visual: PDF (Continuous Distribution)

[Figure: PDF plot]

The area under the curve (not the height) represents probability.
For a continuous variable, \( P(X = 5) \) is exactly 0, so we ask about intervals instead: \( P(a \le X \le b) \).
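
The difference between height and area is easier to see with numbers. This sketch assumes a normal distribution with mean 5 and standard deviation 0.1 (parameters picked only for illustration) and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Assumed example: a normal distribution with a small spread.
X = NormalDist(mu=5.0, sigma=0.1)

# The density at a point is a height, not a probability: here it exceeds 1.
height_at_5 = X.pdf(5.0)

def prob_between(a, b):
    """P(a <= X <= b), computed as a difference of CDF values (an area)."""
    return X.cdf(b) - X.cdf(a)

print(f"density at 5: {height_at_5:.3f}")

# Shrinking the interval around 5 drives the probability toward 0.
for half_width in (0.1, 0.01, 0.001):
    p = prob_between(5 - half_width, 5 + half_width)
    print(f"half-width {half_width}: probability = {p:.4f}")
```

A density above 1 is not a mistake: only the area over an interval has to stay between 0 and 1.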


⚖️ Why Are Discrete Probabilities Simpler?

With discrete random variables, calculating probabilities is straightforward: you add up the probabilities of the individual values you care about:

\( P(X = 2 \text{ or } X = 3) = P(X = 2) + P(X = 3) \)

In contrast, with continuous variables, you need to integrate the density over an interval to get the area under the curve, which often requires formulas or software.
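
Here is a small side-by-side sketch of the two calculations. The discrete values are the ones from this post's discrete CDF example; the continuous part assumes a standard normal distribution and a hand-written trapezoidal rule (`trapezoid_area` is a hypothetical helper, not a library function).

```python
from statistics import NormalDist

# Discrete: add the probabilities of the values of interest.
pmf = {1: 0.10, 2: 0.30, 3: 0.20, 4: 0.25, 5: 0.15}
p_two_or_three = pmf[2] + pmf[3]

def trapezoid_area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

# Continuous: integrate the density over the interval [-1, 1].
Z = NormalDist()  # standard normal, chosen only as an example
p_interval = trapezoid_area(Z.pdf, -1.0, 1.0)

print(f"P(X = 2 or X = 3) = {p_two_or_three:.2f}")
print(f"P(-1 <= Z <= 1)   = {p_interval:.4f}")
```

In practice you would rarely write the integration yourself; the point is only that the continuous case needs an area, while the discrete case needs a sum.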


📈 Cumulative Distribution Function (CDF)

The Cumulative Distribution Function answers:

What is the probability that \( X \) is less than or equal to some value?

We can compute CDFs for both discrete and continuous variables.
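
Written as formulas, the two cases look like this. The symbols \( F \) (for the CDF) and \( f \) (for the density) are notation introduced here for illustration:

\[ F(x) = P(X \le x) = \sum_{t \le x} P(X = t) \quad \text{(discrete)} \]

\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\, dt \quad \text{(continuous)} \]

In both cases the CDF accumulates everything at or to the left of \( x \): a sum for a PMF, an area for a PDF.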


🧪 Example: CDF (Discrete)

| x | P(X = x) | P(X ≤ x) |
| --- | --- | --- |
| 1 | 0.1 | 0.1 |
| 2 | 0.3 | 0.4 |
| 3 | 0.2 | 0.6 |
| 4 | 0.25 | 0.85 |
| 5 | 0.15 | 1.0 |

[Figure: CDF of a discrete variable]

Each step adds the probability of the current value to the running total.
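
The running total can be reproduced in a few lines. This is a minimal sketch using the values from the example above and `itertools.accumulate` from the standard library.

```python
from itertools import accumulate

xs = [1, 2, 3, 4, 5]
pmf = [0.10, 0.30, 0.20, 0.25, 0.15]  # P(X = x) from the example

# A running total of the PMF gives the CDF, P(X <= x).
cdf = list(accumulate(pmf))

for x, p, c in zip(xs, pmf, cdf):
    print(f"x = {x}  P(X = x) = {p:.2f}  P(X <= x) = {c:.2f}")
```

The values are rounded when printed because adding floating-point numbers can leave tiny errors in the last digits.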


📊 Example: CDF (Continuous)

[Figure: CDF of a continuous variable]

This curve shows \( P(X \le x) \) for every \( x \), and it never decreases.
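
To see the shape without a plot, you can evaluate a CDF on a grid of points. The sketch assumes a standard normal distribution, chosen only as a familiar example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, an assumed example distribution

# Evaluate the CDF on an evenly spaced grid from -4 to 4.
grid = [-4 + 0.5 * i for i in range(17)]
cdf_values = [Z.cdf(x) for x in grid]

# The curve never goes down as x grows.
is_non_decreasing = all(a <= b for a, b in zip(cdf_values, cdf_values[1:]))

for x, c in zip(grid, cdf_values):
    print(f"x = {x:+.1f}  P(X <= x) = {c:.4f}")
print("never decreases:", is_non_decreasing)
```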


📉 Distribution vs Cumulative: Visual Comparison

| View | What It Shows |
| --- | --- |
| PDF / PMF | PMF: probability of each individual value. PDF: density, where probability is the area over an interval |
| CDF | Cumulative probability up to a certain point |

🎨 Visual Comparison

PDF → Use the area under curve to find probability
CDF → Read probability directly from the graph


📌 Key Properties of CDF

  • Never decreases (it can stay flat, but it never goes down)
  • Starts at 0 and reaches (or approaches) 1
  • You can find \( x \) for a given probability — or the other way around
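
These properties can be checked on the discrete example from earlier. The sketch below turns the table into a step function that works for any real number; `cdf` and `probe_points` are illustrative names.

```python
from bisect import bisect_right
from itertools import accumulate

xs = [1, 2, 3, 4, 5]
cdf_steps = list(accumulate([0.10, 0.30, 0.20, 0.25, 0.15]))

def cdf(x):
    """P(X <= x) for the discrete example, defined for any real x."""
    k = bisect_right(xs, x)  # number of possible values that are <= x
    return 0.0 if k == 0 else cdf_steps[k - 1]

probe_points = [0, 1, 1.5, 2, 2.9, 3, 4, 5, 10]
probe_values = [cdf(x) for x in probe_points]

for x, value in zip(probe_points, probe_values):
    print(f"P(X <= {x}) = {value:.2f}")
```

Between two possible values (for example at 1 and 1.5) the function stays flat, which is why "never decreases" is the safer way to state the first property.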

🎯 What Is a Quantile?

A quantile tells us the value at a certain cumulative probability.

  • The median is the 0.5 quantile → 50% of values lie below
  • The 0.9 quantile means 90% of values are below that point

🔍 Visual Example

[Figure: quantile concept]

If the 90th percentile is 8.1, then \( P(X \le 8.1) = 0.90 \)
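
Here is a short sketch of quantiles in both settings. The continuous part assumes a normal distribution with mean 5 and standard deviation 2 (illustrative parameters) and uses `NormalDist.inv_cdf`. The discrete part uses one common convention, the smallest value whose cumulative probability reaches the target; other conventions exist.

```python
from statistics import NormalDist

# Continuous case: the quantile function is the inverse of the CDF.
X = NormalDist(mu=5.0, sigma=2.0)  # assumed example parameters
q90 = X.inv_cdf(0.90)
median = X.inv_cdf(0.50)

# Discrete case: cumulative probabilities from the discrete CDF example.
xs = [1, 2, 3, 4, 5]
cdf = [0.10, 0.40, 0.60, 0.85, 1.00]

def discrete_quantile(p):
    """Smallest x with P(X <= x) >= p (one common convention)."""
    for x, c in zip(xs, cdf):
        if c >= p:
            return x
    raise ValueError("p must be between 0 and 1")

print(f"0.9 quantile of X: {q90:.3f}")
print(f"check: P(X <= q90) = {X.cdf(q90):.2f}")
print("discrete median:", discrete_quantile(0.5))
```

Feeding the quantile back into the CDF returns the probability you started with, which is the round trip described above.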


🧠 Level Up: Why CDFs Are Powerful

CDFs help you answer questions in both directions:

  • “What’s the probability of X being below a threshold?” (➡️ read from CDF)
  • “What value corresponds to 75% of cases?” (➡️ find the x-value where \( P(X \le x) = 0.75 \))

You can even invert the function to get values back from probabilities. CDFs are especially useful in:

  • Risk modeling
  • Threshold setting
  • Statistical simulations
  • Machine learning (quantile regression)
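
As one illustration of the simulation use case, inverting a CDF lets you turn uniform random numbers into samples from another distribution (often called inverse transform sampling). The sketch assumes an exponential distribution with rate 2, picked because its CDF has a simple closed-form inverse; `sample_exponential` is a hypothetical helper.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw one sample by inverting the exponential CDF.

    CDF: F(x) = 1 - exp(-rate * x), so the inverse is -ln(1 - u) / rate.
    """
    u = rng.random()  # uniform on [0, 1)
    return -math.log(1.0 - u) / rate

rng = random.Random(0)  # fixed seed so runs are repeatable
rate = 2.0              # assumed example parameter
samples = [sample_exponential(rate, rng) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
theoretical_median = math.log(2) / rate
share_below_median = sum(s <= theoretical_median for s in samples) / len(samples)

print(f"sample mean: {sample_mean:.3f} (theory: {1 / rate:.3f})")
print(f"share at or below the theoretical median: {share_below_median:.3f}")
```

If the inversion is right, about half of the samples should fall at or below the theoretical median, and the sample mean should land close to 1 / rate.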

🧠 Try It Yourself: Random Variables & Distributions

    Q1: What distinguishes a discrete random variable from a continuous one?

    💡 Answer

    Discrete variables take countable values; continuous variables can take any value within an interval.

    Q2: What is the name of the distribution function for discrete random variables?

    💡 Answer

    Probability Mass Function (PMF).

    Q3: Why is it easier to compute probabilities with discrete variables?

    💡 Answer

    Because we can directly sum the individual probabilities without needing integration.

    Q4: What does the CDF tell us?

    💡 Answer

    It gives the cumulative probability that a variable is less than or equal to a certain value.

    Q5: What is a quantile?

    💡 Answer

    The value below which a certain proportion of the data falls — for example, the median is the 0.5 quantile.


    ✅ Summary

    | Concept | Description |
    | --- | --- |
    | Random Variable | Assigns a number to each outcome of a random event |
    | Discrete | Countable outcomes (use PMF) |
    | Continuous | Any value within an interval (use PDF) |
    | PMF / PDF | Describe the probability distribution |
    | CDF | Accumulated probability up to x |
    | Quantile | Inverse of CDF: get x for a given probability |

    🔜 Up Next

    In the next post, we’ll explore summary statistics like:

    • Mean
    • Variance
    • Standard deviation
    • Expected value

    These help us describe how a probability distribution behaves.

    Stay tuned!

    This post is licensed under CC BY 4.0 by the author.