
Descriptive vs Inferential Statistics – A Simple Start

Before we teach a computer how to “learn,” we first need to understand our data.
That’s where statistics comes in — not the scary kind with symbols and proofs, but the practical kind that helps us:

✔️ Understand the data we have
✔️ Ask the right questions
✔️ Make smart guesses about new data

In this post, we’ll look at two basic types of statistics you need to know:


1️⃣ Descriptive Statistics: “What do I see?”

Descriptive statistics help you describe and summarize a set of data.

Imagine you have a list of exam scores for a class of students. Descriptive stats can tell you:

| Question | Descriptive Tool | Example Answer |
|---|---|---|
| What’s the average score? | Mean | 75 out of 100 |
| Are most scores similar? | Standard Deviation | Yes, they’re close |
| What’s the highest/lowest? | Min / Max | 98 and 45 |
| How are scores spread out? | Range / Histogram | Most are in the 70s |

🟠 Think of it as a summary card for your data.

Practical Example: Calculating Descriptive Statistics in Python

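import numpy as np
import matplotlib.pyplot as plt

# Example exam scores (illustrative values)
scores = [75, 82, 68, 91, 45, 98, 73, 77, 70, 71]

# Summary numbers for the data we have
print("Mean:", np.mean(scores))
print("Standard Deviation:", np.std(scores))  # population SD (NumPy default, ddof=0)
print("Minimum:", np.min(scores))
print("Maximum:", np.max(scores))

# Histogram: how the scores are spread out
plt.hist(scores, bins=5, color='skyblue', edgecolor='black')
plt.title('Exam Score Distribution')
plt.xlabel('Score')
plt.ylabel('Frequency')
plt.show()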


2️⃣ Inferential Statistics: “What can I guess about others?”

Now imagine you only saw 10 scores out of 100 students. You might want to:

  • Guess the average for the whole class
  • Predict how future students will do
  • Compare one group’s scores to another

That’s what inferential statistics does — it helps us make educated guesses about a bigger group based on a smaller sample.

| Situation | Inferential Thinking |
|---|---|
| You try a new teaching method with 10 students | “Will this help the whole class?” |
| You test a drug on 50 people | “Will it work for everyone?” |
| You train a model on part of the data | “Will it work on new data?” |

🟢 It’s all about prediction and generalization.
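
To make the "10 scores out of 100" idea concrete, here is a minimal sketch that guesses the class average from a random sample of 10 and attaches a rough 95% range to that guess. Everything in it is an assumption for illustration: the class is simulated, the names (`population`, `sample`, `T_CRIT_DF9`) are made up, and the range uses the textbook t-value for 9 degrees of freedom.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical "whole class": 100 simulated scores (not real data)
population = [min(100, max(0, round(random.gauss(75, 10)))) for _ in range(100)]

# We only get to see 10 of them, chosen at random
sample = random.sample(population, 10)

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
standard_error = sample_sd / math.sqrt(len(sample))

# Approximate 95% interval using the t critical value for 9 degrees of freedom
T_CRIT_DF9 = 2.262
low = sample_mean - T_CRIT_DF9 * standard_error
high = sample_mean + T_CRIT_DF9 * standard_error

print(f"Sample mean: {sample_mean:.1f}")
print(f"95% interval for the class mean: {low:.1f} to {high:.1f}")
print(f"True class mean (normally unknown): {statistics.mean(population):.1f}")
```

The sample mean is the guess; the interval is an honest statement of how far off that guess could plausibly be. A bigger sample would usually shrink the interval.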


🗺️ When to Use Each?

  • Use descriptive statistics when you want to summarize or explore the data you have.
  • Use inferential statistics when you want to make predictions or generalizations about a larger group based on a sample.

⚠️ Common Mistakes

Don’t use inferential statistics if you already have data for the whole population—just describe it!

Be careful: Inferential statistics require that your sample is random and representative of the population.
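
Here is a small sketch of why the "random and representative" part matters. It compares a biased sample (only the top scorers) with a random sample drawn from the same simulated class. The data and variable names are hypothetical, chosen only to illustrate the point.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical class of 100 simulated scores (not real data)
population = [min(100, max(0, round(random.gauss(70, 12)))) for _ in range(100)]

# Biased sample: only the 10 highest scorers (e.g. students who volunteered)
biased_sample = sorted(population, reverse=True)[:10]

# Random sample: every student has the same chance of being picked
random_sample = random.sample(population, 10)

print("Class mean:        ", statistics.mean(population))
print("Biased sample mean:", statistics.mean(biased_sample))
print("Random sample mean:", statistics.mean(random_sample))
```

The biased sample overstates the class average because of how it was chosen. The random sample can still miss, but it is not pushed in one direction.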


👀 Visual Summary

Descriptive vs Inferential

Imagine you’re tasting soup:

  • Descriptive: You taste the whole pot. “It’s salty.”
  • Inferential: You take one spoon and guess: “I think the whole pot is salty.”

🍲 That’s the difference!


🧠 Why This Matters for Machine Learning

Machine learning uses both types:

| Task | What It Uses |
|---|---|
| Cleaning and exploring data | Descriptive stats |
| Training on sample data | Inferential stats |
| Making predictions | Inferential thinking |
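
As a rough illustration of "train on part of the data, then ask whether it works on new data," the sketch below fits a straight line to 80% of a simulated dataset and checks its error on the 20% it never saw. The hours-versus-score data, the 80/20 split, and the helper `mean_abs_error` are all assumptions made for this example, not a prescribed workflow.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical data: hours studied -> exam score, with noise (simulated)
hours = [random.uniform(0, 10) for _ in range(100)]
scores = [50 + 4 * h + random.gauss(0, 5) for h in hours]
data = list(zip(hours, scores))

# Hold out 20% of the rows; the model never sees them during fitting
random.shuffle(data)
train, test = data[:80], data[80:]

# Fit a straight line by ordinary least squares on the training rows only
mean_x = statistics.mean(x for x, _ in train)
mean_y = statistics.mean(y for _, y in train)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x


def mean_abs_error(rows):
    """Average absolute gap between predicted and actual scores."""
    return statistics.mean(abs((intercept + slope * x) - y) for x, y in rows)


print(f"Error on training data: {mean_abs_error(train):.2f}")
print(f"Error on held-out data: {mean_abs_error(test):.2f}")
```

The held-out error is the inferential part: it is a sample-based estimate of how the model would do on data it has not seen.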

Even if you haven’t learned ML yet — this is your foundation.


🧠 Level Up: Why Inferential Statistics Matter in Machine Learning

While descriptive statistics summarize the data you have, inferential statistics let you:

  • 🔮 Make predictions or decisions based on sample data
  • 📊 Test hypotheses to understand if patterns are meaningful
  • 🔍 Estimate properties of a larger population from limited observations
  • 🤖 Form the mathematical foundation behind many machine learning algorithms

Understanding the difference helps you know when you’re just describing versus when you’re generalizing — a critical skill in data science and ML.
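
One simple way to "test whether a pattern is meaningful" is a permutation test. The sketch below uses made-up scores for two groups of students and asks how often a gap as large as the observed one would appear if the group labels were purely arbitrary. The scores, group names, and number of shuffles are assumptions for illustration, and the test is one-sided (it only asks whether the new method scores higher).

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical scores (made up for illustration)
new_method = [78, 85, 80, 90, 74, 88, 82, 79, 91, 84]
old_method = [72, 75, 80, 68, 77, 70, 74, 79, 73, 76]


def mean_gap(group_a, group_b):
    """Difference between the two group averages."""
    return sum(group_a) / len(group_a) - sum(group_b) / len(group_b)


observed_gap = mean_gap(new_method, old_method)

# If the method made no difference, the group labels are arbitrary, so we
# shuffle them many times and count how often a gap this large shows up.
pooled = new_method + old_method
n_new = len(new_method)
n_rounds = 10_000
at_least_as_large = 0
for _ in range(n_rounds):
    random.shuffle(pooled)
    if mean_gap(pooled[:n_new], pooled[n_new:]) >= observed_gap:
        at_least_as_large += 1

p_value = at_least_as_large / n_rounds
print(f"Observed gap: {observed_gap:.1f} points")
print(f"Share of shuffles with a gap at least this large: {p_value:.4f}")
```

A small share suggests the gap is unlikely to be a fluke of who ended up in which group. It does not by itself prove the teaching method caused the difference.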


🏆 Real-World Mini Case Study: Predicting Voter Preferences

Suppose you want to know who will win an election. You can’t ask every voter, so you survey a random sample of 1,000 people.

  • Descriptive statistics: Summarize the survey results (e.g., 48% prefer Candidate A).
  • Inferential statistics: Estimate the true support for Candidate A in the whole country, and calculate a margin of error.

This is the same logic used when evaluating how well a machine learning model will perform on unseen data!
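
For the survey numbers above (1,000 people, 48% for Candidate A), a margin of error can be sketched with the usual normal approximation for a proportion. This assumes a simple random sample and a 95% confidence level (z = 1.96); real polls often apply further adjustments that are not shown here.

```python
import math

n = 1000      # people surveyed (from the example above)
p_hat = 0.48  # share preferring Candidate A in the sample

# Standard error of a sample proportion, then a 95% margin of error
# using the normal approximation (z = 1.96)
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = 1.96 * standard_error

print(f"Estimated support: {p_hat:.0%} ± {margin:.1%}")
print(f"Plausible range: {p_hat - margin:.1%} to {p_hat + margin:.1%}")
```

The margin comes out at roughly 3 percentage points, so a 48% result in the sample is compatible with true support somewhat below or above 50%.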


📌 Try It Yourself

Q: If you summarize test scores from 100 students using an average and a histogram, are you using descriptive or inferential statistics?

💡 Answer

Descriptive statistics — you're describing the data you already have, not making predictions or generalizations.


📚 Quick Glossary

  • Mean: The average value.
  • Standard Deviation: A measure of how spread out the numbers are.
  • Sample: A subset of data from a larger group.
  • Population: The entire group you care about.
  • Prediction: Using data to guess about something unknown.

✅ Summary

| Descriptive | Inferential |
|---|---|
| Describes what we have | Helps us guess about the unknown |
| No prediction involved | Focuses on prediction & decisions |
| Uses the full data | Often uses samples |

🚀 What’s Next?

In the next post, we’ll explore two tools that help us work with data:

  • Data Matrix: a simple way to organize information
  • Frequency Tables: to see how often things appear

Have questions or want to share your own examples? Drop a comment below or suggest a topic you’d like to see next!

Stay tuned!

This post is licensed under CC BY 4.0 by the author.