What Is Probability?
Probability is the mathematical language for quantifying uncertainty. It assigns a number between 0 and 1 to the likelihood of an event:
P(event) = 0 → impossible (will never happen)
P(event) = 1 → certain (will always happen)
P(event) = 0.5 → equally likely to happen or not
Probability underpins everything in inferential statistics — confidence intervals, hypothesis tests, and machine learning models all rely on probability theory.
Sample Spaces and Events
Sample Space (S or Ω)
The set of ALL possible outcomes of an experiment.
Toss a coin: S = {Heads, Tails}
Roll a die: S = {1, 2, 3, 4, 5, 6}
Two coin tosses: S = {HH, HT, TH, TT}
Select one employee: S = {Priya, Raj, Meera, Arjun, Kavya}
Event
A subset of the sample space — one or more outcomes you're interested in.
Roll a die. Event A = "roll an even number"
A = {2, 4, 6} ⊆ S = {1, 2, 3, 4, 5, 6}
Event B = "roll a number greater than 4"
B = {5, 6}
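Because an event is just a subset of the sample space, both can be modelled with ordinary sets. The following is a minimal Python sketch of the die example above; the variable names are illustrative and not part of the chapter.

```python
# Sample space for one roll of a six-sided die
S = {1, 2, 3, 4, 5, 6}

# Events are subsets of S, built here by filtering the outcomes
A = {x for x in S if x % 2 == 0}   # "roll an even number"
B = {x for x in S if x > 4}        # "roll a number greater than 4"

print(sorted(A))        # [2, 4, 6]
print(sorted(B))        # [5, 6]
print(A <= S, B <= S)   # True True: both are subsets of S
```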
Defining Probability
Classical Probability (Equally Likely Outcomes)
P(A) = (number of outcomes in A) / (number of outcomes in S)
P(even number on die) = 3/6 = 0.5
P(roll a 6) = 1/6 ≈ 0.167
P(roll < 3) = 2/6 = 1/3 ≈ 0.333
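The counting formula above can be sketched in a few lines of Python. The helper name classical_probability is hypothetical, and the sketch assumes every outcome in the sample space is equally likely; Fraction is used so the results stay exact.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(A) = |A| / |S|, valid only when all outcomes are equally likely."""
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

S = set(range(1, 7))  # fair six-sided die

p_even = classical_probability({2, 4, 6}, S)
p_six = classical_probability({6}, S)
p_less_than_3 = classical_probability({1, 2}, S)

print(p_even, p_six, p_less_than_3)   # 1/2 1/6 1/3
```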
Relative Frequency (Empirical) Probability
P(A) ≈ (number of times A occurred) / (number of trials)
Flipped a coin 1,000 times → Heads appeared 487 times
P(Heads) ≈ 487/1000 = 0.487 (Law of Large Numbers: for a fair coin, the relative frequency approaches 0.5 as n → ∞)
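A short simulation can illustrate how a relative frequency settles down as the number of trials grows. This is only a sketch: it assumes a fair coin, uses Python's pseudo-random generator in place of real flips, and the function name estimate_heads_probability is made up for the example. The exact values printed depend on the seed.

```python
import random

def estimate_heads_probability(n_flips, seed=0):
    """Relative frequency of heads in n_flips simulated fair-coin flips."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# Larger samples should tend to land closer to 0.5
for n in (10, 1_000, 100_000):
    print(n, estimate_heads_probability(n))
```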
Subjective Probability
A probability assigned based on judgement or expertise, not counted data.
"I estimate a 70% chance this merger will be approved."
"Our model gives a 35% probability of default."
Probability Rules
Rule 1: Probability is Between 0 and 1
0 ≤ P(A) ≤ 1 for any event A
Rule 2: Sum of All Probabilities = 1
The total probability across all possible outcomes = 1
P(S) = 1
P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 6/6 = 1
Rule 3: Complement Rule
The complement of event A (written Aᶜ or Ā) = "A does not happen."
P(Aᶜ) = 1 − P(A)
P(not rolling a 6) = 1 − P(6) = 1 − 1/6 = 5/6
P(at least one head in 5 coin tosses):
Direct calculation is complex.
Complement: P(no heads) = (1/2)⁵ = 1/32
P(at least one head) = 1 − 1/32 = 31/32 ≈ 0.969
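The complement rule is especially useful for "at least one" problems, and for "at least N" problems when N is small, because the complement ("none", or "fewer than N") covers far fewer cases than the event itself.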
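To see that the shortcut agrees with the direct calculation, the five-toss example can be checked by listing all 32 sequences. This is a sketch that assumes a fair coin and independent tosses.

```python
from fractions import Fraction
from itertools import product

# All 2**5 = 32 equally likely sequences of five fair-coin tosses
outcomes = list(product("HT", repeat=5))

# Direct route: count every sequence containing at least one head
direct = Fraction(sum("H" in seq for seq in outcomes), len(outcomes))

# Complement route: 1 - P(no heads)
via_complement = 1 - Fraction(1, 2) ** 5

print(direct, via_complement)   # 31/32 31/32
```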
Types of Events
Mutually Exclusive (Disjoint) Events
Events that cannot happen simultaneously.
Rolling a die: Event A = {1,2}, Event B = {5,6}
A and B cannot both occur on one roll → mutually exclusive
A and B are NOT mutually exclusive:
A = {2,4,6} (even), B = {3,4,5,6} (> 2)
Both can happen if we roll a 4 or 6
Exhaustive Events
Events that together cover the entire sample space.
A = {1,2,3}, B = {4,5,6}
A ∪ B = {1,2,3,4,5,6} = S → A and B are exhaustive
Mutually exclusive AND exhaustive events form a partition of S.
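These definitions translate directly into set operations. The sketch below uses hypothetical helper names (mutually_exclusive, exhaustive, is_partition) and checks the die examples from this section.

```python
def mutually_exclusive(a, b):
    """True when the two events share no outcomes."""
    return a.isdisjoint(b)

def exhaustive(events, sample_space):
    """True when the events together cover the whole sample space."""
    return set().union(*events) == sample_space

def is_partition(events, sample_space):
    """True when the events are pairwise disjoint and exhaustive."""
    pairwise_disjoint = all(
        a.isdisjoint(b)
        for i, a in enumerate(events)
        for b in events[i + 1:]
    )
    return pairwise_disjoint and exhaustive(events, sample_space)

S = {1, 2, 3, 4, 5, 6}
print(mutually_exclusive({1, 2}, {5, 6}))           # True
print(mutually_exclusive({2, 4, 6}, {3, 4, 5, 6}))  # False
print(is_partition([{1, 2, 3}, {4, 5, 6}], S))      # True
```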
The Addition Rule
For Mutually Exclusive Events
P(A or B) = P(A) + P(B)
P(rolling 1 or 6) = P(1) + P(6) = 1/6 + 1/6 = 2/6 = 1/3
General Addition Rule (Non-Mutually Exclusive)
When A and B can overlap, adding them double-counts the overlap:
P(A or B) = P(A) + P(B) − P(A and B)
A = {2,4,6} (even), P(A) = 3/6
B = {3,4,5,6} (> 2), P(B) = 4/6
A ∩ B = {4,6}, P(A and B) = 2/6
P(A or B) = 3/6 + 4/6 − 2/6 = 5/6
Venn diagram intuition: P(A or B) is the total area covered by the two circles together. Adding P(A) and P(B) counts the overlap twice, so the intersection is subtracted once.
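The general addition rule can be verified on the die example by computing both sides separately. A small sketch, assuming a fair die; the helper p is illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}       # even
B = {3, 4, 5, 6}    # greater than 2

def p(event):
    """Classical probability on the fair-die sample space S."""
    return Fraction(len(event), len(S))

lhs = p(A | B)                   # P(A or B) counted directly from the union
rhs = p(A) + p(B) - p(A & B)     # general addition rule

print(lhs, rhs)   # 5/6 5/6
```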
The Multiplication Rule
Used when you need the probability of two events both occurring.
For Independent Events
Two events are independent if knowing one happened doesn't change the probability of the other.
P(A and B) = P(A) × P(B) [only if A and B are independent]
Two coin flips:
P(H on flip 1 AND H on flip 2) = P(H₁) × P(H₂) = 0.5 × 0.5 = 0.25
Rolling a die and flipping a coin:
P(6 AND Heads) = 1/6 × 1/2 = 1/12 ≈ 0.083
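One way to convince yourself that the product rule holds for the die-and-coin example is to build the joint sample space and count. The sketch below assumes a fair die and a fair coin, so all 12 pairs are equally likely.

```python
from fractions import Fraction
from itertools import product

# Joint sample space: 6 die faces x 2 coin sides = 12 equally likely pairs
S = set(product(range(1, 7), "HT"))

def p(event):
    """Classical probability on the joint sample space S."""
    return Fraction(len(event), len(S))

six = {(d, c) for d, c in S if d == 6}
heads = {(d, c) for d, c in S if c == "H"}

print(p(six & heads))       # 1/12
print(p(six) * p(heads))    # 1/12
```

The two numbers agree, which is exactly what independence of the two events means here.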
For Dependent Events (General Multiplication Rule)
P(A and B) = P(A) × P(B|A)
Where P(B|A) = probability of B given A has occurred (conditional probability — Chapter 8)
Drawing 2 aces from a deck without replacement:
P(Ace₁ AND Ace₂) = P(Ace₁) × P(Ace₂ | Ace₁)
= 4/52 × 3/51
= 12/2652
= 1/221
≈ 0.0045
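The two-aces result can be cross-checked by brute force. This sketch builds a standard 52-card deck (the rank and suit labels are arbitrary choices for the example), lists every 2-card hand, and compares the count with the general multiplication rule.

```python
from fractions import Fraction
from itertools import combinations

# General multiplication rule: P(Ace1) * P(Ace2 | Ace1)
by_rule = Fraction(4, 52) * Fraction(3, 51)

# Cross-check: enumerate every unordered 2-card hand from a 52-card deck
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(rank, suit) for rank in ranks for suit in "SHDC"]
hands = list(combinations(deck, 2))
both_aces = sum(1 for c1, c2 in hands if c1[0] == "A" and c2[0] == "A")
by_counting = Fraction(both_aces, len(hands))

print(by_rule, by_counting)   # 1/221 1/221
```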
Probability Trees
A tree diagram maps out all possible outcomes and probabilities systematically.
Loan application (two stages: credit check, income verification):
Credit Check:
P(Pass) = 0.7
P(Fail) = 0.3
Income Verification (if passed credit check):
P(Pass | Credit Pass) = 0.8
P(Fail | Credit Pass) = 0.2
If failed credit check → automatic rejection (no income check)
Tree:
Credit Check     Income Check     Outcome     P
Pass (0.7) ──┬── Pass (0.8) ───── Approved    0.7 × 0.8 = 0.56
             └── Fail (0.2) ───── Rejected    0.7 × 0.2 = 0.14
Fail (0.3) ────────────────────── Rejected    0.3 × 1.0 = 0.30
Total: 0.56 + 0.14 + 0.30 = 1.00 ✓
P(Approved) = 0.56 (56%)
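The tree calculation follows two steps: multiply along each path, then add across paths that end in the same outcome. A minimal sketch of the loan example, where the list-of-leaves representation is just one possible way to encode the tree:

```python
# Each leaf of the tree: (branch probabilities along the path, outcome label)
leaves = [
    ((0.7, 0.8), "Approved"),   # credit pass, income pass
    ((0.7, 0.2), "Rejected"),   # credit pass, income fail
    ((0.3,),     "Rejected"),   # credit fail, no income check
]

totals = {}
for branch_probs, outcome in leaves:
    path_prob = 1.0
    for p in branch_probs:
        path_prob *= p                                      # multiply along the path
    totals[outcome] = totals.get(outcome, 0.0) + path_prob  # add across paths

print({k: round(v, 2) for k, v in totals.items()})
# {'Approved': 0.56, 'Rejected': 0.44}
```

The leaf probabilities sum to 1, which is a quick sanity check that no branch has been left out.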
Counting: Combinations and Permutations
When calculating probability, you often need to count outcomes.
Permutations — Order Matters
Arranging r items from n distinct items, order matters:
P(n, r) = n! / (n−r)!
How many ways to arrange 3 people from a group of 5 in order (1st, 2nd, 3rd)?
P(5,3) = 5! / (5−3)! = 5! / 2! = 120 / 2 = 60 ways
Combinations — Order Doesn't Matter
Choosing r items from n items, order doesn't matter:
C(n, r) = n! / (r! × (n−r)!) also written as ⁿCᵣ or (n choose r)
How many ways to select a committee of 3 from 5 people?
C(5,3) = 5! / (3! × 2!) = 120 / (6×2) = 10 ways
Probability that a specific 3-person committee is selected (from random draw):
P = 1 / C(5,3) = 1/10 = 0.10
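For small cases, both formulas can be checked by listing the arrangements. The sketch below uses placeholder labels for the five people and compares the enumerated counts with math.perm and math.comb (available in Python 3.8 and later).

```python
from itertools import combinations, permutations
from math import comb, perm

people = ["P1", "P2", "P3", "P4", "P5"]   # placeholder labels

ordered = list(permutations(people, 3))    # (1st, 2nd, 3rd) line-ups
unordered = list(combinations(people, 3))  # committees

print(len(ordered), perm(5, 3))     # 60 60
print(len(unordered), comb(5, 3))   # 10 10

# Each committee can be ordered in 3! = 6 ways, so 60 / 6 = 10
print(len(ordered) // len(unordered))   # 6
```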
Lottery Example
Lottery: pick 6 numbers from 1–49
Total combinations: C(49,6) = 49! / (6! × 43!) = 13,983,816
P(winning) = 1 / 13,983,816 ≈ 0.0000000715 (7.15 × 10⁻⁸)
→ Probability of winning = about 1 in 14 million
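With nearly 14 million combinations, listing outcomes is no longer practical, so the closed-form count is the sensible route. A short sketch, assuming a single ticket and that every 6-number selection is equally likely:

```python
from math import comb

total_tickets = comb(49, 6)      # number of distinct 6-number selections
p_win = 1 / total_tickets

print(f"{total_tickets:,}")      # 13,983,816
print(f"{p_win:.3e}")            # 7.151e-08
```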
Practical Examples
Example 1: Loan Default Risk
A bank's historical data:
- P(applicant defaults) = 0.04 (4%)
- P(no default) = 0.96
For a portfolio of 3 independent loans:
P(all 3 default) = 0.04 × 0.04 × 0.04 = 0.000064 (0.0064%)
P(at least one defaults) = 1 − P(none default) = 1 − 0.96³ = 1 − 0.885 = 0.115 (11.5%)
→ Even with a 4% individual default rate, there is an 11.5% chance of at least one default
Example 2: Diagnostic Testing
Medical test for a disease:
P(positive test | have disease) = 0.95 (sensitivity)
P(positive test | no disease) = 0.10 (false positive rate)
Test 3 independent patients, all without the disease:
P(all test negative) = 0.90 × 0.90 × 0.90 = 0.729
P(at least one false positive) = 1 − 0.729 = 0.271
→ 27% chance of at least one false positive when testing 3 disease-free patients
Example 3: Quality Control
Production line: 2% of items are defective (P(defective) = 0.02)
Items are inspected independently.
Quality control samples 10 items:
P(no defectives) = (0.98)^10 = 0.817 (81.7%)
P(at least one defective) = 1 − 0.817 = 0.183 (18.3%)
→ About 18% of 10-item batches will contain at least one defective
→ Chapter 9 (Binomial distribution) generalises this calculation
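The "at least one" calculations in Examples 1 to 3 all follow the same pattern, so they can share one helper. The function name p_at_least_one is hypothetical, and the sketch assumes independent trials that all have the same probability.

```python
def p_at_least_one(p_single, n):
    """P(at least one occurrence in n independent trials), via the complement."""
    return 1 - (1 - p_single) ** n

print(round(p_at_least_one(0.04, 3), 3))    # 0.115  (loan defaults)
print(round(p_at_least_one(0.10, 3), 3))    # 0.271  (false positives)
print(round(p_at_least_one(0.02, 10), 3))   # 0.183  (defective items)
```

Note how quickly the probability grows with n even when the single-trial probability is small.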
Example 4: Investment Outcomes
Three independent projects:
Project A: P(success) = 0.8
Project B: P(success) = 0.6
Project C: P(success) = 0.7
P(all three succeed) = 0.8 × 0.6 × 0.7 = 0.336
P(none succeed) = 0.2 × 0.4 × 0.3 = 0.024
P(at least one succeeds) = 1 − 0.024 = 0.976
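When the events have different probabilities, the same logic applies, but the product runs over each probability individually. A sketch of the three-project example, assuming the projects are independent (math.prod requires Python 3.8 or later):

```python
from math import prod

p_success = {"A": 0.8, "B": 0.6, "C": 0.7}

p_all = prod(p_success.values())                    # every project succeeds
p_none = prod(1 - p for p in p_success.values())    # every project fails
p_at_least_one = 1 - p_none                         # complement rule

print(round(p_all, 3), round(p_none, 3), round(p_at_least_one, 3))
# 0.336 0.024 0.976
```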
Common Mistakes
1. Adding the probabilities of events that are not mutually exclusive
P(Finance) = 0.32, P(salary > 80k) = 0.45
P(Finance OR salary > 80k) ≠ 0.32 + 0.45 = 0.77 ← WRONG
Need to subtract the overlap:
P(Finance AND salary > 80k) = 0.15 (Finance employees over 80k)
P(Finance OR salary > 80k) = 0.32 + 0.45 − 0.15 = 0.62 ← CORRECT
2. Multiplying dependent events as if independent
Draw 2 cards from a deck without replacement:
Wrong: P(2 hearts) = 13/52 × 13/52 = 0.0625 (assumes replacement)
Right: P(2 hearts) = 13/52 × 12/51 = 0.0588 (without replacement — dependent)
3. Gambler's Fallacy
"I've flipped Tails 5 times in a row — Heads is 'due'."
This reasoning is wrong: P(Heads on flip 6) is still 0.5.
Each flip is independent. The coin has no memory.
The Law of Large Numbers says proportions converge in the LONG run — not the short run.
4. Confusing P(A and B) with P(A or B)
P(A and B) = probability of BOTH happening (never larger than P(A) or P(B))
P(A or B) = probability of AT LEAST ONE happening (never smaller than P(A) or P(B))
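The gambler's fallacy (mistake 3) can be tested empirically. The sketch below simulates a long sequence of fair-coin flips, picks out every flip that immediately follows five Tails in a row, and measures how often it comes up Heads. It assumes a fair coin and uses a seeded pseudo-random generator; the exact figures depend on the seed.

```python
import random

rng = random.Random(42)   # seeded so the run is repeatable
flips = [rng.random() < 0.5 for _ in range(200_000)]   # True = Heads

# Collect the flip that follows every run of five consecutive Tails
after_streak = [
    flips[i]
    for i in range(5, len(flips))
    if not any(flips[i - 5:i])
]

print(len(after_streak))                       # number of qualifying streaks
print(sum(after_streak) / len(after_streak))   # share of Heads, close to 0.5
```

If Heads really were "due", the second number would sit well above 0.5. It does not.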
Practice Exercises
- A card is drawn from a standard 52-card deck. Find: a) P(red card) b) P(face card) c) P(red OR face card) (these are not mutually exclusive)
- Two dice are rolled. Find P(sum = 7).
- A company has 10 applicants (6 experienced, 4 junior). If 3 are selected randomly, what is P(all 3 are experienced)?
- A disease affects 1% of the population. A test has a 95% sensitivity and 5% false positive rate. If you test 100 disease-free people, what is P(at least one false positive)?
- A basket contains 5 red and 3 blue balls. Two balls are drawn without replacement. Find: a) P(both red) b) P(one red, one blue) c) P(at least one red)
Summary
In this chapter you learned:
- Probability — a number in [0,1] measuring the likelihood of an event
- Sample space (S): all possible outcomes; Event: a subset of S
- Classical probability: P(A) = favourable outcomes / total outcomes (equally likely)
- Empirical probability: P(A) ≈ observed frequency / total trials
- Complement rule: P(Aᶜ) = 1 − P(A); use for "at least one" problems
- Mutually exclusive events: P(A and B) = 0; can't both happen
- Addition rule: P(A or B) = P(A) + P(B) − P(A and B); for mutually exclusive events, drop the last term
- Multiplication rule (independent): P(A and B) = P(A) × P(B)
- Multiplication rule (general): P(A and B) = P(A) × P(B|A)
- Permutations — order matters: n!/(n−r)!; Combinations — order doesn't: n!/(r!(n−r)!)
- Gambler's fallacy: independent events have no memory
Next up: Conditional Probability & Bayes' Theorem — how new information updates our probability estimates.