Modules & Packages

What Are Modules?

As your programs grow beyond a few dozen lines, keeping everything in a single file becomes unmanageable. Python solves this with modules — a way to split your code into separate, reusable files.

A module is simply any file with a .py extension. Every Python file you have ever written is already a module.

Why Modules Exist

Modules address several fundamental programming challenges:

Benefit	Description
Code Organisation	Break large programs into logical, manageable files
Reusability	Write a function once, use it in many programs
Namespace Management	Each module has its own namespace, preventing name collisions
Collaboration	Team members can work on different modules simultaneously
Maintenance	Easier to find, fix, and update code in smaller files
Testing	Test individual modules in isolation

How Python Sees Modules

When you write a file called greetings.py, Python treats it as a module named greetings (without the .py extension). Any functions, classes, variables, or constants you define inside that file become attributes of that module.

# greetings.py — this file IS a module named "greetings"

DEFAULT_GREETING = "Hello"

def say_hello(name):
    """Return a greeting string."""
    return f"{DEFAULT_GREETING}, {name}!"

def say_goodbye(name):
    """Return a farewell string."""
    return f"Goodbye, {name}. See you soon!"

class Greeter:
    """A class that manages personalised greetings."""
    def __init__(self, greeting="Hi"):
        self.greeting = greeting

    def greet(self, name):
        return f"{self.greeting}, {name}!"

Everything in this file — the constant DEFAULT_GREETING, the functions say_hello and say_goodbye, and the class Greeter — can now be imported and used by other Python files.

Creating Your Own Modules

Creating a module is as simple as creating a .py file. There are no special declarations needed.

What Can Go in a Module

A module can contain any valid Python code:

# math_utils.py — A utility module for math operations

# ---- Constants ----
PI = 3.14159265358979
E = 2.71828182845905
GOLDEN_RATIO = 1.61803398874989

# ---- Variables ----
_calculation_count = 0  # leading underscore = "private by convention"

# ---- Functions ----
def add(a, b):
    """Return the sum of two numbers."""
    global _calculation_count
    _calculation_count += 1
    return a + b

def multiply(a, b):
    """Return the product of two numbers."""
    global _calculation_count
    _calculation_count += 1
    return a * b

def circle_area(radius):
    """Calculate the area of a circle."""
    return PI * radius ** 2

def factorial(n):
    """Return n! using recursion."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)

def get_calculation_count():
    """Return how many calculations have been performed."""
    return _calculation_count

# ---- Classes ----
class Vector:
    """A simple 2D vector class."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def magnitude(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

Module-Level Code Execution

Any code at the top level of a module runs when the module is first imported, including print calls and any other side effects:

# config.py
print("Loading config module...")  # This runs on import!

DATABASE_URL = "postgresql://localhost/mydb"
DEBUG = True

def get_config():
    return {"db": DATABASE_URL, "debug": DEBUG}

print("Config module loaded!")  # This also runs on import!

# main.py
import config  # This triggers the print statements in config.py
# Output:
# Loading config module...
# Config module loaded!

print(config.DATABASE_URL)  # postgresql://localhost/mydb

Python caches imported modules in sys.modules, so the top-level code runs only once per interpreter session, even if several files import the module.
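The caching behaviour can be checked directly. The following is a minimal sketch, not a pattern for production code: it writes a throwaway module to a temporary directory (the name cache_demo_config is hypothetical), imports it twice, and then forces a re-run with importlib.reload.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "cache_demo_config"  # hypothetical name used only for this sketch
SOURCE = 'print("Loading config module...")\nDEBUG = True\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, f"{MODULE_NAME}.py").write_text(SOURCE, encoding="utf-8")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            first = importlib.import_module(MODULE_NAME)   # runs the top-level code
            second = importlib.import_module(MODULE_NAME)  # served from sys.modules
            was_cached = MODULE_NAME in sys.modules
            importlib.reload(first)                        # runs the top-level code again
    finally:
        # Undo the changes made for the demonstration
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

load_count = output.getvalue().count("Loading config module...")
print(first is second)  # True
print(was_cached)       # True
print(load_count)       # 2 (one import plus one explicit reload)
```

The second import returns the very same module object, which is why state stored at module level is shared by every file that imports it.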

Importing Modules

Python provides several ways to import modules, each suited to different situations.

`import module`

The most straightforward approach — import the entire module:

import math_utils

result = math_utils.add(5, 3)        # 8
area = math_utils.circle_area(10)    # 314.159...
v = math_utils.Vector(3, 4)
print(v.magnitude())                  # 5.0
print(math_utils.PI)                  # 3.14159265358979

Pros: Clear where each name comes from. No name collisions.
Cons: Verbose if you use many items from the module.

`from module import item`

Import specific items directly into your namespace:

from math_utils import add, multiply, PI

result = add(5, 3)       # 8 — no prefix needed
product = multiply(4, 2)  # 8
print(PI)                  # 3.14159265358979

Pros: Concise. Only import what you need.
Cons: Can cause name collisions if two modules export the same name.

`import module as alias`

Give a module a shorter name:

import math_utils as mu
import datetime as dt

result = mu.add(5, 3)
now = dt.datetime.now()

This is extremely common in the Python ecosystem. Many libraries have conventional aliases:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

`from module import item as alias`

Rename specific imports:

from math_utils import circle_area as area
from math_utils import factorial as fact

print(area(5))    # 78.539...
print(fact(10))   # 3628800

This is useful when two modules export items with the same name:

from math_utils import add as math_add
from string_utils import add as string_add  # hypothetical

math_add(5, 3)          # numeric addition
string_add("hi", "!")   # string concatenation

`from module import *` (Avoid This)

Import everything from a module:

from math_utils import *

# Now add, multiply, PI, Vector, etc. are all in your namespace
print(add(5, 3))

Why you should avoid this:

Name collisions — You might overwrite existing names without realising
Unclear origin — Hard to tell where a function came from when reading code
Maintenance headache — Adding new names to the module can silently break your code

# Dangerous example
from math import *
from cmath import *   # Overwrites sqrt, log, etc. from math!

# Which sqrt is this? The real or complex version?
print(sqrt(4))  # cmath.sqrt — returns (2+0j), not 2.0!

The one acceptable use is in interactive sessions (Python REPL) for quick exploration.
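For comparison, here is a small sketch of the same two modules imported by name. Because every call is qualified, the real and complex versions of sqrt can coexist without either one shadowing the other.

```python
import cmath
import math

real_root = math.sqrt(4)         # float result
complex_root = cmath.sqrt(4)     # complex result
negative_root = cmath.sqrt(-4)   # math.sqrt(-4) would raise ValueError

print(real_root)      # 2.0
print(complex_root)   # (2+0j)
print(negative_root)  # 2j
```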

Import Search Path (`sys.path`)

When you write import something, Python searches for the module in a specific order:

The directory containing the script being run (or the current directory in an interactive session)
PYTHONPATH environment variable directories (if set)
Standard library directories
Site-packages (where pip installs third-party packages)

You can inspect and modify this search path:

import sys

# View the search path
for path in sys.path:
    print(path)
# Output (example):
# /home/user/my_project        (directory of the script)
# /usr/lib/python3.12
# /usr/lib/python3.12/lib-dynload
# /home/user/.local/lib/python3.12/site-packages

# Add a custom directory to the search path
sys.path.append("/home/user/my_libraries")

# Now Python will also look in /home/user/my_libraries
import my_custom_module  # Found in the appended path
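To see which file the search would pick for a given name without actually importing it, importlib.util.find_spec can be used. This is a rough sketch; locate is a hypothetical helper and only handles top-level module names.

```python
import importlib.util
import sys

def locate(module_name):
    """Return where a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin

print(sys.path[0])                        # the first location searched
print(locate("json"))                     # a path inside the standard library
print(locate("no_such_module_for_demo"))  # None
```

A check like this is handy when a local file accidentally shadows a standard library module with the same name.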

The `if __name__ == "__main__"` Pattern

Almost every Python file that is both a script and an importable module uses this pattern, so it is worth understanding precisely.

What Is `__name__`?

Every module has a built-in attribute called __name__. Its value depends on how the module is being used:

If the file is run directly (e.g., python my_script.py), __name__ is set to "__main__"
If the file is imported by another file, __name__ is set to the module name (e.g., "my_script")

# demo.py
print(f"__name__ is: {__name__}")

# Run directly
$ python demo.py
__name__ is: __main__

# other.py
import demo
# Output: __name__ is: demo
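Both cases can be reproduced from a single program. The sketch below assumes a throwaway file named name_demo_module.py (a hypothetical name) and uses runpy.run_path to simulate running it directly, then imports the same file normally.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "name_demo_module"  # hypothetical file name used only for this sketch

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, f"{MODULE_NAME}.py")
    script.write_text("SEEN_NAME = __name__\n", encoding="utf-8")

    # Simulates: python name_demo_module.py
    as_script = runpy.run_path(str(script), run_name="__main__")["SEEN_NAME"]

    # Simulates: import name_demo_module
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        as_import = importlib.import_module(MODULE_NAME).SEEN_NAME
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print(as_script)  # __main__
print(as_import)  # name_demo_module
```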

The Guard Pattern

This lets you write code that only runs when the file is executed directly:

# calculator.py
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

def subtract(a, b):
    """Return the difference of two numbers."""
    return a - b

def multiply(a, b):
    """Return the product of two numbers."""
    return a * b

def divide(a, b):
    """Return the quotient. Raises ValueError if b is zero."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

# This block ONLY runs when you do: python calculator.py
# It does NOT run when another file does: import calculator
if __name__ == "__main__":
    # Test our functions
    print("Testing calculator functions:")
    print(f"add(10, 5) = {add(10, 5)}")           # 15
    print(f"subtract(10, 5) = {subtract(10, 5)}")  # 5
    print(f"multiply(10, 5) = {multiply(10, 5)}")  # 50
    print(f"divide(10, 5) = {divide(10, 5)}")      # 2.0

    # Test error handling
    try:
        divide(10, 0)
    except ValueError as e:
        print(f"Caught error: {e}")  # Caught error: Cannot divide by zero

Now when another file imports calculator, only the functions are available — the test code does not execute:

# main.py
from calculator import add, divide

print(add(100, 200))    # 300
print(divide(100, 4))   # 25.0
# No test output appears!

Practical Patterns

Pattern 1: Module with a CLI interface

# word_counter.py
import sys

def count_words(text):
    """Count words in a string."""
    return len(text.split())

def count_lines(text):
    """Count lines in a string."""
    return len(text.splitlines())

def analyze_text(text):
    """Return a dictionary of text statistics."""
    return {
        "words": count_words(text),
        "lines": count_lines(text),
        "characters": len(text),
        "characters_no_spaces": len(text.replace(" ", "")),
    }

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python word_counter.py <filename>")
        sys.exit(1)

    filename = sys.argv[1]
    with open(filename, "r", encoding="utf-8") as f:
        content = f.read()

    stats = analyze_text(content)
    for key, value in stats.items():
        print(f"{key}: {value}")

Pattern 2: Quick demo / documentation

# shapes.py
import math

def circle_area(radius):
    return math.pi * radius ** 2

def rectangle_area(width, height):
    return width * height

def triangle_area(base, height):
    return 0.5 * base * height

if __name__ == "__main__":
    # Serve as live documentation / examples
    print("=== Shape Area Calculator ===")
    print(f"Circle (r=5): {circle_area(5):.2f}")
    print(f"Rectangle (4x6): {rectangle_area(4, 6):.2f}")
    print(f"Triangle (b=3, h=8): {triangle_area(3, 8):.2f}")

Packages

As your project grows, you need to organise modules into directories. A package is a directory that contains modules and a special __init__.py file.

Directory Structure

my_project/
├── main.py
└── utils/                  # This is a package
    ├── __init__.py         # Makes "utils" a package
    ├── math_helpers.py     # A module in the package
    ├── string_helpers.py   # Another module
    └── file_helpers.py     # Another module

The `__init__.py` File

The __init__.py file serves several purposes:

Marks the directory as a regular package (required in Python 3.2 and earlier; from Python 3.3 a directory without it becomes a namespace package, so including it is still recommended)
Runs when the package is imported — initialisation code goes here
Controls the package's public API — define what from package import * exports

# utils/__init__.py

# Import key items so users can access them directly from the package
from .math_helpers import add, multiply, circle_area
from .string_helpers import capitalize_words, slugify
from .file_helpers import read_file, write_file

# Define what "from utils import *" exports
__all__ = [
    "add", "multiply", "circle_area",
    "capitalize_words", "slugify",
    "read_file", "write_file",
]

# Package metadata
__version__ = "1.0.0"
__author__ = "Your Name"

Now users can import cleanly:

# Thanks to __init__.py, these all work:
from utils import add, capitalize_words
from utils import __version__

# Instead of the longer:
from utils.math_helpers import add
from utils.string_helpers import capitalize_words

An empty __init__.py is also perfectly valid — it simply marks the directory as a package without any extra setup.
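The effect of re-exporting names in __init__.py can be tried out without creating a project by hand. This is a minimal sketch that builds a one-module package in a temporary directory; the package name demo_utils_pkg is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE = "demo_utils_pkg"  # hypothetical package name used only for this sketch

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, PACKAGE)
    pkg_dir.mkdir()
    (pkg_dir / "math_helpers.py").write_text(
        "def add(a, b):\n    return a + b\n", encoding="utf-8"
    )
    # __init__.py re-exports add and declares the public API
    (pkg_dir / "__init__.py").write_text(
        "from .math_helpers import add\n"
        "__all__ = ['add']\n"
        "__version__ = '1.0.0'\n",
        encoding="utf-8",
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        pkg = importlib.import_module(PACKAGE)
        result = pkg.add(2, 3)       # reachable at package level
        version = pkg.__version__
        loaded = sorted(name for name in sys.modules if name.startswith(PACKAGE))
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n.startswith(PACKAGE)]:
            del sys.modules[name]

print(result)   # 5
print(version)  # 1.0.0
print(loaded)   # ['demo_utils_pkg', 'demo_utils_pkg.math_helpers']
```

Importing the package ran __init__.py, which in turn loaded the submodule it re-exports from.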

Nested Packages (Sub-packages)

Packages can contain other packages:

my_project/
├── main.py
└── mylib/
    ├── __init__.py
    ├── core/
    │   ├── __init__.py
    │   ├── engine.py
    │   └── config.py
    ├── utils/
    │   ├── __init__.py
    │   ├── math_helpers.py
    │   └── string_helpers.py
    └── io/
        ├── __init__.py
        ├── readers.py
        └── writers.py

# Importing from nested packages
from mylib.core.engine import Engine
from mylib.utils.math_helpers import add
from mylib.io.readers import read_csv

Relative Imports

Inside a package, you can use relative imports to refer to sibling modules or parent packages:

# mylib/utils/string_helpers.py

# Relative import from the same package (utils/)
from .math_helpers import add            # same directory
from . import math_helpers               # same directory, whole module

# Relative import from parent package (mylib/)
from ..core.config import DATABASE_URL   # up one level, then into core/
from ..core import engine                # up one level, then into core/

Relative import syntax:

. means "current package"
.. means "parent package"
... means "grandparent package"

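Important: Relative imports only work inside packages. A file run directly (python mylib/utils/string_helpers.py) has no parent package, so its relative imports fail with ImportError. Run it as a module instead (python -m mylib.utils.string_helpers) from the directory that contains mylib.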
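The difference between running a file directly and running it as part of a package can be sketched with runpy, which offers run_path (similar to python file.py) and run_module (similar to python -m package.module). The package name rel_demo_pkg and its files are hypothetical.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

PACKAGE = "rel_demo_pkg"  # hypothetical package name used only for this sketch

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, PACKAGE)
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("", encoding="utf-8")
    (pkg_dir / "helpers.py").write_text("VALUE = 42\n", encoding="utf-8")
    tool = pkg_dir / "tool.py"
    tool.write_text(
        "from .helpers import VALUE\nRESULT = VALUE + 1\n", encoding="utf-8"
    )

    # Similar to: python rel_demo_pkg/tool.py (no parent package is known)
    try:
        runpy.run_path(str(tool), run_name="__main__")
        script_error = None
    except ImportError as exc:
        script_error = str(exc)

    # Similar to: python -m rel_demo_pkg.tool (run from the directory holding the package)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module_result = runpy.run_module(f"{PACKAGE}.tool", run_name="__main__")["RESULT"]
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n.startswith(PACKAGE)]:
            del sys.modules[name]

print(script_error)   # attempted relative import with no known parent package
print(module_result)  # 43
```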

The Python Standard Library

Python is often described as "batteries included": it ships with a large standard library covering everything from math to networking to file compression, with no installation needed.

Overview by Category

Category	Key Modules
Math & Numbers	`math`, `decimal`, `fractions`, `statistics`
Data Structures	`collections`, `heapq`, `bisect`, `array`
Text Processing	`string`, `re`, `textwrap`, `difflib`
Date & Time	`datetime`, `time`, `calendar`
File & I/O	`os`, `pathlib`, `shutil`, `glob`, `tempfile`
Data Formats	`json`, `csv`, `xml`, `configparser`
Functional	`itertools`, `functools`, `operator`
System	`sys`, `os`, `platform`, `subprocess`
Concurrency	`threading`, `multiprocessing`, `asyncio`
Networking	`urllib`, `http`, `socket`, `email`
Type System	`typing`, `dataclasses`, `abc`
Debugging	`logging`, `pdb`, `traceback`, `warnings`
Testing	`unittest`, `doctest`
Compression	`zipfile`, `gzip`, `tarfile`
Cryptography	`hashlib`, `secrets`, `hmac`
Copying	`copy`

Let us now explore the most important modules in depth.

`math` — Mathematical Functions

The math module provides access to mathematical functions defined by the C standard.

import math

Constants:

print(math.pi)     # 3.141592653589793
print(math.e)      # 2.718281828459045
print(math.tau)    # 6.283185307179586 (2 * pi)
print(math.inf)    # inf (positive infinity)
print(math.nan)    # nan (not a number)

Rounding and Absolute Value:

print(math.ceil(4.2))     # 5  — round up
print(math.ceil(-4.2))    # -4
print(math.floor(4.8))    # 4  — round down
print(math.floor(-4.8))   # -5
print(math.trunc(4.8))    # 4  — truncate toward zero
print(math.trunc(-4.8))   # -4
print(math.fabs(-5.5))    # 5.5 — absolute value (always float)

Powers, Roots, and Logarithms:

print(math.sqrt(16))       # 4.0
print(math.sqrt(2))        # 1.4142135623730951
print(math.pow(2, 10))     # 1024.0 (always returns float)
print(math.log(100, 10))   # 2.0 (log base 10 of 100)
print(math.log(math.e))    # 1.0 (natural log)
print(math.log2(1024))     # 10.0
print(math.log10(1000))    # 3.0
print(math.isqrt(10))      # 3 (integer square root)

Factorials and Combinatorics:

print(math.factorial(5))   # 120 (5! = 5 * 4 * 3 * 2 * 1)
print(math.factorial(10))  # 3628800
print(math.comb(10, 3))    # 120 (10 choose 3; Python 3.8+)
print(math.perm(10, 3))    # 720 (permutations of 3 from 10; Python 3.8+)
print(math.gcd(48, 18))    # 6 (greatest common divisor)
print(math.lcm(12, 18))    # 36 (least common multiple; Python 3.9+)
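These functions are related by simple identities, which makes a quick sanity check possible. The sketch below recomputes comb and perm from factorial and checks the gcd/lcm product rule; it assumes Python 3.9 or later because of math.lcm.

```python
import math

n, k = 10, 3

# C(n, k) = n! / (k! * (n - k)!)
comb_by_hand = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
# P(n, k) = n! / (n - k)!
perm_by_hand = math.factorial(n) // math.factorial(n - k)

print(comb_by_hand == math.comb(n, k))  # True (both are 120)
print(perm_by_hand == math.perm(n, k))  # True (both are 720)

# For positive integers: gcd(a, b) * lcm(a, b) == a * b
a, b = 48, 18
print(math.gcd(a, b) * math.lcm(a, b) == a * b)  # True (6 * 144 == 864)
```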

Trigonometric Functions (angles in radians):

# Convert degrees to radians and back
print(math.radians(180))   # 3.141592653589793
print(math.degrees(math.pi))  # 180.0

# Trigonometric functions
print(math.sin(math.pi / 2))   # 1.0
print(math.cos(0))              # 1.0
print(math.tan(math.pi / 4))   # 0.9999999999999999 (approx 1)

# Inverse trigonometric functions
print(math.asin(1))    # 1.5707963... (pi/2)
print(math.acos(0))    # 1.5707963... (pi/2)
print(math.atan(1))    # 0.7853981... (pi/4)
print(math.atan2(1, 1))  # 0.7853981... (pi/4) — handles quadrants

Special Value Checks:

print(math.isnan(float("nan")))   # True
print(math.isnan(42))              # False
print(math.isinf(float("inf")))   # True
print(math.isinf(42))              # False
print(math.isfinite(42))           # True
print(math.isfinite(float("inf")))  # False
print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9))  # True

`random` — Random Number Generation

The random module generates pseudo-random numbers for various distributions.

import random

Basic Random Numbers:

# Random float between 0.0 and 1.0
print(random.random())        # e.g., 0.7431448254356782

# Random float in a range
print(random.uniform(1.0, 10.0))  # e.g., 6.234...

# Random integer in a range (inclusive on both ends)
print(random.randint(1, 100))     # e.g., 42

# Random integer in a range (exclusive upper bound)
print(random.randrange(0, 100, 5))  # Random multiple of 5: 0, 5, 10, ... 95

Choosing from Sequences:

colors = ["red", "green", "blue", "yellow", "purple"]

# Pick one random item
print(random.choice(colors))   # e.g., "blue"

# Pick multiple WITH replacement (items can repeat)
print(random.choices(colors, k=3))  # e.g., ["red", "red", "blue"]

# Pick multiple WITHOUT replacement (no repeats)
print(random.sample(colors, k=3))   # e.g., ["green", "purple", "red"]

# Weighted random choices
fruits = ["apple", "banana", "cherry"]
weights = [50, 30, 20]  # apple is most likely
picks = random.choices(fruits, weights=weights, k=10)
print(picks)  # Mostly apples, some bananas, few cherries

Shuffling:

deck = list(range(1, 53))  # A deck of 52 cards
random.shuffle(deck)        # Shuffle in place
print(deck[:5])             # e.g., [34, 7, 51, 22, 3]

Reproducible Results with Seeds:

random.seed(42)
print(random.randint(1, 100))  # Always 82 with seed 42
print(random.randint(1, 100))  # Always 15

random.seed(42)  # Reset the seed
print(random.randint(1, 100))  # 82 again — same sequence!
print(random.randint(1, 100))  # 15 again
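Seeding the module-level generator affects every piece of code that uses random. One alternative, sketched here, is to create separate random.Random instances, each with its own seed and state. Note that this generator is not suitable for passwords or tokens; the secrets module exists for that purpose.

```python
import random

rng_a = random.Random(42)
rng_b = random.Random(42)

rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]
print(rolls_a == rolls_b)  # True: same seed, same sequence

# Drawing from the module-level generator does not disturb the instances
random.random()
next_a = rng_a.randint(1, 6)
next_b = rng_b.randint(1, 6)
print(next_a == next_b)    # True
```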

Statistical Distributions:

# Gaussian (normal) distribution — mean=0, std_dev=1
print(random.gauss(0, 1))     # e.g., -0.234...

# Generate 1000 samples to see the distribution
samples = [random.gauss(100, 15) for _ in range(1000)]
avg = sum(samples) / len(samples)
print(f"Average: {avg:.1f}")  # Close to 100

`datetime` — Dates and Times

The datetime module supplies classes for manipulating dates and times.

from datetime import datetime, date, time, timedelta

Getting the Current Date and Time:

now = datetime.now()
print(now)                # 2026-03-24 14:30:45.123456
print(now.year)           # 2026
print(now.month)          # 3
print(now.day)            # 24
print(now.hour)           # 14
print(now.minute)         # 30
print(now.second)         # 45
print(now.weekday())      # 1 (0=Monday, 6=Sunday)

today = date.today()
print(today)              # 2026-03-24

Creating Specific Dates and Times:

# Create a date
birthday = date(1995, 8, 15)
print(birthday)           # 1995-08-15

# Create a time
alarm = time(7, 30, 0)
print(alarm)              # 07:30:00

# Create a full datetime
event = datetime(2026, 12, 31, 23, 59, 59)
print(event)              # 2026-12-31 23:59:59

Formatting Dates (strftime):

now = datetime.now()

print(now.strftime("%Y-%m-%d"))           # 2026-03-24
print(now.strftime("%d/%m/%Y"))           # 24/03/2026
print(now.strftime("%B %d, %Y"))          # March 24, 2026
print(now.strftime("%I:%M %p"))           # 02:30 PM
print(now.strftime("%A, %B %d, %Y"))      # Tuesday, March 24, 2026
print(now.strftime("%d %b %Y %H:%M"))     # 24 Mar 2026 14:30

Code	Meaning	Example
`%Y`	4-digit year	2026
`%m`	Month (zero-padded)	03
`%d`	Day (zero-padded)	24
`%H`	Hour (24-hour)	14
`%I`	Hour (12-hour)	02
`%M`	Minute	30
`%S`	Second	45
`%p`	AM/PM	PM
`%A`	Full weekday	Tuesday
`%a`	Short weekday	Tue
`%B`	Full month	March
`%b`	Short month	Mar

Parsing Date Strings (strptime):

date_str = "24-03-2026"
parsed = datetime.strptime(date_str, "%d-%m-%Y")
print(parsed)             # 2026-03-24 00:00:00

date_str2 = "March 24, 2026 02:30 PM"
parsed2 = datetime.strptime(date_str2, "%B %d, %Y %I:%M %p")
print(parsed2)            # 2026-03-24 14:30:00

Date Arithmetic with timedelta:

now = datetime.now()

# Add or subtract time
tomorrow = now + timedelta(days=1)
next_week = now + timedelta(weeks=1)
two_hours_later = now + timedelta(hours=2)
last_month_approx = now - timedelta(days=30)

print(f"Tomorrow: {tomorrow.strftime('%Y-%m-%d')}")
print(f"Next week: {next_week.strftime('%Y-%m-%d')}")

# Difference between dates
new_years_eve = datetime(2026, 12, 31)
diff = new_years_eve - now
print(f"{diff.days} days until New Year's Eve")
print(f"That's about {diff.days // 7} weeks")

Timezone Basics:

from datetime import timezone

# UTC time
utc_now = datetime.now(timezone.utc)
print(utc_now)  # 2026-03-24 09:30:45.123456+00:00

# Create a timezone offset
ist = timezone(timedelta(hours=5, minutes=30))  # India Standard Time
ist_now = datetime.now(ist)
print(ist_now)  # 2026-03-24 15:00:45.123456+05:30
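Converting an existing timestamp between offsets is done with astimezone. The sketch below uses a fixed moment so the output is reproducible. A fixed offset such as the one for IST does not model daylight saving time; for named zones that do change, the zoneinfo module (Python 3.9+) is the usual choice.

```python
from datetime import datetime, timedelta, timezone

ist = timezone(timedelta(hours=5, minutes=30), name="IST")

utc_moment = datetime(2026, 3, 24, 9, 30, tzinfo=timezone.utc)
ist_moment = utc_moment.astimezone(ist)

print(ist_moment.isoformat())    # 2026-03-24T15:00:00+05:30
print(ist_moment == utc_moment)  # True: same instant, different wall-clock time
print(ist_moment.utcoffset())    # 5:30:00

# Ordering a naive datetime against an aware one is an error
naive = datetime(2026, 3, 24, 9, 30)
try:
    naive < utc_moment
    comparison_failed = False
except TypeError:
    comparison_failed = True
print(comparison_failed)         # True
```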

`os` — Operating System Interface

The os module provides functions for interacting with the operating system.

import os

Working Directory:

# Get current working directory
print(os.getcwd())  # /home/user/my_project

# Change directory (use sparingly — prefer absolute paths)
os.chdir("/tmp")
print(os.getcwd())  # /tmp

Listing and Creating Directories:

# List files and folders in a directory
entries = os.listdir(".")
print(entries)  # ['file1.py', 'folder1', 'file2.txt']

# List a specific directory
entries = os.listdir("/home/user/documents")

# Create a single directory
os.mkdir("new_folder")

# Create nested directories (like mkdir -p)
os.makedirs("output/reports/2026", exist_ok=True)
# exist_ok=True prevents error if directory already exists

File Operations:

# Rename a file or directory
os.rename("old_name.txt", "new_name.txt")

# Remove a file
os.remove("unwanted_file.txt")

# Remove an empty directory
os.rmdir("empty_folder")

# Remove a directory, then each parent directory that is left empty
os.removedirs("output/reports/2026")

Path Operations (os.path):

# Join path components (handles OS separators automatically)
path = os.path.join("home", "user", "documents", "file.txt")
print(path)  # home/user/documents/file.txt (on Unix)

# Check if path exists
print(os.path.exists("myfile.txt"))    # True or False

# Check if it's a file or directory
print(os.path.isfile("myfile.txt"))    # True
print(os.path.isdir("my_folder"))      # True

# Get file name and directory from a path
print(os.path.basename("/home/user/doc.txt"))  # doc.txt
print(os.path.dirname("/home/user/doc.txt"))   # /home/user

# Split into directory and filename
print(os.path.split("/home/user/doc.txt"))     # ('/home/user', 'doc.txt')

# Get file extension
print(os.path.splitext("report.pdf"))  # ('report', '.pdf')

# Get file size in bytes
print(os.path.getsize("myfile.txt"))   # 1024

Environment Variables:

# Get an environment variable
home = os.environ.get("HOME")
print(home)  # /home/user

# Get with a default value
db_url = os.environ.get("DATABASE_URL", "sqlite:///default.db")

# Set an environment variable (for current process only)
os.environ["MY_APP_MODE"] = "development"

Walking a Directory Tree:

# os.walk traverses all files and subdirectories
for dirpath, dirnames, filenames in os.walk("/home/user/project"):
    print(f"Directory: {dirpath}")
    for filename in filenames:
        full_path = os.path.join(dirpath, filename)
        print(f"  File: {full_path}")
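Several of the functions above combine naturally. As a rough sketch, the hypothetical helper summarize_tree below walks a directory and totals the file sizes; it runs against a temporary directory so it leaves nothing behind.

```python
import os
import tempfile

def summarize_tree(root):
    """Return (file_count, total_bytes) for every file under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, filename))
    return file_count, total_bytes

with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "reports", "2026")
    os.makedirs(nested, exist_ok=True)
    # One file at the top level, one in the nested directory
    for directory, name, text in [(root, "a.txt", "hello"), (nested, "b.txt", "hi")]:
        with open(os.path.join(directory, name), "w", encoding="utf-8") as f:
            f.write(text)
    summary = summarize_tree(root)

print(summary)  # (2, 7)
```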

`sys` — System-Specific Parameters

The sys module provides access to system-specific parameters and functions.

import sys

Command-Line Arguments:

# script.py
# Run: python script.py hello world 42
print(sys.argv)
# ['script.py', 'hello', 'world', '42']

print(sys.argv[0])  # 'script.py' — the script name
print(sys.argv[1])  # 'hello'     — first argument
print(len(sys.argv))  # 4         — total count including script name

Python Version and Platform:

print(sys.version)
# 3.12.0 (main, Oct 2 2024, 00:00:00) [GCC 12.2.0]

print(sys.version_info)
# sys.version_info(major=3, minor=12, micro=0, ...)

print(sys.platform)    # 'linux', 'darwin' (macOS), or 'win32'
print(sys.executable)  # /usr/bin/python3

Module Search Path:

# View where Python looks for modules
for p in sys.path:
    print(p)

# Add a custom directory
sys.path.insert(0, "/my/custom/modules")

Standard Streams:

# Write to stdout
sys.stdout.write("Hello from stdout\n")

# Write to stderr (for error messages)
sys.stderr.write("This is an error message\n")

# Read from stdin
# line = sys.stdin.readline()

Memory and Exit:

# Size of an object in bytes (typical 64-bit CPython values; they vary by version)
print(sys.getsizeof(42))           # 28
print(sys.getsizeof("hello"))      # 54
print(sys.getsizeof([1, 2, 3]))    # 88
print(sys.getsizeof({}))           # 64

# Largest value a container length or index can take (Python ints themselves are unbounded)
print(sys.maxsize)          # 9223372036854775807 (on 64-bit)
print(sys.getrecursionlimit())  # 1000 (default)

# Exit the program
# sys.exit(0)   # Exit with success code
# sys.exit(1)   # Exit with error code
# sys.exit("Something went wrong")  # Exit with error message
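sys.exit works by raising the SystemExit exception, which is why cleanup code in finally blocks still runs and why tests can intercept it. A small sketch, with run as a hypothetical entry point:

```python
import sys

def run(args):
    """Hypothetical entry point: exits with status 2 when no arguments are given."""
    if not args:
        sys.exit(2)
    return f"processing {args[0]}"

try:
    run([])
except SystemExit as exc:
    exit_code = exc.code   # the value passed to sys.exit

print(exit_code)           # 2
print(run(["data.txt"]))   # processing data.txt
```

SystemExit derives from BaseException rather than Exception, so a plain except Exception block does not swallow it.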

`collections` — Specialised Container Types

The collections module provides alternatives to Python's built-in containers.

from collections import Counter, defaultdict, OrderedDict, namedtuple, deque, ChainMap

Counter — Count Occurrences:

from collections import Counter

# Count items in a list
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple", "date"]
count = Counter(fruits)
print(count)
# Counter({'apple': 3, 'banana': 2, 'cherry': 1, 'date': 1})

# Count characters in a string
char_count = Counter("mississippi")
print(char_count)
# Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})

# Most common items
print(count.most_common(2))   # [('apple', 3), ('banana', 2)]

# Arithmetic with Counters
inventory = Counter(apples=5, bananas=3)
sold = Counter(apples=2, bananas=1)
remaining = inventory - sold
print(remaining)  # Counter({'apples': 3, 'bananas': 2})

# Total count (Counter.total() requires Python 3.10+)
print(count.total())  # 7

defaultdict — Dict with Default Values:

from collections import defaultdict

# List as default — great for grouping
students_by_grade = defaultdict(list)
students_by_grade["A"].append("Alice")
students_by_grade["B"].append("Bob")
students_by_grade["A"].append("Arjun")
students_by_grade["C"].append("Charlie")
print(dict(students_by_grade))
# {'A': ['Alice', 'Arjun'], 'B': ['Bob'], 'C': ['Charlie']}

# Int as default — great for counting
word_count = defaultdict(int)
for word in ["the", "cat", "sat", "on", "the", "mat"]:
    word_count[word] += 1
print(dict(word_count))
# {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}

# Set as default — great for unique collections
tags = defaultdict(set)
tags["python"].add("programming")
tags["python"].add("scripting")
tags["python"].add("programming")  # Duplicate ignored
print(dict(tags))
# {'python': {'programming', 'scripting'}}

OrderedDict — Dictionary that Remembers Insertion Order:

from collections import OrderedDict

# In Python 3.7+, regular dicts maintain insertion order.
# OrderedDict is still useful for its extra methods.

od = OrderedDict()
od["first"] = 1
od["second"] = 2
od["third"] = 3

# Move an item to the end
od.move_to_end("first")
print(list(od.keys()))  # ['second', 'third', 'first']

# Move an item to the beginning
od.move_to_end("third", last=False)
print(list(od.keys()))  # ['third', 'second', 'first']

# Pop the last item
print(od.popitem())      # ('first', 1)

# Pop the first item
print(od.popitem(last=False))  # ('third', 3)

namedtuple — Lightweight Class:

from collections import namedtuple

# Define a named tuple
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)      # 3 4
print(p[0], p[1])     # 3 4 — also supports indexing
print(p)              # Point(x=3, y=4)

# Named tuples work well as lightweight records (and, like tuples, are hashable)
Student = namedtuple("Student", ["name", "grade", "age"])
s = Student("Alice", "A", 20)
print(f"{s.name} got grade {s.grade}")  # Alice got grade A

# Convert to dictionary
print(s._asdict())  # {'name': 'Alice', 'grade': 'A', 'age': 20}

# Create a modified copy
s2 = s._replace(grade="A+")
print(s2)  # Student(name='Alice', grade='A+', age=20)

deque — Double-Ended Queue:

from collections import deque

# Create a deque
dq = deque([1, 2, 3, 4, 5])

# Add to both ends (O(1) — much faster than list for this)
dq.append(6)        # Add to right: [1, 2, 3, 4, 5, 6]
dq.appendleft(0)    # Add to left:  [0, 1, 2, 3, 4, 5, 6]

# Remove from both ends
dq.pop()             # Remove from right: returns 6
dq.popleft()         # Remove from left:  returns 0
print(dq)            # deque([1, 2, 3, 4, 5])

# Rotate elements
dq.rotate(2)         # Rotate right by 2
print(dq)            # deque([4, 5, 1, 2, 3])

dq.rotate(-2)        # Rotate left by 2
print(dq)            # deque([1, 2, 3, 4, 5])

# Fixed-size deque (oldest items are dropped)
recent = deque(maxlen=3)
recent.append("a")
recent.append("b")
recent.append("c")
recent.append("d")  # "a" is dropped
print(recent)        # deque(['b', 'c', 'd'], maxlen=3)

ChainMap — Merge Multiple Dictionaries:

from collections import ChainMap

defaults = {"color": "blue", "size": "medium", "font": "Arial"}
user_prefs = {"color": "green", "font": "Helvetica"}
cli_args = {"color": "red"}

# ChainMap searches in order: cli_args -> user_prefs -> defaults
config = ChainMap(cli_args, user_prefs, defaults)
print(config["color"])   # red (from cli_args)
print(config["font"])    # Helvetica (from user_prefs)
print(config["size"])    # medium (from defaults)
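A ChainMap does not copy its inputs; it is a live view over them. The sketch below shows two consequences: changes to an underlying dictionary are visible immediately, and writes through the ChainMap land only in the first mapping.

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "medium"}
user_prefs = {"color": "green"}
config = ChainMap(user_prefs, defaults)

defaults["size"] = "large"   # change an underlying dict
print(config["size"])        # large

config["font"] = "Arial"     # writes go to the first mapping only
print(user_prefs)            # {'color': 'green', 'font': 'Arial'}
print("font" in defaults)    # False

snapshot = dict(config)      # flatten into an independent dict
defaults["size"] = "small"
print(snapshot["size"])      # large (the snapshot no longer follows changes)
```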

`itertools` — Efficient Iteration Tools

The itertools module provides fast, memory-efficient tools for working with iterators.

import itertools

Combining Iterables:

from itertools import chain

# Flatten multiple lists into one
combined = list(chain([1, 2], [3, 4], [5, 6]))
print(combined)  # [1, 2, 3, 4, 5, 6]

# Flatten a list of lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]

Combinatorics:

from itertools import product, combinations, permutations

# Cartesian product (all pairs)
pairs = list(product("AB", [1, 2]))
print(pairs)  # [('A', 1), ('A', 2), ('B', 1), ('B', 2)]

# Product with repeat
dice_rolls = list(product(range(1, 7), repeat=2))
print(f"Two dice: {len(dice_rolls)} combinations")  # 36

# Combinations (order doesn't matter, no replacement)
combos = list(combinations("ABCD", 2))
print(combos)
# [('A','B'), ('A','C'), ('A','D'), ('B','C'), ('B','D'), ('C','D')]

# Permutations (order matters)
perms = list(permutations("ABC", 2))
print(perms)
# [('A','B'), ('A','C'), ('B','A'), ('B','C'), ('C','A'), ('C','B')]

Infinite Iterators:

from itertools import count, cycle, repeat

# count: infinite counter
for i in count(start=10, step=3):
    if i > 25:
        break
    print(i, end=" ")  # 10 13 16 19 22 25

# cycle: repeat an iterable forever
colors = cycle(["red", "green", "blue"])
for _, color in zip(range(7), colors):
    print(color, end=" ")
# red green blue red green blue red

# repeat: repeat a value
zeros = list(repeat(0, 5))
print(zeros)  # [0, 0, 0, 0, 0]

Slicing and Grouping:

from itertools import islice, groupby, accumulate

# islice: slice an iterator without converting to a list
squares = (x**2 for x in range(100))
first_five = list(islice(squares, 5))
print(first_five)  # [0, 1, 4, 9, 16]

# Skip first 3, take next 4
items = list(islice(range(100), 3, 7))
print(items)  # [3, 4, 5, 6]

# groupby: group consecutive items (data must be sorted by key)
data = [
    ("fruit", "apple"), ("fruit", "banana"),
    ("veggie", "carrot"), ("veggie", "pea"),
    ("fruit", "cherry"),
]
data.sort(key=lambda x: x[0])  # Must sort first!
for key, group in groupby(data, key=lambda x: x[0]):
    items = [item[1] for item in group]
    print(f"{key}: {items}")
# fruit: ['apple', 'banana', 'cherry']
# veggie: ['carrot', 'pea']

# accumulate: running totals
nums = [1, 2, 3, 4, 5]
running_sum = list(accumulate(nums))
print(running_sum)  # [1, 3, 6, 10, 15]

# Running maximum (any two-argument function can replace addition)
running_max = list(accumulate([3, 1, 4, 1, 5, 9], func=max))
print(running_max)  # [3, 3, 4, 4, 5, 9]
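
To see why the sort matters for groupby, here is a minimal sketch that groups the same words twice, once unsorted and once sorted. The word list and the first_letter helper are illustrative names, not part of the examples above.

```python
from itertools import groupby

words = ["apple", "avocado", "banana", "apricot"]

def first_letter(word):
    return word[0]

# Unsorted input: groupby starts a new group whenever the key changes,
# so the key "a" shows up twice
unsorted_groups = [
    (key, list(group)) for key, group in groupby(words, key=first_letter)
]
print(unsorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot'])]

# Sorted input: every key appears exactly once
sorted_groups = [
    (key, list(group))
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
]
print(sorted_groups)
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana'])]
```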

`functools` — Higher-Order Functions

The functools module provides tools for working with functions and callable objects.

from functools import reduce, lru_cache, partial, wraps, total_ordering

reduce — Accumulate a Sequence to a Single Value:

from functools import reduce

# Sum of numbers (same as built-in sum)
total = reduce(lambda a, b: a + b, [1, 2, 3, 4, 5])
print(total)  # 15

# Product of numbers
product = reduce(lambda a, b: a * b, [1, 2, 3, 4, 5])
print(product)  # 120

# Find maximum (same as built-in max)
largest = reduce(lambda a, b: a if a > b else b, [3, 1, 4, 1, 5, 9])
print(largest)  # 9

# Flatten nested list
nested = [[1, 2], [3, 4], [5, 6]]
flat = reduce(lambda a, b: a + b, nested)
print(flat)  # [1, 2, 3, 4, 5, 6]
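
One detail worth knowing: reduce accepts an optional third argument, a starting value. The sketch below (the total helper is a hypothetical name) shows that the starting value makes empty input safe, while calling reduce on an empty sequence without it raises TypeError.

```python
from functools import reduce

def total(numbers):
    # The third argument (0) is the starting value; it is returned
    # unchanged when the input is empty
    return reduce(lambda a, b: a + b, numbers, 0)

print(total([1, 2, 3]))  # 6
print(total([]))         # 0

# Without a starting value, an empty input raises TypeError
try:
    reduce(lambda a, b: a + b, [])
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")
```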

lru_cache — Memoisation (Cache Results):

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    """Calculate nth Fibonacci number with caching."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Without cache: extremely slow for large n
# With cache: near-instant!
print(fibonacci(50))   # 12586269025
print(fibonacci(100))  # 354224848179261915075

# Check cache statistics
print(fibonacci.cache_info())
# CacheInfo(hits=99, misses=101, maxsize=128, currsize=101)

# Clear the cache
fibonacci.cache_clear()
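
The maxsize argument limits how many results are kept. As a small sketch (the square function is only an illustration), a cache of size 2 shows the least recently used entry being evicted, so a later call with the same argument counts as a miss again.

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def square(n):
    return n * n

square(2)  # miss: cache holds {2}
square(2)  # hit
square(3)  # miss: cache holds {2, 3}
square(4)  # miss: cache is full, so 2 (least recently used) is evicted
square(2)  # miss again: 2 is no longer cached, and 3 is evicted

info = square.cache_info()
print(info)
# CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```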

partial — Pre-fill Function Arguments:

from functools import partial

def power(base, exponent):
    return base ** exponent

# Create specialised versions
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))  # 25
print(cube(3))    # 27

# Practical: create a custom print function
debug_print = partial(print, "[DEBUG]")
debug_print("Starting process")   # [DEBUG] Starting process
debug_print("Value is", 42)       # [DEBUG] Value is 42

wraps — Preserve Function Metadata in Decorators:

from functools import wraps
import time

def timer(func):
    @wraps(func)  # Preserves the name, docstring, etc. of the original function
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # monotonic clock intended for measuring intervals
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

@timer
def slow_function():
    """This function is intentionally slow."""
    time.sleep(0.1)
    return "done"

slow_function()  # slow_function took 0.1003s
print(slow_function.__name__)  # slow_function (not "wrapper"!)
print(slow_function.__doc__)   # This function is intentionally slow.

total_ordering — Complete Comparison Methods:

from functools import total_ordering

@total_ordering
class Temperature:
    """You only need to define __eq__ and one of __lt__, __gt__, etc."""
    def __init__(self, celsius):
        self.celsius = celsius

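    def __eq__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented  # let Python try the other operand
        return self.celsius == other.celsius

    def __lt__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented
        return self.celsius < other.celsius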

    def __repr__(self):
        return f"Temperature({self.celsius}C)"

t1 = Temperature(20)
t2 = Temperature(30)
print(t1 < t2)    # True
print(t1 > t2)    # False   (auto-generated!)
print(t1 <= t2)   # True    (auto-generated!)
print(t1 >= t2)   # False   (auto-generated!)

`json` — JSON Encoding and Decoding

The json module handles reading and writing JSON data.

import json

Converting Python to JSON (dumps):

data = {
    "name": "Alice",
    "age": 30,
    "hobbies": ["reading", "coding", "hiking"],
    "address": {
        "city": "Mumbai",
        "country": "India"
    },
    "active": True,
    "score": None
}

# Convert to JSON string
json_str = json.dumps(data)
print(json_str)
# {"name": "Alice", "age": 30, ...}

# Pretty-printed JSON
json_pretty = json.dumps(data, indent=2)
print(json_pretty)
# {
#   "name": "Alice",
#   "age": 30,
#   "hobbies": [
#     "reading",
#     "coding",
#     "hiking"
#   ],
#   ...
# }

# Sort keys alphabetically
json_sorted = json.dumps(data, indent=2, sort_keys=True)

Converting JSON to Python (loads):

json_string = '{"name": "Bob", "age": 25, "active": true}'
parsed = json.loads(json_string)
print(parsed["name"])     # Bob
print(parsed["active"])   # True (Python bool, not JSON true)
print(type(parsed))       # <class 'dict'>

Reading and Writing JSON Files (load / dump):

# Write JSON to a file
data = {"users": ["Alice", "Bob", "Charlie"], "count": 3}
with open("data.json", "w") as f:
    json.dump(data, f, indent=2)

# Read JSON from a file
with open("data.json", "r") as f:
    loaded = json.load(f)
print(loaded)  # {'users': ['Alice', 'Bob', 'Charlie'], 'count': 3}

Custom Serialisation:

from datetime import datetime

class DateTimeEncoder(json.JSONEncoder):
    """Custom encoder that handles datetime objects."""
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

event = {
    "name": "Meeting",
    "time": datetime(2026, 3, 24, 14, 30),
}

# Without custom encoder: TypeError
# With custom encoder: works!
json_str = json.dumps(event, cls=DateTimeEncoder, indent=2)
print(json_str)
# {
#   "name": "Meeting",
#   "time": "2026-03-24T14:30:00"
# }
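
For simple cases, subclassing JSONEncoder is not required: json.dumps also accepts a default function. The sketch below uses a hypothetical helper named encode_extra and shows the round trip, including the manual step of turning the ISO string back into a datetime, since json.loads returns it as a plain string.

```python
import json
from datetime import datetime

def encode_extra(obj):
    """Called by json.dumps for objects it cannot serialise itself."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

event = {"name": "Meeting", "time": datetime(2026, 3, 24, 14, 30)}

encoded = json.dumps(event, default=encode_extra)
print(encoded)
# {"name": "Meeting", "time": "2026-03-24T14:30:00"}

# Decoding gives back a string; convert it explicitly
decoded = json.loads(encoded)
decoded["time"] = datetime.fromisoformat(decoded["time"])
print(decoded == event)  # True
```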

`re` — Regular Expressions

The re module provides pattern matching for strings.

import re

Basic Functions:

text = "My phone number is 123-456-7890 and email is alice@example.com"

# search: find first match anywhere in string
match = re.search(r"\d{3}-\d{3}-\d{4}", text)
if match:
    print(match.group())  # 123-456-7890

# match: match at the BEGINNING of string only
result = re.match(r"My", text)
print(result.group())  # My

# findall: find ALL matches (returns list of strings)
numbers = re.findall(r"\d+", text)
print(numbers)  # ['123', '456', '7890']

# findall with emails
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
print(emails)  # ['alice@example.com']

Substitution:

# sub: replace matches
text = "I have 2 cats and 3 dogs"
result = re.sub(r"\d+", "many", text)
print(result)  # I have many cats and many dogs

# Replace with a function
def double_number(match):
    return str(int(match.group()) * 2)

result = re.sub(r"\d+", double_number, text)
print(result)  # I have 4 cats and 6 dogs

Compiling Patterns (for repeated use):

# Compile a pattern for better performance when used multiple times
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

texts = [
    "Contact alice@example.com for info",
    "Send to bob@company.org please",
    "No email here",
]

for t in texts:
    found = email_pattern.findall(t)
    if found:
        print(f"Found: {found}")
# Found: ['alice@example.com']
# Found: ['bob@company.org']

Common Patterns:

Pattern	Matches	Example
`\d`	Any digit	`0`, `9`
`\w`	Word character (letter, digit, `_`)	`a`, `Z`, `_`
`\s`	Whitespace	space, tab, newline
`.`	Any character (except newline)	anything
`^`	Start of string	`^Hello`
`$`	End of string	`world$`
`*`	0 or more	`ab*` matches `a`, `ab`, `abb`
`+`	1 or more	`ab+` matches `ab`, `abb`
`?`	0 or 1	`ab?` matches `a`, `ab`
`{n}`	Exactly n	`\d{3}` matches `123`
`{n,m}`	Between n and m	`\d{2,4}` matches `12`, `1234`
`[abc]`	Any of a, b, c	character set
`[^abc]`	Not a, b, or c	negated set
`(...)`	Capture group	group matches
`|`	Or	`cat|dog`
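
A few of the patterns in the table can be combined in a short sketch. The sample strings and the group names (area, prefix, line) are made up for illustration.

```python
import re

# Named capture groups: (?P<name>...) lets you fetch parts of a match by name
phone_pattern = re.compile(r"(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})")
match = phone_pattern.search("Call 123-456-7890 today")
if match:
    print(match.group("area"))  # 123
    print(match.groups())       # ('123', '456', '7890')

# Alternation: match either word
animals = re.findall(r"cat|dog", "cat, dog, bird, dog")
print(animals)  # ['cat', 'dog', 'dog']

# {n,m} with word boundaries: whole numbers of 2 to 4 digits only
short_codes = re.findall(r"\b\d{2,4}\b", "7 42 123 98765")
print(short_codes)  # ['42', '123']
```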

`pathlib` — Object-Oriented Filesystem Paths

pathlib is the modern, recommended way to work with file paths in Python (preferred over os.path).

from pathlib import Path

Creating Paths:

# Current directory
cwd = Path.cwd()
print(cwd)  # /home/user/my_project

# Home directory
home = Path.home()
print(home)  # /home/user

# Create a path from a string
p = Path("/home/user/documents/report.txt")

# Join paths with /
data_dir = Path("project") / "data" / "raw"
print(data_dir)  # project/data/raw

config_file = Path.home() / ".config" / "myapp" / "settings.json"
print(config_file)  # /home/user/.config/myapp/settings.json

Path Properties:

p = Path("/home/user/documents/report.pdf")

print(p.name)       # report.pdf
print(p.stem)       # report (name without extension)
print(p.suffix)     # .pdf
print(p.parent)     # /home/user/documents
print(p.parts)      # ('/', 'home', 'user', 'documents', 'report.pdf')
print(p.anchor)     # /
print(p.is_absolute())  # True

Checking Existence:

p = Path("myfile.txt")
print(p.exists())      # True or False
print(p.is_file())     # True if it is a file
print(p.is_dir())      # True if it is a directory

Reading and Writing Files:

# Write text to a file
p = Path("output.txt")
p.write_text("Hello, World!\nLine 2\n")

# Read text from a file
content = p.read_text()
print(repr(content))  # 'Hello, World!\nLine 2\n'

# Write bytes
p.write_bytes(b"\x00\x01\x02")

# Read bytes
data = p.read_bytes()

Globbing (Finding Files by Pattern):

project = Path(".")

# Find all Python files in current directory
for py_file in project.glob("*.py"):
    print(py_file)

# Find all Python files recursively
for py_file in project.rglob("*.py"):
    print(py_file)

# Find all image files
for img in project.rglob("*.png"):
    print(img)

Creating Directories:

new_dir = Path("output") / "reports" / "2026"
new_dir.mkdir(parents=True, exist_ok=True)
# parents=True creates intermediate directories
# exist_ok=True does not raise an error if the directory already exists

`string` — String Constants and Templates

The string module provides useful string constants and a template class.

import string

String Constants:

print(string.ascii_letters)    # abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.ascii_lowercase)  # abcdefghijklmnopqrstuvwxyz
print(string.ascii_uppercase)  # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.digits)           # 0123456789
print(string.hexdigits)        # 0123456789abcdefABCDEF
print(string.punctuation)      # !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
print(string.whitespace)       # ' \t\n\r\x0b\x0c'
print(string.printable)        # All printable characters

Template Strings (Safe Substitution):

from string import Template

# Create a template
t = Template("Hello, $name! You have $count new messages.")
result = t.substitute(name="Alice", count=5)
print(result)  # Hello, Alice! You have 5 new messages.

# safe_substitute: does not raise error for missing keys
t2 = Template("$greeting, $name!")
result = t2.safe_substitute(greeting="Hi")
print(result)  # Hi, $name!  (missing key left as-is)
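
For contrast, here is a short sketch of what substitute does with the same incomplete data: it raises KeyError naming the missing placeholder, which is often what you want when a missing value is a bug.

```python
from string import Template

t = Template("$greeting, $name!")

try:
    t.substitute(greeting="Hi")
except KeyError as exc:
    missing = exc.args[0]
    print(f"Missing placeholder: {missing}")  # Missing placeholder: name

# safe_substitute leaves the unknown placeholder in place instead
safe_result = t.safe_substitute(greeting="Hi")
print(safe_result)  # Hi, $name!
```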

Practical Use: Generating Random Strings

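import secrets
import string

def generate_password(length=12):
    """Generate a random password."""
    chars = string.ascii_letters + string.digits + string.punctuation
    # secrets draws from the OS's cryptographically secure generator;
    # the random module is predictable and unsuitable for passwords
    return "".join(secrets.choice(chars) for _ in range(length))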

print(generate_password())     # e.g., k9$Tz!mP@2xR
print(generate_password(20))   # e.g., aB3$kLm!Pq9@rT5&wX2z

`copy` — Shallow and Deep Copy

The copy module provides functions to duplicate objects.

import copy

Why Copying Matters:

# Assignment does NOT copy — both variables point to the same object
original = [1, 2, [3, 4]]
reference = original

reference[0] = 99
print(original)   # [99, 2, [3, 4]] — original changed too!

Shallow Copy (copy.copy):

import copy

original = [1, 2, [3, 4]]
shallow = copy.copy(original)

# Top-level changes are independent
shallow[0] = 99
print(original)   # [1, 2, [3, 4]] — not affected

# BUT nested objects are still shared!
shallow[2][0] = 99
print(original)   # [1, 2, [99, 4]] — nested list WAS affected!

Deep Copy (copy.deepcopy):

import copy

original = [1, 2, [3, 4]]
deep = copy.deepcopy(original)

# Everything is fully independent
deep[2][0] = 99
print(original)   # [1, 2, [3, 4]] — completely unaffected!
print(deep)       # [1, 2, [99, 4]]

When to Use Each:

Scenario	Use
Simple flat list/dict	`copy.copy()` (shallow)
Nested structures	`copy.deepcopy()` (deep)
Immutable data (strings, tuples of ints)	No copy needed
Performance-critical, large data	`copy.copy()` if possible
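
The same rules apply to dictionaries. As a small sketch (the config dictionary is invented for illustration), the `is` operator shows which objects are shared after each kind of copy.

```python
import copy

config = {"name": "app", "tags": ["a", "b"]}
shallow = copy.copy(config)
deep = copy.deepcopy(config)

print(shallow is config)                  # False: a new outer dict
print(shallow["tags"] is config["tags"])  # True: the inner list is shared
print(deep["tags"] is config["tags"])     # False: the inner list was copied

config["tags"].append("c")
print(shallow["tags"])  # ['a', 'b', 'c']
print(deep["tags"])     # ['a', 'b']
```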

`typing` — Type Hints

The typing module provides tools for type annotations, which help with code clarity and IDE support.

from typing import List, Dict, Tuple, Optional, Union, Any, Callable

Basic Type Hints:

# Variable annotations
name: str = "Alice"
age: int = 30
score: float = 95.5
active: bool = True

# Function annotations
def greet(name: str) -> str:
    return f"Hello, {name}!"

def add(a: int, b: int) -> int:
    return a + b

Collection Types:

from typing import List, Dict, Tuple, Set

# List of integers
scores: List[int] = [95, 87, 92, 78]

# Dictionary with string keys and int values
age_map: Dict[str, int] = {"Alice": 30, "Bob": 25}

# Tuple with specific types
coordinate: Tuple[float, float] = (3.5, 7.2)

# Set of strings
tags: Set[str] = {"python", "coding", "tutorial"}

# Note: In Python 3.9+, you can use built-in types directly:
# scores: list[int] = [95, 87, 92, 78]
# age_map: dict[str, int] = {"Alice": 30, "Bob": 25}

Optional and Union:

from typing import Optional, Union

# Optional: value can be the given type or None
def find_user(user_id: int) -> Optional[str]:
    """Return username or None if not found."""
    users = {1: "Alice", 2: "Bob"}
    return users.get(user_id)

# Union: value can be one of several types
def process(value: Union[int, str]) -> str:
    return str(value)

# Python 3.10+ allows: int | str instead of Union[int, str]

Callable:

from typing import Callable

# A function that takes a function as an argument
def apply_operation(x: int, y: int, func: Callable[[int, int], int]) -> int:
    return func(x, y)

result = apply_operation(5, 3, lambda a, b: a + b)
print(result)  # 8

Any:

from typing import Any

def log_value(value: Any) -> None:
    """Accept any type of value."""
    print(f"Value: {value}, Type: {type(value).__name__}")

Type hints are not enforced at runtime — they are documentation and tooling aids. Use tools like mypy to check types statically.
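
A quick way to convince yourself of this is the sketch below: a function annotated for integers happily accepts strings at runtime. The annotations are simply stored on the function, where tools can read them.

```python
def add(a: int, b: int) -> int:
    return a + b

# No error at runtime, even though the arguments are not ints
joined = add("py", "thon")
print(joined)  # python

# The hints are stored as metadata on the function
print(add.__annotations__)
# {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}
```

A static checker such as mypy would flag the add("py", "thon") call without ever running the code.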

Installing Third-Party Packages

Python's standard library is extensive, but the real power lies in the hundreds of thousands of third-party packages available on PyPI (Python Package Index).

`pip` — The Package Installer

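pip is Python's standard package installer. It ships with most Python installations, so it is usually available as soon as Python is installed.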

# Install a package
pip install requests

# Install a specific version
pip install pandas==2.1.0

# Install minimum version (quote the spec so the shell does not treat > as redirection)
pip install "numpy>=1.24.0"

# Install multiple packages at once
pip install requests pandas numpy matplotlib

# Upgrade a package to the latest version
pip install --upgrade requests

# Uninstall a package
pip uninstall requests

Managing Dependencies

# List all installed packages
pip list

# Show details about a specific package
pip show requests

# Save current environment to requirements.txt
pip freeze > requirements.txt

# Install all packages from requirements.txt
pip install -r requirements.txt
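
Installed versions can also be checked from inside Python using the standard library's importlib.metadata (Python 3.8+). The sketch below wraps it in a hypothetical helper, installed_version, that returns None instead of raising when a package is absent; the second package name is deliberately made up.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# Prints a version string if requests is installed, otherwise None
print(installed_version("requests"))

# A name that should not exist in any environment
print(installed_version("surely-not-a-real-package-name-xyz"))  # None
```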

`requirements.txt` Format

# requirements.txt
requests==2.31.0
pandas>=2.1.0,<3.0.0
numpy~=1.24.0
flask
python-dotenv>=1.0

Syntax	Meaning
`==2.31.0`	Exact version
`>=2.1.0`	Minimum version
`<=3.0.0`	Maximum version
`>=2.1.0,<3.0.0`	Version range
`~=1.24.0`	Compatible release (>=1.24.0, <1.25.0)
`flask`	Any version (latest)

Editable Install

For developing your own packages:

# Install in "editable" mode — changes to source are immediately reflected
pip install -e .

# Install with optional development dependencies
pip install -e ".[dev]"

Virtual Environments

Every Python project should use a virtual environment to isolate its dependencies.

Why Virtual Environments?

Without virtual environments:

Project A needs requests==2.25.0
Project B needs requests==2.31.0
Only one version can exist system-wide — one project breaks.

Virtual environments give each project its own isolated directory of installed packages, along with its own python and pip commands.

Creating and Using a Virtual Environment

# Create a virtual environment named "venv"
python -m venv venv

# Activate it
# macOS / Linux:
source venv/bin/activate

# Windows:
venv\Scripts\activate

# Your prompt changes to show the active env:
# (venv) $

# Now pip installs go into this venv only
pip install requests pandas

# Verify: packages are isolated
pip list
# Shows only packages installed in this venv

# When done, deactivate
deactivate
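
If you are unsure whether a script is running inside a virtual environment, one common check compares sys.prefix with sys.base_prefix. This is a minimal sketch; in_virtualenv is an illustrative name.

```python
import sys

def in_virtualenv():
    """Return True when the interpreter is running from a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the Python installation it was created from.
    """
    return sys.prefix != sys.base_prefix

print(f"Running in a virtual environment: {in_virtualenv()}")
print(f"Packages are installed under: {sys.prefix}")
```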

Best Practices for Virtual Environments

Always add venv/ to .gitignore:

# .gitignore
venv/
.venv/
env/
__pycache__/
*.pyc

Use requirements.txt to share dependencies (not the venv itself):

# Developer A creates the requirements file
pip freeze > requirements.txt
git add requirements.txt
git commit -m "Add project dependencies"

# Developer B recreates the environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Naming conventions:

venv or .venv are the most common names
.venv (with dot) keeps it hidden on Unix systems

Popular Third-Party Packages

Here is a curated list of widely-used Python packages every developer should know about:

Package	Category	Description	Install
`requests`	HTTP	Simple HTTP requests	`pip install requests`
`httpx`	HTTP	Async-capable HTTP client	`pip install httpx`
`pandas`	Data	Data manipulation and analysis	`pip install pandas`
`numpy`	Math	Numerical computing, arrays	`pip install numpy`
`matplotlib`	Visualisation	Plotting and charts	`pip install matplotlib`
`flask`	Web	Lightweight web framework	`pip install flask`
`fastapi`	Web	Modern async web framework	`pip install fastapi`
`django`	Web	Full-featured web framework	`pip install django`
`sqlalchemy`	Database	SQL toolkit and ORM	`pip install sqlalchemy`
`pytest`	Testing	Testing framework	`pip install pytest`
`beautifulsoup4`	Scraping	HTML/XML parsing	`pip install beautifulsoup4`
`selenium`	Automation	Browser automation	`pip install selenium`
`pillow`	Images	Image processing	`pip install pillow`
`click`	CLI	Command-line interfaces	`pip install click`
`rich`	CLI	Rich terminal formatting	`pip install rich`

Quick Examples

requests — Making HTTP Requests:

import requests

response = requests.get("https://api.github.com/users/octocat", timeout=10)
print(response.status_code)   # 200 on success
data = response.json()
print(data["name"])            # The Octocat
print(data["public_repos"])    # e.g., 8 (live data, changes over time)

pytest — Writing Tests:

# test_calculator.py
def add(a, b):
    return a + b

def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
    assert add(0, 0) == 0

# Run with: pytest test_calculator.py

Package Distribution Basics

When you want to share your Python code as an installable package, you need a project configuration file.

`pyproject.toml` (Modern Standard)

The modern way to define a Python project:

# pyproject.toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "my-awesome-package"
version = "0.1.0"
description = "A short description of your package"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.8"
authors = [
    {name = "Your Name", email = "you@example.com"}
]
dependencies = [
    "requests>=2.25.0",
    "click>=8.0",
]

[project.optional-dependencies]
dev = ["pytest", "mypy", "black"]

`setup.py` (Legacy but Still Common)

# setup.py
from setuptools import setup, find_packages

setup(
    name="my-awesome-package",
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "requests>=2.25.0",
        "click>=8.0",
    ],
)

Version Management

A common pattern is to define the version in one place, typically the package's __init__.py, and read it from there:

# my_package/__init__.py
__version__ = "0.1.0"

Publishing to PyPI (Brief Overview)

# 1. Build the package
pip install build
python -m build

# 2. Upload to PyPI (requires an account at pypi.org)
pip install twine
twine upload dist/*

# 3. Now anyone can install it!
# pip install my-awesome-package

Practical Examples

Let us put everything together with practical, real-world examples.

Example 1: Building a Utility Package

Create a package with string, math, and file helpers:

Project structure:

my_utils/
├── __init__.py
├── string_helpers.py
├── math_helpers.py
└── file_helpers.py

string_helpers.py:

# my_utils/string_helpers.py

def slugify(text):
    """Convert text to URL-friendly slug.

    >>> slugify("Hello World!")
    'hello-world'
    """
    import re
    text = text.lower().strip()
    text = re.sub(r"[^\w\s-]", "", text)
    text = re.sub(r"[\s_]+", "-", text)
    text = re.sub(r"-+", "-", text)
    return text.strip("-")

def truncate(text, max_length=50, suffix="..."):
    """Truncate text to max_length, adding suffix if truncated.

    >>> truncate("Hello, World!", max_length=8)
    'Hello...'
    """
    if len(text) <= max_length:
        return text
    return text[:max_length - len(suffix)] + suffix

def word_count(text):
    """Count words in text.

    >>> word_count("Hello beautiful world")
    3
    """
    return len(text.split())

def title_case(text):
    """Convert text to title case, handling small words.

    >>> title_case("the quick brown fox")
    'The Quick Brown Fox'
    """
    small_words = {"a", "an", "the", "and", "but", "or", "for", "in", "on", "at", "to"}
    words = text.split()
    result = []
    for i, word in enumerate(words):
        if i == 0 or word.lower() not in small_words:
            result.append(word.capitalize())
        else:
            result.append(word.lower())
    return " ".join(result)

math_helpers.py:

# my_utils/math_helpers.py

def clamp(value, min_val, max_val):
    """Restrict value to the range [min_val, max_val].

    >>> clamp(15, 0, 10)
    10
    >>> clamp(-5, 0, 10)
    0
    """
    return max(min_val, min(value, max_val))

def percentage(part, whole):
    """Calculate percentage.

    >>> percentage(25, 200)
    12.5
    """
    if whole == 0:
        return 0.0
    return (part / whole) * 100

def average(numbers):
    """Calculate the arithmetic mean.

    >>> average([10, 20, 30])
    20.0
    """
    if not numbers:
        return 0.0
    return sum(numbers) / len(numbers)

def is_prime(n):
    """Check if a number is prime.

    >>> is_prime(17)
    True
    >>> is_prime(4)
    False
    """
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

file_helpers.py:

# my_utils/file_helpers.py

from pathlib import Path

def read_lines(filepath):
    """Read a file and return its lines, ignoring leading and trailing whitespace in the file."""
    return Path(filepath).read_text().strip().splitlines()

def write_lines(filepath, lines):
    """Write a list of strings to a file, one per line."""
    Path(filepath).write_text("\n".join(lines) + "\n")

def file_size_human(filepath):
    """Return human-readable file size.

    >>> file_size_human("small_file.txt")  # If file is 1536 bytes
    '1.50 KB'
    """
    size = Path(filepath).stat().st_size
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if size < 1024:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} PB"

def ensure_directory(path):
    """Create a directory (and parents) if it does not exist."""
    Path(path).mkdir(parents=True, exist_ok=True)

__init__.py:

# my_utils/__init__.py

from .string_helpers import slugify, truncate, word_count, title_case
from .math_helpers import clamp, percentage, average, is_prime
from .file_helpers import read_lines, write_lines, file_size_human, ensure_directory

__version__ = "1.0.0"
__all__ = [
    "slugify", "truncate", "word_count", "title_case",
    "clamp", "percentage", "average", "is_prime",
    "read_lines", "write_lines", "file_size_human", "ensure_directory",
]

Using the Package:

# main.py
from my_utils import slugify, is_prime, average, ensure_directory

print(slugify("Hello World! This is Great"))  # hello-world-this-is-great
print(is_prime(17))                            # True
print(average([85, 90, 78, 92, 88]))          # 86.6
ensure_directory("output/reports")
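
The `>>>` examples in the helper docstrings follow the doctest format, so they can be executed as tests. The sketch below repeats clamp so that it is self-contained and runs its docstring examples with the standard doctest module; the variable names are illustrative.

```python
import doctest

def clamp(value, min_val, max_val):
    """Restrict value to the range [min_val, max_val].

    >>> clamp(15, 0, 10)
    10
    >>> clamp(-5, 0, 10)
    0
    """
    return max(min_val, min(value, max_val))

# Collect the examples from the docstring and run each one
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for doctest_case in finder.find(clamp, name="clamp", module=False, globs={"clamp": clamp}):
    runner.run(doctest_case)

results = runner.summarize(verbose=False)
print(f"{results.failed} failed out of {results.attempted} examples")
# 0 failed out of 2 examples
```

For a whole file, running `python -m doctest -v math_helpers.py` from the command line does the same job with less code.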

Example 2: Password Generator (Using Standard Library)

# password_generator.py
import math
import secrets
import string

def generate_password(
    length=16,
    use_uppercase=True,
    use_digits=True,
    use_symbols=True,
    exclude_chars="",
):
    """Generate a secure random password."""
    chars = string.ascii_lowercase

    if use_uppercase:
        chars += string.ascii_uppercase
    if use_digits:
        chars += string.digits
    if use_symbols:
        chars += string.punctuation

    # Remove excluded characters
    for ch in exclude_chars:
        chars = chars.replace(ch, "")

    if not chars:
        raise ValueError("No characters available for password generation")

    # secrets uses the OS's cryptographically secure generator;
    # random is predictable and should not be used for passwords
    password = "".join(secrets.choice(chars) for _ in range(length))
    return password

def password_strength(password):
    """Estimate password strength."""
    charset_size = 0
    if any(c in string.ascii_lowercase for c in password):
        charset_size += 26
    if any(c in string.ascii_uppercase for c in password):
        charset_size += 26
    if any(c in string.digits for c in password):
        charset_size += 10
    if any(c in string.punctuation for c in password):
        charset_size += 32

    if charset_size == 0:
        return "Empty", 0

    entropy = len(password) * math.log2(charset_size)

    if entropy < 28:
        strength = "Very Weak"
    elif entropy < 36:
        strength = "Weak"
    elif entropy < 60:
        strength = "Moderate"
    elif entropy < 80:
        strength = "Strong"
    else:
        strength = "Very Strong"

    return strength, round(entropy, 1)

if __name__ == "__main__":
    print("=== Password Generator ===\n")

    for length in [8, 12, 16, 24]:
        pw = generate_password(length=length)
        strength, entropy = password_strength(pw)
        print(f"Length {length:2d}: {pw}")
        print(f"          Strength: {strength} ({entropy} bits of entropy)\n")

    # Generate passwords without confusing characters
    pw = generate_password(length=16, exclude_chars="0OIl1|")
    print(f"Easy-to-read: {pw}")

Example 3: Date Calculator

# date_calculator.py
from datetime import datetime, date, timedelta
import calendar

def days_between(date1_str, date2_str, fmt="%Y-%m-%d"):
    """Calculate the number of days between two date strings."""
    d1 = datetime.strptime(date1_str, fmt)
    d2 = datetime.strptime(date2_str, fmt)
    diff = abs((d2 - d1).days)
    return diff

def days_until(target_str, fmt="%Y-%m-%d"):
    """Calculate days from today until a target date."""
    target = datetime.strptime(target_str, fmt).date()
    today = date.today()
    diff = (target - today).days
    return diff

def add_business_days(start_date, num_days):
    """Add business days (skipping weekends) to a date."""
    current = start_date
    added = 0
    while added < num_days:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 to Friday=4
            added += 1
    return current

def age_calculator(birth_date_str, fmt="%Y-%m-%d"):
    """Calculate age in years, months, and days."""
    birth = datetime.strptime(birth_date_str, fmt).date()
    today = date.today()

    years = today.year - birth.year
    months = today.month - birth.month
    days = today.day - birth.day

    if days < 0:
        months -= 1
        # Get days in the previous month
        prev_month = today.month - 1 if today.month > 1 else 12
        prev_year = today.year if today.month > 1 else today.year - 1
        days += calendar.monthrange(prev_year, prev_month)[1]

    if months < 0:
        years -= 1
        months += 12

    return years, months, days

if __name__ == "__main__":
    print("=== Date Calculator ===\n")

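    # Days between two dates (the end date itself is not counted)
    d = days_between("2026-01-01", "2026-12-31")
    print(f"Days from 2026-01-01 to 2026-12-31: {d}")  # 364

    # Days until the last day of 2026 (negative once that date has passed)
    until_ny = days_until("2026-12-31")
    print(f"Days until Dec 31, 2026: {until_ny}")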

    # Business days
    start = date.today()
    deadline = add_business_days(start, 10)
    print(f"10 business days from today: {deadline.strftime('%Y-%m-%d (%A)')}")

    # Age calculator
    years, months, days = age_calculator("1995-08-15")
    print(f"Age (born 1995-08-15): {years} years, {months} months, {days} days")
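
Because the demo above depends on today's date, its output changes from day to day. For a reproducible check, here is a sketch that repeats add_business_days and feeds it a fixed start date (Friday, 2 January 2026), so the weekend skip is easy to verify.

```python
from datetime import date, timedelta

def add_business_days(start_date, num_days):
    """Add business days (skipping weekends) to a date."""
    current = start_date
    added = 0
    while added < num_days:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 to Friday=4
            added += 1
    return current

friday = date(2026, 1, 2)
print(friday.weekday())              # 4 (Friday)

# Saturday and Sunday are skipped, so one business day later is Monday
print(add_business_days(friday, 1))  # 2026-01-05
print(add_business_days(friday, 3))  # 2026-01-07
```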

Example 4: File Organiser Script

# file_organiser.py
"""Organise files in a directory by extension."""

import os
import shutil
from pathlib import Path
from collections import defaultdict, Counter
from datetime import datetime

# Map extensions to folder names
EXTENSION_MAP = {
    # Images
    ".jpg": "Images", ".jpeg": "Images", ".png": "Images",
    ".gif": "Images", ".bmp": "Images", ".svg": "Images", ".webp": "Images",
    # Documents
    ".pdf": "Documents", ".doc": "Documents", ".docx": "Documents",
    ".txt": "Documents", ".rtf": "Documents", ".odt": "Documents",
    # Spreadsheets
    ".xls": "Spreadsheets", ".xlsx": "Spreadsheets", ".csv": "Spreadsheets",
    # Videos
    ".mp4": "Videos", ".avi": "Videos", ".mkv": "Videos", ".mov": "Videos",
    # Audio
    ".mp3": "Audio", ".wav": "Audio", ".flac": "Audio", ".aac": "Audio",
    # Archives
    ".zip": "Archives", ".tar": "Archives", ".gz": "Archives", ".rar": "Archives",
    # Code
    ".py": "Code", ".js": "Code", ".html": "Code", ".css": "Code",
    ".java": "Code", ".cpp": "Code", ".c": "Code", ".rs": "Code",
}

def organise_directory(source_dir, dry_run=True):
    """Organise files in a directory by their extension.

    Args:
        source_dir: Path to the directory to organise.
        dry_run: If True, only prints what would happen without moving files.

    Returns:
        Dictionary mapping category to list of moved files.
    """
    source = Path(source_dir)
    if not source.is_dir():
        print(f"Error: '{source_dir}' is not a valid directory")
        return {}

    moved = defaultdict(list)
    stats = Counter()

    for filepath in source.iterdir():
        # Skip directories and hidden files
        if filepath.is_dir() or filepath.name.startswith("."):
            continue

        ext = filepath.suffix.lower()
        category = EXTENSION_MAP.get(ext, "Other")

        dest_dir = source / category
        dest_file = dest_dir / filepath.name

        # Handle name conflicts
        if dest_file.exists():
            stem = filepath.stem
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            dest_file = dest_dir / f"{stem}_{timestamp}{ext}"

        if dry_run:
            print(f"  [DRY RUN] {filepath.name} -> {category}/")
        else:
            dest_dir.mkdir(exist_ok=True)
            shutil.move(str(filepath), str(dest_file))
            print(f"  Moved: {filepath.name} -> {category}/")

        moved[category].append(filepath.name)
        stats[category] += 1

    return dict(moved)

if __name__ == "__main__":
    import sys

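    # Treat the first argument that is not a --flag as the target directory
    paths = [arg for arg in sys.argv[1:] if not arg.startswith("--")]
    target = paths[0] if paths else "."
    dry = "--execute" not in sys.argv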

    print(f"Organising: {os.path.abspath(target)}")
    if dry:
        print("(Dry run — add --execute to actually move files)\n")
    else:
        print("(EXECUTING — files will be moved)\n")

    results = organise_directory(target, dry_run=dry)

    print(f"\nSummary:")
    total = 0
    for category, files in sorted(results.items()):
        print(f"  {category}: {len(files)} files")
        total += len(files)
    print(f"  Total: {total} files")

Best Practices

1. Organise Imports Properly

Follow the standard import ordering convention (also enforced by tools like isort):

# 1. Standard library imports
import os
import sys
from datetime import datetime
from pathlib import Path

# 2. Third-party imports
import pandas as pd
import requests
from flask import Flask, jsonify

# 3. Local / project imports
from myapp.models import User
from myapp.utils import format_date

Each group should be separated by a blank line, and imports within each group should be alphabetically sorted.

2. Use `__all__` to Control Exports

Define __all__ to explicitly declare what your module exports:

# mymodule.py

__all__ = ["public_function", "PublicClass"]

def public_function():
    """This will be exported."""
    pass

def _private_helper():
    """This will NOT be exported (leading underscore convention)."""
    pass

class PublicClass:
    """This will be exported."""
    pass

When someone does from mymodule import *, only names listed in __all__ are imported.

3. Avoid Circular Imports

Circular imports happen when two modules import each other:

# BAD: Circular import
# module_a.py
from module_b import function_b   # module_b imports module_a!

def function_a():
    return function_b()

# module_b.py
from module_a import function_a   # module_a imports module_b!

def function_b():
    return function_a()

Solutions:

Move the shared code into a third module
Use late imports (import inside the function)
Restructure your code

# Solution: late import
# module_a.py
def function_a():
    from module_b import function_b  # Import only when needed
    return function_b()
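The first solution, moving shared code into a third module, might look like the sketch below. To keep it runnable as one script, it writes three small files into a temporary directory and imports them; the file names (shared_demo, module_a_demo, module_b_demo) and the format_name helper are assumptions made for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents: both modules depend on shared_demo,
# and neither imports the other, so there is no import cycle.
files = {
    "shared_demo.py": (
        "def format_name(name):\n"
        "    return name.strip().title()\n"
    ),
    "module_a_demo.py": (
        "from shared_demo import format_name\n\n"
        "def function_a(name):\n"
        "    return 'A says hello to ' + format_name(name)\n"
    ),
    "module_b_demo.py": (
        "from shared_demo import format_name\n\n"
        "def function_b(name):\n"
        "    return 'B says hello to ' + format_name(name)\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for filename, source in files.items():
        Path(tmp, filename).write_text(source, encoding="utf-8")

    # Make the temporary directory importable, then load both modules.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    module_a = importlib.import_module("module_a_demo")
    module_b = importlib.import_module("module_b_demo")
    sys.path.remove(tmp)

result_a = module_a.function_a("  ada ")
result_b = module_b.function_b("grace")
print(result_a)  # A says hello to Ada
print(result_b)  # B says hello to Grace
```

Because the dependency arrows now point in one direction only (a and b both depend on shared), the order in which the modules are imported no longer matters.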

4. Prefer Absolute Imports

# GOOD: Absolute imports — clear and unambiguous
from myproject.utils.math_helpers import add
from myproject.models.user import User

# OK: Relative imports within a package (keep them short)
from .math_helpers import add      # Same package
from ..models.user import User     # Parent package

# AVOID: Deep relative imports
from ...core.base.mixins import LogMixin  # Hard to follow

Common Mistakes

1. Naming Files the Same as Standard Library Modules

This is the single most common beginner mistake:

# You create a file called "random.py"
# random.py
import random   # This imports YOUR file, not the standard library!

print(random.randint(1, 10))  # AttributeError!

Solution: Never name your files random.py, math.py, os.py, json.py, string.py, email.py, test.py, or any other standard library module name.
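If you suspect that one of your files is shadowing a standard library module, one way to check is to ask Python where it would load the module from. The helper below is a small sketch (module_origin is a name chosen for this example) built on importlib.util.find_spec.

```python
import importlib.util


def module_origin(name):
    """Return the location Python would load module `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


origin = module_origin("random")
print(origin)  # a path inside your Python installation, unless a local random.py shadows it
```

If the printed path points into your own project folder rather than the Python installation, rename the offending file.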

2. Circular Imports

As covered above, this happens when module A imports module B, and module B imports module A. The fix is to restructure your code or use late (inside-function) imports.

3. Forgetting `__init__.py`

While Python 3.3+ supports "namespace packages" without __init__.py, it is best practice to always include it:

# Without __init__.py, imports may behave unexpectedly
my_package/
├── module_a.py
└── module_b.py

# With __init__.py — explicit, clear, reliable
my_package/
├── __init__.py
├── module_a.py
└── module_b.py
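The difference can be observed directly. The sketch below creates one directory with __init__.py and one without (both package names are invented for the demonstration), imports each, and inspects the result. A directory without __init__.py still imports, but as a namespace package with no file of its own, which is why a missing __init__.py tends to fail quietly rather than with a clear error.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Regular package: has an __init__.py (hypothetical name).
    (root / "regular_pkg_demo").mkdir()
    (root / "regular_pkg_demo" / "__init__.py").write_text("", encoding="utf-8")

    # Namespace package: same layout, but no __init__.py (hypothetical name).
    (root / "namespace_pkg_demo").mkdir()
    (root / "namespace_pkg_demo" / "module_a.py").write_text("X = 1\n", encoding="utf-8")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    regular = importlib.import_module("regular_pkg_demo")
    namespace = importlib.import_module("namespace_pkg_demo")
    sys.path.remove(tmp)

regular_file = regular.__file__
namespace_file = getattr(namespace, "__file__", None)
print(regular_file)    # a path ending in __init__.py
print(namespace_file)  # None
```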

4. Import Side Effects

Avoid running significant code at module level. It executes the first time the module is imported in a process, whether or not the importer needs the result:

# BAD: Side effects on import
# config.py
import requests

# This HTTP request runs as soon as config is first imported, even if CONFIG is never used!
response = requests.get("https://api.example.com/config")
CONFIG = response.json()

# GOOD: Wrap side effects in functions
# config.py
import requests

_config_cache = None

def get_config():
    """Load config on first call, then cache it."""
    global _config_cache
    if _config_cache is None:
        response = requests.get("https://api.example.com/config")
        _config_cache = response.json()
    return _config_cache
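It helps to know exactly when module-level code runs. Python caches imported modules in sys.modules, so the module body runs on the first import in a process and is skipped on later imports; importlib.reload runs it again. The sketch below demonstrates this with a temporary module (side_effect_demo and its log file are made up for this example) that records each execution of its body.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module: its top-level code appends a line to a log file
# each time the module body is executed.
MODULE_SOURCE = (
    "from pathlib import Path\n"
    "_log = Path(__file__).with_name('import_log.txt')\n"
    "with _log.open('a', encoding='utf-8') as handle:\n"
    "    handle.write('module body executed\\n')\n"
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "side_effect_demo.py").write_text(MODULE_SOURCE, encoding="utf-8")
    log_file = Path(tmp, "import_log.txt")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()

    import side_effect_demo  # first import: the module body runs
    import side_effect_demo  # second import: served from the sys.modules cache
    runs_after_imports = len(log_file.read_text(encoding="utf-8").splitlines())

    importlib.reload(side_effect_demo)  # an explicit reload runs the body again
    runs_after_reload = len(log_file.read_text(encoding="utf-8").splitlines())

    sys.path.remove(tmp)

print(runs_after_imports)  # 1
print(runs_after_reload)   # 2
```

The cost of a module-level side effect is therefore paid once per process, by whichever import happens first, which can make slow start-up or network failures hard to trace.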

5. Using `from module import *` in Production Code

# BAD: Unclear where names come from, risk of collisions
from os import *
from sys import *
from json import *

# GOOD: Explicit imports
from os import path, getcwd, listdir
from sys import argv, exit
from json import loads, dumps
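A concrete collision from the BAD example above: the os module exports its own low-level open function, so a star import from os replaces the built-in open in the importing namespace. The sketch below runs the star import inside a separate dictionary so that it does not affect the rest of the script.

```python
import builtins
import os

# Run the star import in an isolated namespace.
namespace = {}
exec("from os import *", namespace)

# The name `open` now refers to os.open, not the built-in open.
shadowed = namespace["open"] is os.open
print(shadowed)                  # True
print(os.open is builtins.open)  # False
```

After such an import, a call like open("notes.txt") fails because os.open requires an additional flags argument, and it returns a file descriptor rather than a file object.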

Practice Exercises

Exercise 1: Module Creation

Create a module called temperature.py with functions:

celsius_to_fahrenheit(c) — convert Celsius to Fahrenheit
fahrenheit_to_celsius(f) — convert Fahrenheit to Celsius
celsius_to_kelvin(c) — convert Celsius to Kelvin
is_boiling(celsius) — return True if water boils at this temperature

Add a __name__ == "__main__" guard that tests each function.

# temperature.py
def celsius_to_fahrenheit(c):
    return (c * 9 / 5) + 32

def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9

def celsius_to_kelvin(c):
    return c + 273.15

def is_boiling(celsius):
    return celsius >= 100

if __name__ == "__main__":
    print(celsius_to_fahrenheit(100))   # 212.0
    print(fahrenheit_to_celsius(32))    # 0.0
    print(celsius_to_kelvin(0))         # 273.15
    print(is_boiling(100))              # True
    print(is_boiling(99))               # False

Exercise 2: Standard Library Exploration

Write a script that uses at least 5 standard library modules to:

Generate 10 random integers between 1 and 100 (random)
Calculate their mean and standard deviation (math)
Save the results with a timestamp to a JSON file (json, datetime)
Print the file size (os)

import random
import math
import json
from datetime import datetime
import os

# 1. Generate random numbers
numbers = [random.randint(1, 100) for _ in range(10)]

# 2. Calculate statistics
mean = sum(numbers) / len(numbers)
variance = sum((x - mean) ** 2 for x in numbers) / len(numbers)
std_dev = math.sqrt(variance)

# 3. Save to JSON with timestamp
result = {
    "timestamp": datetime.now().isoformat(),
    "numbers": numbers,
    "mean": round(mean, 2),
    "std_dev": round(std_dev, 2),
}
filename = "stats_output.json"
with open(filename, "w") as f:
    json.dump(result, f, indent=2)

# 4. Print file size
size = os.path.getsize(filename)
print(f"Numbers: {numbers}")
print(f"Mean: {mean:.2f}, Std Dev: {std_dev:.2f}")
print(f"Saved to {filename} ({size} bytes)")
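As a side note, the standard library's statistics module can replace the manual calculation in step 2. The sketch below uses a fixed list instead of random numbers so the results are reproducible. pstdev is the population standard deviation, which matches the solution above (it divides by len(numbers)); stdev is the sample version, which divides by n - 1.

```python
import statistics

# Fixed sample data so the results are reproducible.
numbers = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(numbers)
population_std = statistics.pstdev(numbers)  # divides by n
sample_std = statistics.stdev(numbers)       # divides by n - 1

print(mean)            # 5
print(population_std)  # 2.0
print(sample_std > population_std)  # True
```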

Exercise 3: Package Builder

Create a package called texttools with the following structure:

texttools/
├── __init__.py
├── analysis.py   (word_count, char_count, sentence_count)
├── transform.py  (reverse, to_snake_case, to_camel_case)
└── validate.py   (is_email, is_url, is_phone)

Write the __init__.py to export all functions, then write a main.py that uses them.

Exercise 4: Collections Challenge

Given a list of student records, use collections to:

Count how many students got each grade (use Counter)
Group students by grade (use defaultdict)
Find the top 3 most common grades (use Counter.most_common)

from collections import Counter, defaultdict

students = [
    ("Alice", "A"), ("Bob", "B"), ("Charlie", "A"),
    ("Diana", "C"), ("Eve", "A"), ("Frank", "B"),
    ("Grace", "A"), ("Hank", "B"), ("Ivy", "C"),
    ("Jack", "A"), ("Kate", "D"), ("Leo", "B"),
]

# 1. Count grades
grade_counts = Counter(grade for _, grade in students)
print(grade_counts)  # Counter({'A': 5, 'B': 4, 'C': 2, 'D': 1})

# 2. Group by grade
by_grade = defaultdict(list)
for name, grade in students:
    by_grade[grade].append(name)
for grade in sorted(by_grade):
    print(f"Grade {grade}: {by_grade[grade]}")

# 3. Top 3 grades
print(grade_counts.most_common(3))
# [('A', 5), ('B', 4), ('C', 2)]

Exercise 5: File Organiser Enhancement

Extend the file organiser example from this chapter:

Add a --log flag that writes all moves to a log file
Add support for organising by date (files modified this month, last month, older)
Add a summary that shows total size moved per category

Exercise 6: Build a CLI Tool

Use sys.argv (or the argparse standard library module) to build a command-line tool that:

Accepts a filename and an operation (--count-words, --count-lines, --find PATTERN)
Reads the file and performs the operation
Prints the result

# text_tool.py
import sys
import re

def count_words(text):
    return len(text.split())

def count_lines(text):
    return len(text.splitlines())

def find_pattern(text, pattern):
    matches = re.findall(pattern, text, re.IGNORECASE)
    return matches

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python text_tool.py <file> <--count-words|--count-lines|--find PATTERN>")
        sys.exit(1)

    filename = sys.argv[1]
    operation = sys.argv[2]

    with open(filename, "r") as f:
        content = f.read()

    if operation == "--count-words":
        print(f"Words: {count_words(content)}")
    elif operation == "--count-lines":
        print(f"Lines: {count_lines(content)}")
    elif operation == "--find":
        if len(sys.argv) < 4:
            print("Error: --find requires a pattern")
            sys.exit(1)
        pattern = sys.argv[3]
        matches = find_pattern(content, pattern)
        print(f"Found {len(matches)} matches for '{pattern}':")
        for m in matches:
            print(f"  {m}")
    else:
        print(f"Unknown operation: {operation}")
        sys.exit(1)
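The same tool can be sketched with argparse, which generates the usage message and validates the arguments for you. This is one possible layout, not the only one; build_parser and run are names chosen for this example, and the text is passed in directly so the sketch runs without a real file or command line.

```python
import argparse
import re


def build_parser():
    """Create the command-line parser: a filename plus exactly one operation."""
    parser = argparse.ArgumentParser(description="Simple text statistics")
    parser.add_argument("filename", help="file to read")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--count-words", action="store_true", help="count words")
    group.add_argument("--count-lines", action="store_true", help="count lines")
    group.add_argument("--find", metavar="PATTERN", help="find a regex pattern")
    return parser


def run(args, text):
    """Perform the selected operation on `text` and return a result string."""
    if args.count_words:
        return f"Words: {len(text.split())}"
    if args.count_lines:
        return f"Lines: {len(text.splitlines())}"
    matches = re.findall(args.find, text, re.IGNORECASE)
    return f"Found {len(matches)} matches for '{args.find}'"


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["notes.txt", "--find", "py"])
output = run(args, "Python and pypi\nsecond line")
print(output)  # Found 2 matches for 'py'
```

In a real script you would call build_parser().parse_args() with no arguments (so that it reads sys.argv), open args.filename, and pass the file's contents to run.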

Summary

In this final chapter, you learned how Python organises and distributes code:

Modules are .py files that group related functions, classes, and variables
Importing gives you access to code from other modules using import, from ... import, and aliases
The __name__ guard lets you write code that runs only when a file is executed directly
Packages are directories of modules with an __init__.py file, supporting nested sub-packages
The standard library provides a vast collection of ready-to-use modules — from math and random to json, collections, itertools, pathlib, and many more
pip installs third-party packages from PyPI, and virtual environments keep project dependencies isolated
Best practices include organising imports, using __all__, and avoiding circular imports

Congratulations!

You have completed the entire Python tutorial series. You now have a solid foundation covering:

Variables, data types, and operators
Control flow (if/elif/else, loops)
Data structures (lists, tuples, dictionaries, sets)
Functions and scope
String manipulation
File handling
Error handling and exceptions
Object-oriented programming (classes, inheritance)
List comprehensions and generators
Decorators and closures
Modules, packages, and the standard library

What to Build Next

The best way to solidify your knowledge is to build projects. Here are some ideas:

Project	Skills Practised
To-do list CLI app	File I/O, JSON, `argparse`, OOP
Web scraper	`requests`, `beautifulsoup4`, `csv`, file handling
Personal budget tracker	Classes, file I/O, `datetime`, `collections`
Quiz game	`random`, dictionaries, loops, file I/O
Weather app	`requests`, `json`, API interaction
URL shortener	`flask`/`fastapi`, `hashlib`, databases
Markdown to HTML converter	`re`, file I/O, `pathlib`
Chat bot	String processing, `random`, `json`

Advanced Topics to Explore

Once you are comfortable building projects, dive deeper into:

Asynchronous programming — asyncio, async/await
Testing — pytest, test-driven development (TDD)
Web development — Flask, FastAPI, Django
Data science — pandas, NumPy, matplotlib
Databases — SQLAlchemy, SQLite, PostgreSQL
APIs — Building and consuming REST APIs
Type checking — mypy for static analysis
Packaging — Publishing your own packages on PyPI
Design patterns — Singleton, Factory, Observer, Strategy
Concurrency — threading, multiprocessing, asyncio

Happy coding!