What Are Modules?
As your programs grow beyond a few dozen lines, keeping everything in a single file becomes unmanageable. Python solves this with modules — a way to split your code into separate, reusable files.
A module is simply any file with a .py extension. Every Python file you have ever written is already a module.
Why Modules Exist
Modules address several fundamental programming challenges:
| Benefit | Description |
|---|---|
| Code Organisation | Break large programs into logical, manageable files |
| Reusability | Write a function once, use it in many programs |
| Namespace Management | Each module has its own namespace, preventing name collisions |
| Collaboration | Team members can work on different modules simultaneously |
| Maintenance | Easier to find, fix, and update code in smaller files |
| Testing | Test individual modules in isolation |
How Python Sees Modules
When you write a file called greetings.py, Python treats it as a module named greetings (without the .py extension). Any functions, classes, variables, or constants you define inside that file become attributes of that module.
# greetings.py — this file IS a module named "greetings"
DEFAULT_GREETING = "Hello"

def say_hello(name):
    """Return a greeting string."""
    return f"{DEFAULT_GREETING}, {name}!"

def say_goodbye(name):
    """Return a farewell string."""
    return f"Goodbye, {name}. See you soon!"

class Greeter:
    """A class that manages personalised greetings."""

    def __init__(self, greeting="Hi"):
        self.greeting = greeting

    def greet(self, name):
        return f"{self.greeting}, {name}!"
Everything in this file — the constant DEFAULT_GREETING, the functions say_hello and say_goodbye, and the class Greeter — can now be imported and used by other Python files.
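To see that a module's definitions really are attributes of a module object, any importable module can be inspected. The following is a minimal sketch that uses the standard library's `string` module as a stand-in, on the assumption that `greetings.py` does not exist on disk here:

```python
import string
import types

# A module is an ordinary object of type ModuleType.
print(isinstance(string, types.ModuleType))   # True

# Its name is the file name without the .py extension.
print(string.__name__)                        # string

# Functions and constants defined in the file are attributes of the module.
print(callable(string.capwords))              # True
print(string.ascii_lowercase[:3])             # abc

# vars() exposes the module's namespace as a dictionary.
print("ascii_lowercase" in vars(string))      # True
```

The same checks would apply to `greetings` once it is importable, for example `callable(greetings.say_hello)`.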
Creating Your Own Modules
Creating a module is as simple as creating a .py file. There are no special declarations needed.
What Can Go in a Module
A module can contain any valid Python code:
# math_utils.py — A utility module for math operations
# ---- Constants ----
PI = 3.14159265358979
E = 2.71828182845905
GOLDEN_RATIO = 1.61803398874989
# ---- Variables ----
_calculation_count = 0 # leading underscore = "private by convention"
# ---- Functions ----
def add(a, b):
    """Return the sum of two numbers."""
    global _calculation_count
    _calculation_count += 1
    return a + b

def multiply(a, b):
    """Return the product of two numbers."""
    global _calculation_count
    _calculation_count += 1
    return a * b

def circle_area(radius):
    """Calculate the area of a circle."""
    return PI * radius ** 2

def factorial(n):
    """Return n! using recursion."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)

def get_calculation_count():
    """Return how many calculations have been performed."""
    return _calculation_count

# ---- Classes ----
class Vector:
    """A simple 2D vector class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def magnitude(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"
Module-Level Code Execution
Any code at the top level of a module runs when the module is first imported. This is important to understand:
# config.py
print("Loading config module...") # This runs on import!
DATABASE_URL = "postgresql://localhost/mydb"
DEBUG = True
def get_config():
    return {"db": DATABASE_URL, "debug": DEBUG}
print("Config module loaded!") # This also runs on import!
# main.py
import config # This triggers the print statements in config.py
# Output:
# Loading config module...
# Config module loaded!
print(config.DATABASE_URL) # postgresql://localhost/mydb
Python caches imported modules in sys.modules, so a module's top-level code runs only once per interpreter session, no matter how many files import it.
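The caching behaviour can be observed directly. The sketch below writes a throwaway module to a temporary directory (the name `cache_demo_mod` is hypothetical), imports it twice, and then forces a re-run with `importlib.reload`:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a throwaway module; the name "cache_demo_mod" is hypothetical.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "cache_demo_mod.py").write_text("TOKEN = object()\n")
sys.path.insert(0, tmp_dir)

import cache_demo_mod as first
import cache_demo_mod as second   # served from the cache, the file is not re-run

print(first is second)                          # True
print(sys.modules["cache_demo_mod"] is first)   # True

old_token = first.TOKEN
importlib.reload(first)           # explicitly re-executes the top-level code
print(first.TOKEN is old_token)                 # False
```

Because `TOKEN` is a fresh object each time the file executes, an unchanged `TOKEN` shows the second import did not re-run the file, while the new `TOKEN` after `reload` shows that it did.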
Importing Modules
Python provides several ways to import modules, each suited to different situations.
import module
The most straightforward approach — import the entire module:
import math_utils
result = math_utils.add(5, 3) # 8
area = math_utils.circle_area(10) # 314.159...
v = math_utils.Vector(3, 4)
print(v.magnitude()) # 5.0
print(math_utils.PI) # 3.14159265358979
Pros: Clear where each name comes from. No name collisions.
Cons: Verbose if you use many items from the module.
from module import item
Import specific items directly into your namespace:
from math_utils import add, multiply, PI
result = add(5, 3) # 8 — no prefix needed
product = multiply(4, 2) # 8
print(PI) # 3.14159265358979
Pros: Concise. Only import what you need.
Cons: Can cause name collisions if two modules export the same name.
import module as alias
Give a module a shorter name:
import math_utils as mu
import datetime as dt
result = mu.add(5, 3)
now = dt.datetime.now()
This is extremely common in the Python ecosystem. Many libraries have conventional aliases:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from module import item as alias
Rename specific imports:
from math_utils import circle_area as area
from math_utils import factorial as fact
print(area(5)) # 78.539...
print(fact(10)) # 3628800
This is useful when two modules export items with the same name:
from math_utils import add as math_add
from string_utils import add as string_add # hypothetical
math_add(5, 3) # numeric addition
string_add("hi", "!") # string concatenation
from module import * (Avoid This)
Import everything from a module:
from math_utils import *
# Now add, multiply, PI, Vector, etc. are all in your namespace
print(add(5, 3))
Why you should avoid this:
- Name collisions — You might overwrite existing names without realising
- Unclear origin — Hard to tell where a function came from when reading code
- Maintenance headache — Adding new names to the module can silently break your code
# Dangerous example
from math import *
from cmath import * # Overwrites sqrt, log, etc. from math!
# Which sqrt is this? The real or complex version?
print(sqrt(4)) # cmath.sqrt — returns (2+0j), not 2.0!
The one acceptable use is in interactive sessions (Python REPL) for quick exploration.
Import Search Path (sys.path)
When you write import something, Python searches for the module in a specific order:
- The directory containing the script being run (or the current directory in an interactive session)
- PYTHONPATH environment variable directories (if set)
- Standard library directories
- Site-packages (where pip installs third-party packages)
You can inspect and modify this search path:
import sys
# View the search path
for path in sys.path:
    print(path)
# Output (example):
# /home/user/my_project (current directory)
# /usr/lib/python3.12
# /usr/lib/python3.12/lib-dynload
# /home/user/.local/lib/python3.12/site-packages
# Add a custom directory to the search path
sys.path.append("/home/user/my_libraries")
# Now Python will also look in /home/user/my_libraries
import my_custom_module # Found in the appended path
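To check where a module would be loaded from without writing an import statement for it, `importlib.util.find_spec` can be used. A small sketch follows; the missing module name is made up for the illustration:

```python
import importlib.util

# find_spec uses the same search machinery as "import" and returns
# a description of where the module was found.
spec = importlib.util.find_spec("json")
print(spec.name)      # json
print(spec.origin)    # file path inside the standard library (varies by system)

# A top-level name that cannot be found yields None rather than raising.
missing = importlib.util.find_spec("no_such_module_for_this_demo")
print(missing)        # None
```

This is handy for diagnosing which copy of a module wins when several directories on sys.path contain a file with the same name.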
The if __name__ == "__main__" Pattern
This is one of the most important patterns in Python. Every Python developer must understand it.
What Is __name__?
Every module has a built-in attribute called __name__. Its value depends on how the module is being used:
- If the file is run directly (e.g., python my_script.py), __name__ is set to "__main__"
- If the file is imported by another file, __name__ is set to the module name (e.g., "my_script")
# demo.py
print(f"__name__ is: {__name__}")
# Run directly
$ python demo.py
__name__ is: __main__
# other.py
import demo
# Output: __name__ is: demo
The Guard Pattern
This lets you write code that only runs when the file is executed directly:
# calculator.py
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

def subtract(a, b):
    """Return the difference of two numbers."""
    return a - b

def multiply(a, b):
    """Return the product of two numbers."""
    return a * b

def divide(a, b):
    """Return the quotient. Raises ValueError if b is zero."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

# This block ONLY runs when you do: python calculator.py
# It does NOT run when another file does: import calculator
if __name__ == "__main__":
    # Test our functions
    print("Testing calculator functions:")
    print(f"add(10, 5) = {add(10, 5)}") # 15
    print(f"subtract(10, 5) = {subtract(10, 5)}") # 5
    print(f"multiply(10, 5) = {multiply(10, 5)}") # 50
    print(f"divide(10, 5) = {divide(10, 5)}") # 2.0
    # Test error handling
    try:
        divide(10, 0)
    except ValueError as e:
        print(f"Caught error: {e}") # Caught error: Cannot divide by zero
Now when another file imports calculator, only the functions are available — the test code does not execute:
# main.py
from calculator import add, divide
print(add(100, 200)) # 300
print(divide(100, 4)) # 25.0
# No test output appears!
Practical Patterns
Pattern 1: Module with a CLI interface
# word_counter.py
import sys
def count_words(text):
    """Count words in a string."""
    return len(text.split())

def count_lines(text):
    """Count lines in a string."""
    return len(text.splitlines())

def analyze_text(text):
    """Return a dictionary of text statistics."""
    return {
        "words": count_words(text),
        "lines": count_lines(text),
        "characters": len(text),
        "characters_no_spaces": len(text.replace(" ", "")),
    }

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python word_counter.py <filename>")
        sys.exit(1)
    filename = sys.argv[1]
    with open(filename, "r") as f:
        content = f.read()
    stats = analyze_text(content)
    for key, value in stats.items():
        print(f"{key}: {value}")
Pattern 2: Quick demo / documentation
# shapes.py
import math
def circle_area(radius):
    return math.pi * radius ** 2

def rectangle_area(width, height):
    return width * height

def triangle_area(base, height):
    return 0.5 * base * height

if __name__ == "__main__":
    # Serve as live documentation / examples
    print("=== Shape Area Calculator ===")
    print(f"Circle (r=5): {circle_area(5):.2f}")
    print(f"Rectangle (4x6): {rectangle_area(4, 6):.2f}")
    print(f"Triangle (b=3, h=8): {triangle_area(3, 8):.2f}")
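A common refinement of both patterns is to move the guarded code into a `main()` function that takes its arguments explicitly and returns an exit code, which makes the command-line behaviour testable from other code. The sketch below assumes a hypothetical script called `word_tool.py`; the guard is shown as a comment so the example can call `main()` directly:

```python
import sys


def count_words(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())


def main(argv):
    """Run the CLI with an explicit argument list and return an exit code."""
    if len(argv) != 1:
        print("Usage: python word_tool.py <text>", file=sys.stderr)
        return 1
    print(count_words(argv[0]))
    return 0


# In a real script (hypothetically named word_tool.py) the guard would be:
#
#     if __name__ == "__main__":
#         sys.exit(main(sys.argv[1:]))
#
# Here main() is called directly so the behaviour is easy to see and test.
exit_code = main(["the quick brown fox"])   # prints 4
print(exit_code)                            # 0
```

Keeping the guard down to a single call means everything else in the file can be imported and exercised without side effects.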
Packages
As your project grows, you need to organise modules into directories. A package is a directory that contains modules and a special __init__.py file.
Directory Structure
my_project/
├── main.py
└── utils/ # This is a package
    ├── __init__.py         # Makes "utils" a package
    ├── math_helpers.py     # A module in the package
    ├── string_helpers.py   # Another module
    └── file_helpers.py     # Another module
The __init__.py File
The __init__.py file serves several purposes:
- Marks the directory as a package (required in Python 3.2 and earlier, recommended in all versions)
- Runs when the package is imported — initialisation code goes here
- Controls the package's public API: defines what from package import * exports
# utils/__init__.py
# Import key items so users can access them directly from the package
from .math_helpers import add, multiply, circle_area
from .string_helpers import capitalize_words, slugify
from .file_helpers import read_file, write_file
# Define what "from utils import *" exports
__all__ = [
    "add", "multiply", "circle_area",
    "capitalize_words", "slugify",
    "read_file", "write_file",
]
# Package metadata
__version__ = "1.0.0"
__author__ = "Your Name"
Now users can import cleanly:
# Thanks to __init__.py, these all work:
from utils import add, capitalize_words
from utils import __version__
# Instead of the longer:
from utils.math_helpers import add
from utils.string_helpers import capitalize_words
An empty __init__.py is also perfectly valid — it simply marks the directory as a package without any extra setup.
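The effect of `__init__.py` can be tried out without setting up a project by building a tiny package in a temporary directory. In this sketch the names `demo_pkg`, `shapes` and `square_area` are hypothetical and exist only for the illustration:

```python
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "shapes.py").write_text(
    "def square_area(side):\n"
    "    return side * side\n"
)
(pkg / "__init__.py").write_text(
    "from .shapes import square_area\n"
    "__all__ = ['square_area']\n"
    "__version__ = '0.1.0'\n"
)
sys.path.insert(0, str(root))

import demo_pkg

print(demo_pkg.square_area(3))   # 9, re-exported by __init__.py
print(demo_pkg.__version__)      # 0.1.0
print(demo_pkg.__all__)          # ['square_area']
```

Importing `demo_pkg` runs its `__init__.py`, which in turn pulls `square_area` up to package level, so callers never need to know it lives in `shapes.py`.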
Nested Packages (Sub-packages)
Packages can contain other packages:
my_project/
├── main.py
└── mylib/
    ├── __init__.py
    ├── core/
    │   ├── __init__.py
    │   ├── engine.py
    │   └── config.py
    ├── utils/
    │   ├── __init__.py
    │   ├── math_helpers.py
    │   └── string_helpers.py
    └── io/
        ├── __init__.py
        ├── readers.py
        └── writers.py
# Importing from nested packages
from mylib.core.engine import Engine
from mylib.utils.math_helpers import add
from mylib.io.readers import read_csv
Relative Imports
Inside a package, you can use relative imports to refer to sibling modules or parent packages:
# mylib/utils/string_helpers.py
# Relative import from the same package (utils/)
from .math_helpers import add # same directory
from . import file_helpers # same directory
# Relative import from parent package (mylib/)
from ..core.config import DATABASE_URL # up one level, then into core/
from ..core import engine # up one level, then into core/
Relative import syntax:
- . means "current package"
- .. means "parent package"
- ... means "grandparent package"
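Important: Relative imports only work inside packages. They do not work in a file run directly as a script (python string_helpers.py). To run a module that uses them, execute it through its package instead, for example python -m mylib.utils.string_helpers from the directory that contains mylib/.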
The Python Standard Library
Python's motto is "batteries included" — it ships with a massive standard library covering everything from math to networking to file compression. No installation needed.
Overview by Category
| Category | Key Modules |
|---|---|
| Math & Numbers | math, decimal, fractions, statistics |
| Data Structures | collections, heapq, bisect, array |
| Text Processing | string, re, textwrap, difflib |
| Date & Time | datetime, time, calendar |
| File & I/O | os, pathlib, shutil, glob, tempfile |
| Data Formats | json, csv, xml, configparser |
| Functional | itertools, functools, operator |
| System | sys, os, platform, subprocess |
| Concurrency | threading, multiprocessing, asyncio |
| Networking | urllib, http, socket, email |
| Type System | typing, dataclasses, abc |
| Debugging | logging, pdb, traceback, warnings |
| Testing | unittest, doctest |
| Compression | zipfile, gzip, tarfile |
| Cryptography | hashlib, secrets, hmac |
| Copying | copy |
Let us now explore the most important modules in depth.
math — Mathematical Functions
The math module provides access to mathematical functions defined by the C standard.
import math
Constants:
print(math.pi) # 3.141592653589793
print(math.e) # 2.718281828459045
print(math.tau) # 6.283185307179586 (2 * pi)
print(math.inf) # inf (positive infinity)
print(math.nan) # nan (not a number)
Rounding and Absolute Value:
print(math.ceil(4.2)) # 5 — round up
print(math.ceil(-4.2)) # -4
print(math.floor(4.8)) # 4 — round down
print(math.floor(-4.8)) # -5
print(math.trunc(4.8)) # 4 — truncate toward zero
print(math.trunc(-4.8)) # -4
print(math.fabs(-5.5)) # 5.5 — absolute value (always float)
Powers, Roots, and Logarithms:
print(math.sqrt(16)) # 4.0
print(math.sqrt(2)) # 1.4142135623730951
print(math.pow(2, 10)) # 1024.0 (always returns float)
print(math.log(100, 10)) # 2.0 (log base 10 of 100)
print(math.log(math.e)) # 1.0 (natural log)
print(math.log2(1024)) # 10.0
print(math.log10(1000)) # 3.0
print(math.isqrt(10)) # 3 (integer square root)
Factorials and Combinatorics:
print(math.factorial(5)) # 120 (5! = 5 * 4 * 3 * 2 * 1)
print(math.factorial(10)) # 3628800
print(math.comb(10, 3)) # 120 (10 choose 3)
print(math.perm(10, 3)) # 720 (permutations of 3 from 10)
print(math.gcd(48, 18)) # 6 (greatest common divisor)
print(math.lcm(12, 18)) # 36 (least common multiple)
Trigonometric Functions (angles in radians):
# Convert degrees to radians and back
print(math.radians(180)) # 3.141592653589793
print(math.degrees(math.pi)) # 180.0
# Trigonometric functions
print(math.sin(math.pi / 2)) # 1.0
print(math.cos(0)) # 1.0
print(math.tan(math.pi / 4)) # 0.9999999999999999 (approx 1)
# Inverse trigonometric functions
print(math.asin(1)) # 1.5707963... (pi/2)
print(math.acos(0)) # 1.5707963... (pi/2)
print(math.atan(1)) # 0.7853981... (pi/4)
print(math.atan2(1, 1)) # 0.7853981... (pi/4) — handles quadrants
Special Value Checks:
print(math.isnan(float("nan"))) # True
print(math.isnan(42)) # False
print(math.isinf(float("inf"))) # True
print(math.isinf(42)) # False
print(math.isfinite(42)) # True
print(math.isfinite(float("inf"))) # False
print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9)) # True
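One detail of `math.isclose` is easy to miss: the relative tolerance scales with the size of the inputs, so comparisons against zero need an absolute tolerance as well. A short sketch, with tolerance values chosen arbitrarily for the example:

```python
import math

# rel_tol scales with the magnitude of the inputs, so it cannot help near zero.
print(math.isclose(1e-12, 0.0))                  # False
print(math.isclose(1e-12, 0.0, abs_tol=1e-9))    # True

# Away from zero the default relative tolerance (1e-09) is usually enough.
print(math.isclose(1000.0, 1000.0000001))        # True
```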
random — Random Number Generation
The random module generates pseudo-random numbers for various distributions.
import random
Basic Random Numbers:
# Random float in the half-open range [0.0, 1.0): 0.0 is possible, 1.0 is not
print(random.random()) # e.g., 0.7431448254356782
# Random float in a range
print(random.uniform(1.0, 10.0)) # e.g., 6.234...
# Random integer in a range (inclusive on both ends)
print(random.randint(1, 100)) # e.g., 42
# Random integer in a range (exclusive upper bound)
print(random.randrange(0, 100, 5)) # Random multiple of 5: 0, 5, 10, ... 95
Choosing from Sequences:
colors = ["red", "green", "blue", "yellow", "purple"]
# Pick one random item
print(random.choice(colors)) # e.g., "blue"
# Pick multiple WITH replacement (items can repeat)
print(random.choices(colors, k=3)) # e.g., ["red", "red", "blue"]
# Pick multiple WITHOUT replacement (no repeats)
print(random.sample(colors, k=3)) # e.g., ["green", "purple", "red"]
# Weighted random choices
fruits = ["apple", "banana", "cherry"]
weights = [50, 30, 20] # apple is most likely
picks = random.choices(fruits, weights=weights, k=10)
print(picks) # Mostly apples, some bananas, few cherries
Shuffling:
deck = list(range(1, 53)) # A deck of 52 cards
random.shuffle(deck) # Shuffle in place
print(deck[:5]) # e.g., [34, 7, 51, 22, 3]
Reproducible Results with Seeds:
random.seed(42)
print(random.randint(1, 100)) # Always 82 with seed 42
print(random.randint(1, 100)) # Always 15
random.seed(42) # Reset the seed
print(random.randint(1, 100)) # 82 again — same sequence!
print(random.randint(1, 100)) # 15 again
Statistical Distributions:
# Gaussian (normal) distribution — mean=0, std_dev=1
print(random.gauss(0, 1)) # e.g., -0.234...
# Generate 1000 samples to see the distribution
samples = [random.gauss(100, 15) for _ in range(1000)]
avg = sum(samples) / len(samples)
print(f"Average: {avg:.1f}") # Close to 100
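Calling `random.seed` changes the shared module-level generator, which can surprise other code that relies on it. A possible alternative is a private `random.Random` instance with its own seed; the sketch below uses an arbitrary seed value of 2024:

```python
import random

# Two independent generators seeded identically produce identical streams.
rng_a = random.Random(2024)
rng_b = random.Random(2024)

rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]
print(rolls_a == rolls_b)   # True

# Drawing from the module-level generator does not disturb rng_a or rng_b.
random.random()
print(rng_a.random() == rng_b.random())   # True
```

Note that `random` is not designed for security-sensitive values; for passwords or tokens the standard library's `secrets` module is the appropriate tool.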
datetime — Dates and Times
The datetime module supplies classes for manipulating dates and times.
from datetime import datetime, date, time, timedelta
Getting the Current Date and Time:
now = datetime.now()
print(now) # 2026-03-24 14:30:45.123456
print(now.year) # 2026
print(now.month) # 3
print(now.day) # 24
print(now.hour) # 14
print(now.minute) # 30
print(now.second) # 45
print(now.weekday()) # 1 (0=Monday, 6=Sunday)
today = date.today()
print(today) # 2026-03-24
Creating Specific Dates and Times:
# Create a date
birthday = date(1995, 8, 15)
print(birthday) # 1995-08-15
# Create a time
alarm = time(7, 30, 0)
print(alarm) # 07:30:00
# Create a full datetime
event = datetime(2026, 12, 31, 23, 59, 59)
print(event) # 2026-12-31 23:59:59
Formatting Dates (strftime):
now = datetime.now()
print(now.strftime("%Y-%m-%d")) # 2026-03-24
print(now.strftime("%d/%m/%Y")) # 24/03/2026
print(now.strftime("%B %d, %Y")) # March 24, 2026
print(now.strftime("%I:%M %p")) # 02:30 PM
print(now.strftime("%A, %B %d, %Y")) # Tuesday, March 24, 2026
print(now.strftime("%d %b %Y %H:%M")) # 24 Mar 2026 14:30
| Code | Meaning | Example |
|---|---|---|
| %Y | 4-digit year | 2026 |
| %m | Month (zero-padded) | 03 |
| %d | Day (zero-padded) | 24 |
| %H | Hour (24-hour) | 14 |
| %I | Hour (12-hour) | 02 |
| %M | Minute | 30 |
| %S | Second | 45 |
| %p | AM/PM | PM |
| %A | Full weekday | Tuesday |
| %a | Short weekday | Tue |
| %B | Full month | March |
| %b | Short month | Mar |
Parsing Date Strings (strptime):
date_str = "24-03-2026"
parsed = datetime.strptime(date_str, "%d-%m-%Y")
print(parsed) # 2026-03-24 00:00:00
date_str2 = "March 24, 2026 02:30 PM"
parsed2 = datetime.strptime(date_str2, "%B %d, %Y %I:%M %p")
print(parsed2) # 2026-03-24 14:30:00
Date Arithmetic with timedelta:
now = datetime.now()
# Add or subtract time
tomorrow = now + timedelta(days=1)
next_week = now + timedelta(weeks=1)
two_hours_later = now + timedelta(hours=2)
last_month_approx = now - timedelta(days=30)
print(f"Tomorrow: {tomorrow.strftime('%Y-%m-%d')}")
print(f"Next week: {next_week.strftime('%Y-%m-%d')}")
# Difference between dates
new_years_eve = datetime(2026, 12, 31)
diff = new_years_eve - now
print(f"{diff.days} days until New Year's Eve")
print(f"That's about {diff.days // 7} weeks")
Timezone Basics:
from datetime import timezone
# UTC time
utc_now = datetime.now(timezone.utc)
print(utc_now) # 2026-03-24 09:30:45.123456+00:00
# Create a timezone offset
ist = timezone(timedelta(hours=5, minutes=30)) # India Standard Time
ist_now = datetime.now(ist)
print(ist_now) # 2026-03-24 15:00:45.123456+05:30
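Timezone-aware datetimes can be converted between offsets with `astimezone`, and they survive a round trip through ISO 8601 text. The following sketch uses a fixed, arbitrarily chosen moment so the output is deterministic:

```python
from datetime import datetime, timedelta, timezone

ist = timezone(timedelta(hours=5, minutes=30), name="IST")

# A fixed, timezone-aware moment in UTC (chosen arbitrarily for the example).
utc_moment = datetime(2026, 3, 24, 9, 30, tzinfo=timezone.utc)

# astimezone converts the same instant to another offset.
ist_moment = utc_moment.astimezone(ist)
print(ist_moment)                  # 2026-03-24 15:00:00+05:30
print(ist_moment == utc_moment)    # True, both describe the same instant

# ISO 8601 strings round-trip with the offset included.
text = ist_moment.isoformat()
print(text)                                         # 2026-03-24T15:00:00+05:30
print(datetime.fromisoformat(text) == utc_moment)   # True
```

Aware datetimes compare by the instant they represent, which is why the two values are equal even though their wall-clock times differ.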
os — Operating System Interface
The os module provides functions for interacting with the operating system.
import os
Working Directory:
# Get current working directory
print(os.getcwd()) # /home/user/my_project
# Change directory (use sparingly — prefer absolute paths)
os.chdir("/tmp")
print(os.getcwd()) # /tmp
Listing and Creating Directories:
# List files and folders in a directory
entries = os.listdir(".")
print(entries) # ['file1.py', 'folder1', 'file2.txt']
# List a specific directory
entries = os.listdir("/home/user/documents")
# Create a single directory
os.mkdir("new_folder")
# Create nested directories (like mkdir -p)
os.makedirs("output/reports/2026", exist_ok=True)
# exist_ok=True prevents error if directory already exists
File Operations:
# Rename a file or directory
os.rename("old_name.txt", "new_name.txt")
# Remove a file
os.remove("unwanted_file.txt")
# Remove an empty directory
os.rmdir("empty_folder")
# Remove nested empty directories
os.removedirs("output/reports/2026")
Path Operations (os.path):
# Join path components (handles OS separators automatically)
path = os.path.join("home", "user", "documents", "file.txt")
print(path) # home/user/documents/file.txt (on Unix)
# Check if path exists
print(os.path.exists("myfile.txt")) # True or False
# Check if it's a file or directory
print(os.path.isfile("myfile.txt")) # True
print(os.path.isdir("my_folder")) # True
# Get file name and directory from a path
print(os.path.basename("/home/user/doc.txt")) # doc.txt
print(os.path.dirname("/home/user/doc.txt")) # /home/user
# Split into directory and filename
print(os.path.split("/home/user/doc.txt")) # ('/home/user', 'doc.txt')
# Get file extension
print(os.path.splitext("report.pdf")) # ('report', '.pdf')
# Get file size in bytes
print(os.path.getsize("myfile.txt")) # 1024
Environment Variables:
# Get an environment variable
home = os.environ.get("HOME")
print(home) # /home/user
# Get with a default value
db_url = os.environ.get("DATABASE_URL", "sqlite:///default.db")
# Set an environment variable (for current process only)
os.environ["MY_APP_MODE"] = "development"
Walking a Directory Tree:
# os.walk traverses all files and subdirectories
for dirpath, dirnames, filenames in os.walk("/home/user/project"):
    print(f"Directory: {dirpath}")
    for filename in filenames:
        full_path = os.path.join(dirpath, filename)
        print(f" File: {full_path}")
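Many of the os and os.path operations above have object-oriented counterparts in `pathlib`, which is listed in the overview table. This sketch works inside a temporary directory so nothing existing on disk is touched; the file and folder names are made up:

```python
import tempfile
from pathlib import Path

# Work inside a temporary directory so nothing on disk is disturbed.
base = Path(tempfile.mkdtemp())

# The / operator joins path components, like os.path.join.
report = base / "output" / "reports" / "summary.txt"
report.parent.mkdir(parents=True, exist_ok=True)   # like os.makedirs
report.write_text("hello\n")

print(report.name)       # summary.txt   (os.path.basename)
print(report.suffix)     # .txt          (os.path.splitext)
print(report.exists())   # True          (os.path.exists)
print(report.is_file())  # True          (os.path.isfile)

# rglob searches the tree recursively, similar in spirit to os.walk.
found = sorted(p.name for p in base.rglob("*.txt"))
print(found)             # ['summary.txt']
```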
sys — System-Specific Parameters
The sys module provides access to system-specific parameters and functions.
import sys
Command-Line Arguments:
# script.py
# Run: python script.py hello world 42
print(sys.argv)
# ['script.py', 'hello', 'world', '42']
print(sys.argv[0]) # 'script.py' — the script name
print(sys.argv[1]) # 'hello' — first argument
print(len(sys.argv)) # 4 — total count including script name
Python Version and Platform:
print(sys.version)
# 3.12.0 (main, Oct 2 2024, 00:00:00) [GCC 12.2.0]
print(sys.version_info)
# sys.version_info(major=3, minor=12, micro=0, ...)
print(sys.platform) # 'linux', 'darwin' (macOS), or 'win32'
print(sys.executable) # /usr/bin/python3
Module Search Path:
# View where Python looks for modules
for p in sys.path:
    print(p)
# Add a custom directory
sys.path.insert(0, "/my/custom/modules")
Standard Streams:
# Write to stdout
sys.stdout.write("Hello from stdout\n")
# Write to stderr (for error messages)
sys.stderr.write("This is an error message\n")
# Read from stdin
# line = sys.stdin.readline()
Memory and Exit:
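# Size of an object in bytes (figures are from 64-bit CPython and differ between versions)
print(sys.getsizeof(42)) # 28
print(sys.getsizeof("hello")) # 54
print(sys.getsizeof([1, 2, 3])) # 88
print(sys.getsizeof({})) # 64
# Largest value a container length or index can take (Python integers themselves have no upper limit)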
print(sys.maxsize) # 9223372036854775807 (on 64-bit)
print(sys.getrecursionlimit()) # 1000 (default)
# Exit the program
# sys.exit(0) # Exit with success code
# sys.exit(1) # Exit with error code
# sys.exit("Something went wrong") # Exit with error message
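`sys.exit` does not stop the interpreter abruptly: it raises a `SystemExit` exception, so `finally` blocks still run and the exit can be intercepted. A minimal sketch, where the helper name `run` is made up for the example:

```python
import sys


def run(exit_arg):
    """Call sys.exit and report the code carried by the SystemExit it raises."""
    try:
        sys.exit(exit_arg)
    except SystemExit as exc:
        return exc.code


print(run(0))                        # 0
print(run(1))                        # 1
print(run("Something went wrong"))   # Something went wrong
```

Catching `SystemExit` like this is mostly useful in tests; application code normally lets it propagate.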
collections — Specialised Container Types
The collections module provides alternatives to Python's built-in containers.
from collections import Counter, defaultdict, OrderedDict, namedtuple, deque, ChainMap
Counter — Count Occurrences:
from collections import Counter
# Count items in a list
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple", "date"]
count = Counter(fruits)
print(count)
# Counter({'apple': 3, 'banana': 2, 'cherry': 1, 'date': 1})
# Count characters in a string
char_count = Counter("mississippi")
print(char_count)
# Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})
# Most common items
print(count.most_common(2)) # [('apple', 3), ('banana', 2)]
# Arithmetic with Counters
inventory = Counter(apples=5, bananas=3)
sold = Counter(apples=2, bananas=1)
remaining = inventory - sold
print(remaining) # Counter({'apples': 3, 'bananas': 2})
# Total count (Counter.total() requires Python 3.10 or later)
print(count.total()) # 7
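Beyond subtraction, counters support addition, intersection and union, and they treat missing keys as zero. A short sketch with made-up drink counts:

```python
from collections import Counter

morning = Counter(tea=3, coffee=5)
evening = Counter(tea=2, coffee=1, juice=4)

print(morning + evening)   # Counter({'coffee': 6, 'tea': 5, 'juice': 4})
print(morning & evening)   # Counter({'tea': 2, 'coffee': 1})  (minimum of each)
print(morning | evening)   # Counter({'coffee': 5, 'juice': 4, 'tea': 3})  (maximum of each)

# Missing keys count as zero instead of raising KeyError.
print(morning["water"])    # 0

# Subtraction with "-" drops results that are zero or negative.
print(evening - morning)   # Counter({'juice': 4})
```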
defaultdict — Dict with Default Values:
from collections import defaultdict
# List as default — great for grouping
students_by_grade = defaultdict(list)
students_by_grade["A"].append("Alice")
students_by_grade["B"].append("Bob")
students_by_grade["A"].append("Arjun")
students_by_grade["C"].append("Charlie")
print(dict(students_by_grade))
# {'A': ['Alice', 'Arjun'], 'B': ['Bob'], 'C': ['Charlie']}
# Int as default — great for counting
word_count = defaultdict(int)
for word in ["the", "cat", "sat", "on", "the", "mat"]:
    word_count[word] += 1
print(dict(word_count))
# {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}
# Set as default — great for unique collections
tags = defaultdict(set)
tags["python"].add("programming")
tags["python"].add("scripting")
tags["python"].add("programming") # Duplicate ignored
print(dict(tags))
# {'python': {'programming', 'scripting'}}
OrderedDict — Dictionary that Remembers Insertion Order:
from collections import OrderedDict
# In Python 3.7+, regular dicts maintain insertion order.
# OrderedDict is still useful for its extra methods.
od = OrderedDict()
od["first"] = 1
od["second"] = 2
od["third"] = 3
# Move an item to the end
od.move_to_end("first")
print(list(od.keys())) # ['second', 'third', 'first']
# Move an item to the beginning
od.move_to_end("third", last=False)
print(list(od.keys())) # ['third', 'second', 'first']
# Pop the last item
print(od.popitem()) # ('first', 1)
# Pop the first item
print(od.popitem(last=False)) # ('third', 3)
namedtuple — Lightweight Class:
from collections import namedtuple
# Define a named tuple
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y) # 3 4
print(p[0], p[1]) # 3 4 — also supports indexing
print(p) # Point(x=3, y=4)
# Named tuples with more fields work the same way (like plain tuples, they are hashable and can be dictionary keys)
Student = namedtuple("Student", ["name", "grade", "age"])
s = Student("Alice", "A", 20)
print(f"{s.name} got grade {s.grade}") # Alice got grade A
# Convert to dictionary
print(s._asdict()) # {'name': 'Alice', 'grade': 'A', 'age': 20}
# Create a modified copy
s2 = s._replace(grade="A+")
print(s2) # Student(name='Alice', grade='A+', age=20)
deque — Double-Ended Queue:
from collections import deque
# Create a deque
dq = deque([1, 2, 3, 4, 5])
# Add to both ends (O(1) — much faster than list for this)
dq.append(6) # Add to right: [1, 2, 3, 4, 5, 6]
dq.appendleft(0) # Add to left: [0, 1, 2, 3, 4, 5, 6]
# Remove from both ends
dq.pop() # Remove from right: returns 6
dq.popleft() # Remove from left: returns 0
print(dq) # deque([1, 2, 3, 4, 5])
# Rotate elements
dq.rotate(2) # Rotate right by 2
print(dq) # deque([4, 5, 1, 2, 3])
dq.rotate(-2) # Rotate left by 2
print(dq) # deque([1, 2, 3, 4, 5])
# Fixed-size deque (oldest items are dropped)
recent = deque(maxlen=3)
recent.append("a")
recent.append("b")
recent.append("c")
recent.append("d") # "a" is dropped
print(recent) # deque(['b', 'c', 'd'], maxlen=3)
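A fixed-size deque is a natural fit for sliding-window calculations. The sketch below computes a moving average; the function name `moving_average` is an illustration, not a standard library API:

```python
from collections import deque


def moving_average(values, window):
    """Yield the mean of the last `window` values once enough have arrived."""
    recent = deque(maxlen=window)   # old items fall off the left automatically
    for value in values:
        recent.append(value)
        if len(recent) == window:
            yield sum(recent) / window


print(list(moving_average([10, 20, 30, 40, 50], window=3)))
# [20.0, 30.0, 40.0]
```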
ChainMap — Merge Multiple Dictionaries:
from collections import ChainMap
defaults = {"color": "blue", "size": "medium", "font": "Arial"}
user_prefs = {"color": "green", "font": "Helvetica"}
cli_args = {"color": "red"}
# ChainMap searches in order: cli_args -> user_prefs -> defaults
config = ChainMap(cli_args, user_prefs, defaults)
print(config["color"]) # red (from cli_args)
print(config["font"]) # Helvetica (from user_prefs)
print(config["size"]) # medium (from defaults)
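Lookups search every layer, but writes only ever touch the first mapping, and the chain stays a live view of its layers. A brief sketch of both behaviours, with illustrative dictionary names:

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "medium"}
overrides = {}

settings = ChainMap(overrides, defaults)

# Writes and deletions affect only the first mapping in the chain.
settings["color"] = "red"
print(overrides)           # {'color': 'red'}
print(defaults["color"])   # blue (unchanged)

# The chain is a live view: later changes to a layer are visible immediately.
defaults["font"] = "Arial"
print(settings["font"])    # Arial

# new_child() pushes a fresh layer on top without touching the others.
scoped = settings.new_child({"size": "large"})
print(scoped["size"], settings["size"])   # large medium
```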
itertools — Efficient Iteration Tools
The itertools module provides fast, memory-efficient tools for working with iterators.
import itertools
Combining Iterables:
from itertools import chain
# Flatten multiple lists into one
combined = list(chain([1, 2], [3, 4], [5, 6]))
print(combined) # [1, 2, 3, 4, 5, 6]
# Flatten a list of lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(chain.from_iterable(nested))
print(flat) # [1, 2, 3, 4, 5, 6]
Combinatorics:
from itertools import product, combinations, permutations
# Cartesian product (all pairs)
pairs = list(product("AB", [1, 2]))
print(pairs) # [('A', 1), ('A', 2), ('B', 1), ('B', 2)]
# Product with repeat
dice_rolls = list(product(range(1, 7), repeat=2))
print(f"Two dice: {len(dice_rolls)} combinations") # 36
# Combinations (order doesn't matter, no replacement)
combos = list(combinations("ABCD", 2))
print(combos)
# [('A','B'), ('A','C'), ('A','D'), ('B','C'), ('B','D'), ('C','D')]
# Permutations (order matters)
perms = list(permutations("ABC", 2))
print(perms)
# [('A','B'), ('A','C'), ('B','A'), ('B','C'), ('C','A'), ('C','B')]
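The sizes of these results follow the usual counting formulas, which can be checked against `math.comb` and `math.perm` (available from Python 3.8). A quick sanity-check sketch:

```python
import math
from itertools import combinations, permutations, product

items = "ABCD"

# The number of r-combinations of n items is "n choose r".
print(len(list(combinations(items, 2))) == math.comb(4, 2))   # True (6)

# The number of r-permutations is n! / (n - r)!.
print(len(list(permutations(items, 2))) == math.perm(4, 2))   # True (12)

# A product with repeat=r has n ** r elements.
print(len(list(product(items, repeat=2))) == 4 ** 2)          # True (16)
```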
Infinite Iterators:
from itertools import count, cycle, repeat
# count: infinite counter
for i in count(start=10, step=3):
    if i > 25:
        break
    print(i, end=" ") # 10 13 16 19 22 25
# cycle: repeat an iterable forever
colors = cycle(["red", "green", "blue"])
for _, color in zip(range(7), colors):
    print(color, end=" ")
# red green blue red green blue red
# repeat: repeat a value
zeros = list(repeat(0, 5))
print(zeros) # [0, 0, 0, 0, 0]
Slicing and Grouping:
from itertools import islice, groupby, accumulate
# islice: slice an iterator without converting to a list
squares = (x**2 for x in range(100))
first_five = list(islice(squares, 5))
print(first_five) # [0, 1, 4, 9, 16]
# Skip first 3, take next 4
items = list(islice(range(100), 3, 7))
print(items) # [3, 4, 5, 6]
# groupby: group consecutive items (data must be sorted by key)
data = [
    ("fruit", "apple"), ("fruit", "banana"),
    ("veggie", "carrot"), ("veggie", "pea"),
    ("fruit", "cherry"),
]
data.sort(key=lambda x: x[0]) # Must sort first!
for key, group in groupby(data, key=lambda x: x[0]):
    items = [item[1] for item in group]
    print(f"{key}: {items}")
# fruit: ['apple', 'banana', 'cherry']
# veggie: ['carrot', 'pea']
# accumulate: running totals
nums = [1, 2, 3, 4, 5]
running_sum = list(accumulate(nums))
print(running_sum) # [1, 3, 6, 10, 15]
# Running maximum (the built-in max works as the accumulating function)
running_max = list(accumulate([3, 1, 4, 1, 5, 9], func=max))
print(running_max) # [3, 3, 4, 4, 5, 9]
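The need to sort before `groupby` is easiest to appreciate by seeing what happens without it: only consecutive equal keys are merged, so the same key can show up more than once. A small sketch with made-up data:

```python
from itertools import groupby

kinds = ["fruit", "fruit", "veggie", "fruit"]

# Unsorted input: "fruit" appears as two separate groups.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(kinds)]
print(unsorted_groups)   # [('fruit', 2), ('veggie', 1), ('fruit', 1)]

# Sorted input: one group per distinct key.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(kinds))]
print(sorted_groups)     # [('fruit', 3), ('veggie', 1)]
```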
functools — Higher-Order Functions
The functools module provides tools for working with functions and callable objects.
from functools import reduce, lru_cache, partial, wraps, total_ordering
reduce — Accumulate a Sequence to a Single Value:
from functools import reduce
# Sum of numbers (same as built-in sum)
total = reduce(lambda a, b: a + b, [1, 2, 3, 4, 5])
print(total) # 15
# Product of numbers
product = reduce(lambda a, b: a * b, [1, 2, 3, 4, 5])
print(product) # 120
# Find maximum (same as built-in max)
largest = reduce(lambda a, b: a if a > b else b, [3, 1, 4, 1, 5, 9])
print(largest) # 9
# Flatten nested list
nested = [[1, 2], [3, 4], [5, 6]]
flat = reduce(lambda a, b: a + b, nested)
print(flat) # [1, 2, 3, 4, 5, 6]
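`reduce` also accepts an optional third argument, an initial value, which matters when the sequence may be empty. A minimal sketch, where the helper name `total` is made up for the example:

```python
from functools import reduce


def total(numbers):
    """Sum numbers with reduce, starting from an explicit initial value of 0."""
    return reduce(lambda a, b: a + b, numbers, 0)


print(total([1, 2, 3]))   # 6
print(total([]))          # 0, the initial value is returned for empty input

# Without an initial value, an empty sequence raises TypeError.
try:
    reduce(lambda a, b: a + b, [])
except TypeError as exc:
    print(f"TypeError: {exc}")
```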
lru_cache — Memoisation (Cache Results):
from functools import lru_cache
@lru_cache(maxsize=128)
def fibonacci(n):
    """Calculate nth Fibonacci number with caching."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
# Without cache: extremely slow for large n
# With cache: near-instant!
print(fibonacci(50)) # 12586269025
print(fibonacci(100)) # 354224848179261915075
# Check cache statistics
print(fibonacci.cache_info())
# CacheInfo(hits=99, misses=101, maxsize=128, currsize=101)
# Clear the cache
fibonacci.cache_clear()
partial — Pre-fill Function Arguments:
from functools import partial
def power(base, exponent):
return base ** exponent
# Create specialised versions
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(5)) # 25
print(cube(3)) # 27
# Practical: create a custom print function
debug_print = partial(print, "[DEBUG]")
debug_print("Starting process") # [DEBUG] Starting process
debug_print("Value is", 42) # [DEBUG] Value is 42
wraps — Preserve Function Metadata in Decorators:
from functools import wraps
import time
def timer(func):
    @wraps(func)  # Preserves the name, docstring, etc. of the original function
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # Clock intended for measuring short intervals
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper
@timer
def slow_function():
    """This function is intentionally slow."""
    time.sleep(0.1)
    return "done"
slow_function() # e.g. slow_function took 0.1003s
print(slow_function.__name__) # slow_function (not "wrapper"!)
print(slow_function.__doc__) # This function is intentionally slow.
total_ordering — Complete Comparison Methods:
from functools import total_ordering
@total_ordering
class Temperature:
"""You only need to define __eq__ and one of __lt__, __gt__, etc."""
def __init__(self, celsius):
self.celsius = celsius
def __eq__(self, other):
return self.celsius == other.celsius
def __lt__(self, other):
return self.celsius < other.celsius
def __repr__(self):
return f"Temperature({self.celsius}C)"
t1 = Temperature(20)
t2 = Temperature(30)
print(t1 < t2) # True
print(t1 > t2) # False (auto-generated!)
print(t1 <= t2) # True (auto-generated!)
print(t1 >= t2) # False (auto-generated!)
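The Temperature class above assumes that other is always a Temperature; comparing it with an int would fail with AttributeError. A common, more defensive variant returns NotImplemented for unsupported types so Python can fall back to its default behaviour. The sketch below uses a hypothetical class name, SafeTemperature, to keep it separate from the example above.

```python
from functools import total_ordering


@total_ordering
class SafeTemperature:
    """Like Temperature, but refuses to compare with other types."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __eq__(self, other):
        if not isinstance(other, SafeTemperature):
            return NotImplemented  # Let Python try the other operand
        return self.celsius == other.celsius

    def __lt__(self, other):
        if not isinstance(other, SafeTemperature):
            return NotImplemented
        return self.celsius < other.celsius


a = SafeTemperature(20)
b = SafeTemperature(30)

print(a <= b)    # True (generated by total_ordering)
print(a == 20)   # False (no match, falls back to identity comparison)

try:
    a < 20
except TypeError:
    ordering_failed = True
    print("Cannot order a SafeTemperature against an int")
```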
json — JSON Encoding and Decoding
The json module handles reading and writing JSON data.
import json
Converting Python to JSON (dumps):
data = {
"name": "Alice",
"age": 30,
"hobbies": ["reading", "coding", "hiking"],
"address": {
"city": "Mumbai",
"country": "India"
},
"active": True,
"score": None
}
# Convert to JSON string
json_str = json.dumps(data)
print(json_str)
# {"name": "Alice", "age": 30, ...}
# Pretty-printed JSON
json_pretty = json.dumps(data, indent=2)
print(json_pretty)
# {
# "name": "Alice",
# "age": 30,
# "hobbies": [
# "reading",
# "coding",
# "hiking"
# ],
# ...
# }
# Sort keys alphabetically
json_sorted = json.dumps(data, indent=2, sort_keys=True)
Converting JSON to Python (loads):
json_string = '{"name": "Bob", "age": 25, "active": true}'
parsed = json.loads(json_string)
print(parsed["name"]) # Bob
print(parsed["active"]) # True (Python bool, not JSON true)
print(type(parsed)) # <class 'dict'>
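Input that is not valid JSON makes json.loads raise json.JSONDecodeError. The sketch below wraps the call in a small helper; the name parse_json_or_none is hypothetical, not part of the json module.

```python
import json


def parse_json_or_none(text):
    """Return the parsed data, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno, colno and msg describe where and why parsing failed
        print(f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}")
        return None


good = parse_json_or_none('{"name": "Bob", "age": 25}')
bad = parse_json_or_none("{'name': 'Bob'}")  # Single quotes are not valid JSON

print(good)
print(bad)
```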
Reading and Writing JSON Files (load / dump):
# Write JSON to a file
data = {"users": ["Alice", "Bob", "Charlie"], "count": 3}
with open("data.json", "w") as f:
json.dump(data, f, indent=2)
# Read JSON from a file
with open("data.json", "r") as f:
loaded = json.load(f)
print(loaded) # {'users': ['Alice', 'Bob', 'Charlie'], 'count': 3}
Custom Serialisation:
from datetime import datetime
class DateTimeEncoder(json.JSONEncoder):
"""Custom encoder that handles datetime objects."""
def default(self, obj):
if isinstance(obj, datetime):
return obj.isoformat()
return super().default(obj)
event = {
"name": "Meeting",
"time": datetime(2026, 3, 24, 14, 30),
}
# Without custom encoder: TypeError
# With custom encoder: works!
json_str = json.dumps(event, cls=DateTimeEncoder, indent=2)
print(json_str)
# {
# "name": "Meeting",
# "time": "2026-03-24T14:30:00"
# }
re — Regular Expressions
The re module provides pattern matching for strings.
import re
Basic Functions:
text = "My phone number is 123-456-7890 and email is alice@example.com"
# search: find first match anywhere in string
match = re.search(r"\d{3}-\d{3}-\d{4}", text)
if match:
print(match.group()) # 123-456-7890
# match: match at the BEGINNING of string only
result = re.match(r"My", text)
print(result.group()) # My
# findall: find ALL matches (returns list of strings)
numbers = re.findall(r"\d+", text)
print(numbers) # ['123', '456', '7890']
# findall with emails
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
print(emails) # ['alice@example.com']
Substitution:
# sub: replace matches
text = "I have 2 cats and 3 dogs"
result = re.sub(r"\d+", "many", text)
print(result) # I have many cats and many dogs
# Replace with a function
def double_number(match):
return str(int(match.group()) * 2)
result = re.sub(r"\d+", double_number, text)
print(result) # I have 4 cats and 6 dogs
Compiling Patterns (for repeated use):
# Compile a pattern for better performance when used multiple times
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
texts = [
"Contact alice@example.com for info",
"Send to bob@company.org please",
"No email here",
]
for t in texts:
found = email_pattern.findall(t)
if found:
print(f"Found: {found}")
# Found: ['alice@example.com']
# Found: ['bob@company.org']
Common Patterns:
| Pattern | Matches | Example |
|---|---|---|
| \d | Any digit | 0, 9 |
| \w | Word character (letter, digit, _) | a, Z, _ |
| \s | Whitespace | space, tab, newline |
| . | Any character (except newline) | anything |
| ^ | Start of string | ^Hello |
| $ | End of string | world$ |
| * | 0 or more | ab* matches a, ab, abb |
| + | 1 or more | ab+ matches ab, abb |
| ? | 0 or 1 | ab? matches a, ab |
| {n} | Exactly n | \d{3} matches 123 |
| {n,m} | Between n and m | \d{2,4} matches 12, 1234 |
| [abc] | Any of a, b, c | character set |
| [^abc] | Not a, b, or c | negated set |
| (...) | Capture group | (\d{3}) captures 123 |
| \| | Or | cat\|dog |
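The table lists capture groups without showing them in use. The following sketch, with sample strings chosen only for illustration, shows numbered groups, named groups, and group references inside a replacement string.

```python
import re

text = "My phone number is 123-456-7890"

# Numbered groups: each pair of parentheses captures one part of the match
match = re.search(r"(\d{3})-(\d{3})-(\d{4})", text)
area, prefix, line = match.groups()
print(area, prefix, line)

# Named groups: (?P<name>...) makes the parts available by name
date_match = re.search(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    "Released on 2026-03-24",
)
parts = date_match.groupdict()
print(parts)

# Group references in a replacement: \1, \2, \3 refer to the captured parts
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2026-03-24")
print(reordered)
```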
pathlib — Object-Oriented Filesystem Paths
pathlib is the modern, recommended way to work with file paths in Python (preferred over os.path).
from pathlib import Path
Creating Paths:
# Current directory
cwd = Path.cwd()
print(cwd) # /home/user/my_project
# Home directory
home = Path.home()
print(home) # /home/user
# Create a path from a string
p = Path("/home/user/documents/report.txt")
# Join paths with /
data_dir = Path("project") / "data" / "raw"
print(data_dir) # project/data/raw
config_file = Path.home() / ".config" / "myapp" / "settings.json"
print(config_file) # /home/user/.config/myapp/settings.json
Path Properties:
p = Path("/home/user/documents/report.pdf")
print(p.name) # report.pdf
print(p.stem) # report (name without extension)
print(p.suffix) # .pdf
print(p.parent) # /home/user/documents
print(p.parts) # ('/', 'home', 'user', 'documents', 'report.pdf')
print(p.anchor) # /
print(p.is_absolute()) # True
Checking Existence:
p = Path("myfile.txt")
print(p.exists()) # True or False
print(p.is_file()) # True if it is a file
print(p.is_dir()) # True if it is a directory
Reading and Writing Files:
# Write text to a file
p = Path("output.txt")
p.write_text("Hello, World!\nLine 2\n")
# Read text from a file
content = p.read_text()
print(content) # Hello, World!\nLine 2\n
# Write bytes
p.write_bytes(b"\x00\x01\x02")
# Read bytes
data = p.read_bytes()
Globbing (Finding Files by Pattern):
project = Path(".")
# Find all Python files in current directory
for py_file in project.glob("*.py"):
print(py_file)
# Find all Python files recursively
for py_file in project.rglob("*.py"):
print(py_file)
# Find all image files
for img in project.rglob("*.png"):
print(img)
Creating Directories:
new_dir = Path("output") / "reports" / "2026"
new_dir.mkdir(parents=True, exist_ok=True)
# parents=True creates intermediate directories
# exist_ok=True does not raise an error if the directory already exists
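The pieces above (joining paths, creating directories, writing, reading, and globbing) can be combined in one short walk-through. This sketch works inside a temporary directory so it leaves nothing behind; the file names are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create nested directories in one call
    reports = root / "output" / "reports"
    reports.mkdir(parents=True, exist_ok=True)

    # Write a few files
    (reports / "jan.txt").write_text("January\n", encoding="utf-8")
    (reports / "feb.txt").write_text("February\n", encoding="utf-8")
    (root / "notes.md").write_text("# Notes\n", encoding="utf-8")

    # Find every .txt file below root, at any depth
    txt_names = sorted(p.name for p in root.rglob("*.txt"))

    # Read one file back
    first_line = (reports / "jan.txt").read_text(encoding="utf-8").splitlines()[0]

print(txt_names)
print(first_line)
```

Passing encoding="utf-8" explicitly avoids depending on the platform's default encoding.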
string — String Constants and Templates
The string module provides useful string constants and a template class.
import string
String Constants:
print(string.ascii_letters) # abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.ascii_lowercase) # abcdefghijklmnopqrstuvwxyz
print(string.ascii_uppercase) # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.digits) # 0123456789
print(string.hexdigits) # 0123456789abcdefABCDEF
print(string.punctuation) # !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
print(string.whitespace) # ' \t\n\r\x0b\x0c'
print(string.printable) # All printable characters
Template Strings (Safe Substitution):
from string import Template
# Create a template
t = Template("Hello, $name! You have $count new messages.")
result = t.substitute(name="Alice", count=5)
print(result) # Hello, Alice! You have 5 new messages.
# safe_substitute: does not raise error for missing keys
t2 = Template("$greeting, $name!")
result = t2.safe_substitute(greeting="Hi")
print(result) # Hi, $name! (missing key left as-is)
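Two details are worth seeing alongside the examples above: how to write a literal dollar sign, and what substitute does when a key is missing. A minimal sketch with made-up template text:

```python
from string import Template

# $$ is an escaped literal dollar sign; ${amount} delimits the name explicitly
price = Template("Cost: $$${amount} per seat")
price_text = price.substitute(amount=5)
print(price_text)

# substitute (unlike safe_substitute) raises KeyError for a missing key
greeting = Template("$greeting, $name!")
try:
    greeting.substitute(greeting="Hi")
except KeyError as exc:
    missing_key = exc.args[0]
    print(f"Missing key: {missing_key}")
```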
Practical Use: Generating Random Strings
import secrets
import string
def generate_password(length=12):
    """Generate a random password."""
    chars = string.ascii_letters + string.digits + string.punctuation
    # secrets draws from a cryptographically secure source; the random module
    # is designed for simulations and is not suitable for passwords
    return "".join(secrets.choice(chars) for _ in range(length))
print(generate_password()) # e.g., k9$Tz!mP@2xR
print(generate_password(20)) # e.g., aB3$kLm!Pq9@rT5&wX2z
copy — Shallow and Deep Copy
The copy module provides functions to duplicate objects.
import copy
Why Copying Matters:
# Assignment does NOT copy — both variables point to the same object
original = [1, 2, [3, 4]]
reference = original
reference[0] = 99
print(original) # [99, 2, [3, 4]] — original changed too!
Shallow Copy (copy.copy):
import copy
original = [1, 2, [3, 4]]
shallow = copy.copy(original)
# Top-level changes are independent
shallow[0] = 99
print(original) # [1, 2, [3, 4]] — not affected
# BUT nested objects are still shared!
shallow[2][0] = 99
print(original) # [1, 2, [99, 4]] — nested list WAS affected!
Deep Copy (copy.deepcopy):
import copy
original = [1, 2, [3, 4]]
deep = copy.deepcopy(original)
# Everything is fully independent
deep[2][0] = 99
print(original) # [1, 2, [3, 4]] — completely unaffected!
print(deep) # [1, 2, [99, 4]]
When to Use Each:
| Scenario | Use |
|---|---|
| Simple flat list/dict | copy.copy() (shallow) |
| Nested structures | copy.deepcopy() (deep) |
| Immutable data (strings, tuples of ints) | No copy needed |
| Performance-critical, large data | copy.copy() if possible |
typing — Type Hints
The typing module provides tools for type annotations, which help with code clarity and IDE support.
from typing import List, Dict, Tuple, Optional, Union, Any, Callable
Basic Type Hints:
# Variable annotations
name: str = "Alice"
age: int = 30
score: float = 95.5
active: bool = True
# Function annotations
def greet(name: str) -> str:
return f"Hello, {name}!"
def add(a: int, b: int) -> int:
return a + b
Collection Types:
from typing import List, Dict, Tuple, Set
# List of integers
scores: List[int] = [95, 87, 92, 78]
# Dictionary with string keys and int values
age_map: Dict[str, int] = {"Alice": 30, "Bob": 25}
# Tuple with specific types
coordinate: Tuple[float, float] = (3.5, 7.2)
# Set of strings
tags: Set[str] = {"python", "coding", "tutorial"}
# Note: In Python 3.9+, you can use built-in types directly:
# scores: list[int] = [95, 87, 92, 78]
# age_map: dict[str, int] = {"Alice": 30, "Bob": 25}
Optional and Union:
from typing import Optional, Union
# Optional: value can be the given type or None
def find_user(user_id: int) -> Optional[str]:
"""Return username or None if not found."""
users = {1: "Alice", 2: "Bob"}
return users.get(user_id)
# Union: value can be one of several types
def process(value: Union[int, str]) -> str:
return str(value)
# Python 3.10+ allows: int | str instead of Union[int, str]
Callable:
from typing import Callable
# A function that takes a function as an argument
def apply_operation(x: int, y: int, func: Callable[[int, int], int]) -> int:
return func(x, y)
result = apply_operation(5, 3, lambda a, b: a + b)
print(result) # 8
Any:
from typing import Any
def log_value(value: Any) -> None:
"""Accept any type of value."""
print(f"Value: {value}, Type: {type(value).__name__}")
Type hints are not enforced at runtime — they are documentation and tooling aids. Use tools like mypy to check types statically.
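A short sketch makes the point concrete: calling an annotated function with the "wrong" types raises no error at runtime, and the hints themselves are just data stored on the function.

```python
def add(a: int, b: int) -> int:
    return a + b


# No runtime check happens: strings are accepted and concatenated
result = add("type ", "hints")
print(result)

# The hints are stored in __annotations__ for tools to read
annotations = add.__annotations__
print(annotations)
```

A static checker such as mypy would flag the add("type ", "hints") call, even though Python itself runs it without complaint.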
Installing Third-Party Packages
Python's standard library is extensive, but the real power lies in the hundreds of thousands of third-party packages available on PyPI (Python Package Index).
pip — The Package Installer
pip is Python's standard package installer. It ships with most Python installations.
# Install a package
pip install requests
# Install a specific version
pip install pandas==2.1.0
# Install minimum version (quote it, or the shell treats > as output redirection)
pip install "numpy>=1.24.0"
# Install multiple packages at once
pip install requests pandas numpy matplotlib
# Upgrade a package to the latest version
pip install --upgrade requests
# Uninstall a package
pip uninstall requests
Managing Dependencies
# List all installed packages
pip list
# Show details about a specific package
pip show requests
# Save current environment to requirements.txt
pip freeze > requirements.txt
# Install all packages from requirements.txt
pip install -r requirements.txt
requirements.txt Format
# requirements.txt
requests==2.31.0
pandas>=2.1.0,<3.0.0
numpy~=1.24.0
flask
python-dotenv>=1.0
| Syntax | Meaning |
|---|---|
| ==2.31.0 | Exact version |
| >=2.1.0 | Minimum version |
| <=3.0.0 | Maximum version |
| >=2.1.0,<3.0.0 | Version range |
| ~=1.24.0 | Compatible release (>=1.24.0, <1.25.0) |
| flask | Any version (latest) |
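The compatible release operator (~=) is the least obvious row in the table. The sketch below is a deliberately simplified illustration of the rule, not how pip implements it: it assumes plain numeric versions written with the same number of components, and the helper names are hypothetical.

```python
def compatible_release_range(version):
    """Return (lower, upper) bounds for '~=version' as dotted strings."""
    parts = [int(p) for p in version.split(".")]
    if len(parts) < 2:
        raise ValueError("~= needs at least two components, e.g. 1.24")
    upper = parts[:-1]   # Drop the last component...
    upper[-1] += 1       # ...and bump the one before it
    return version, ".".join(str(p) for p in upper)


def satisfies_compatible(candidate, version):
    """True if candidate falls inside the '~=version' range."""
    def as_tuple(v):
        return tuple(int(p) for p in v.split("."))

    lower, upper = compatible_release_range(version)
    return as_tuple(lower) <= as_tuple(candidate) < as_tuple(upper)


print(compatible_release_range("1.24.0"))        # ('1.24.0', '1.25')
print(compatible_release_range("1.24"))          # ('1.24', '2')
print(satisfies_compatible("1.24.4", "1.24.0"))  # True
print(satisfies_compatible("1.25.0", "1.24.0"))  # False
```

So ~=1.24.0 accepts patch releases of 1.24 only, while ~=1.24 accepts any later 1.x release.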
Editable Install
For developing your own packages:
# Install in "editable" mode — changes to source are immediately reflected
pip install -e .
# Install with optional development dependencies
pip install -e ".[dev]"
Virtual Environments
Every Python project should use a virtual environment to isolate its dependencies.
Why Virtual Environments?
Without virtual environments:
- Project A needs requests==2.25.0
- Project B needs requests==2.31.0
- Only one version can exist system-wide, so one project breaks.
Virtual environments give each project its own isolated set of installed packages, along with its own python and pip commands.
Creating and Using a Virtual Environment
# Create a virtual environment named "venv"
python -m venv venv
# Activate it
# macOS / Linux:
source venv/bin/activate
# Windows:
venv\Scripts\activate
# Your prompt changes to show the active env:
# (venv) $
# Now pip installs go into this venv only
pip install requests pandas
# Verify: packages are isolated
pip list
# Shows only packages installed in this venv
# When done, deactivate
deactivate
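From inside Python you can check whether the interpreter is running in a virtual environment created by venv. This minimal sketch compares sys.prefix with sys.base_prefix; the function name in_virtual_environment is an assumption for the example.

```python
import sys


def in_virtual_environment():
    """Return True when running inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"In a virtual environment: {in_virtual_environment()}")
```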
Best Practices for Virtual Environments
Always add venv/ to .gitignore:
# .gitignore
venv/
.venv/
env/
__pycache__/
*.pyc
Use requirements.txt to share dependencies (not the venv itself):
# Developer A creates the requirements file
pip freeze > requirements.txt
git add requirements.txt
git commit -m "Add project dependencies"
# Developer B recreates the environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Naming conventions:
- venv or .venv are the most common names
- .venv (with dot) keeps it hidden on Unix systems
Popular Third-Party Packages
Here is a curated list of widely-used Python packages every developer should know about:
| Package | Category | Description | Install |
|---|---|---|---|
| requests | HTTP | Simple HTTP requests | pip install requests |
| httpx | HTTP | Async-capable HTTP client | pip install httpx |
| pandas | Data | Data manipulation and analysis | pip install pandas |
| numpy | Math | Numerical computing, arrays | pip install numpy |
| matplotlib | Visualisation | Plotting and charts | pip install matplotlib |
| flask | Web | Lightweight web framework | pip install flask |
| fastapi | Web | Modern async web framework | pip install fastapi |
| django | Web | Full-featured web framework | pip install django |
| sqlalchemy | Database | SQL toolkit and ORM | pip install sqlalchemy |
| pytest | Testing | Testing framework | pip install pytest |
| beautifulsoup4 | Scraping | HTML/XML parsing | pip install beautifulsoup4 |
| selenium | Automation | Browser automation | pip install selenium |
| pillow | Images | Image processing | pip install pillow |
| click | CLI | Command-line interfaces | pip install click |
| rich | CLI | Rich terminal formatting | pip install rich |
Quick Examples
requests — Making HTTP Requests:
import requests
response = requests.get("https://api.github.com/users/octocat", timeout=10)
print(response.status_code) # 200 on success
data = response.json()
print(data["name"]) # The Octocat
print(data["public_repos"]) # e.g. 8 (live data, so the number changes over time)
pytest — Writing Tests:
# test_calculator.py
def add(a, b):
return a + b
def test_add():
assert add(2, 3) == 5
assert add(-1, 1) == 0
assert add(0, 0) == 0
# Run with: pytest test_calculator.py
Package Distribution Basics
When you want to share your Python code as an installable package, you need a project configuration file.
pyproject.toml (Modern Standard)
The modern way to define a Python project:
# pyproject.toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "my-awesome-package"
version = "0.1.0"
description = "A short description of your package"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.8"
authors = [
{name = "Your Name", email = "you@example.com"}
]
dependencies = [
"requests>=2.25.0",
"click>=8.0",
]
[project.optional-dependencies]
dev = ["pytest", "mypy", "black"]
setup.py (Legacy but Still Common)
# setup.py
from setuptools import setup, find_packages
setup(
name="my-awesome-package",
version="0.1.0",
packages=find_packages(),
install_requires=[
"requests>=2.25.0",
"click>=8.0",
],
)
Version Management
A common pattern is to define the version in one place and read it from __init__.py:
# my_package/__init__.py
__version__ = "0.1.0"
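Once a package is installed, its version can also be read from the installed metadata rather than from __init__.py, using importlib.metadata (Python 3.8+). A minimal sketch; the helper name installed_version and the second package name are made up for the example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version("pip"))
print(installed_version("surely-not-installed-package-xyz"))  # None
```

The argument is the distribution name used with pip install, which is not always the same as the import name.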
Publishing to PyPI (Brief Overview)
# 1. Build the package
pip install build
python -m build
# 2. Upload to PyPI (requires an account at pypi.org)
pip install twine
twine upload dist/*
# 3. Now anyone can install it!
# pip install my-awesome-package
Practical Examples
Let us put everything together with practical, real-world examples.
Example 1: Building a Utility Package
Create a package with string, math, and file helpers:
Project structure:
my_utils/
├── __init__.py
├── string_helpers.py
├── math_helpers.py
└── file_helpers.py
string_helpers.py:
# my_utils/string_helpers.py
def slugify(text):
"""Convert text to URL-friendly slug.
>>> slugify("Hello World!")
'hello-world'
"""
import re
text = text.lower().strip()
text = re.sub(r"[^\w\s-]", "", text)
text = re.sub(r"[\s_]+", "-", text)
text = re.sub(r"-+", "-", text)
return text.strip("-")
def truncate(text, max_length=50, suffix="..."):
"""Truncate text to max_length, adding suffix if truncated.
>>> truncate("Hello, World!", max_length=8)
'Hello...'
"""
if len(text) <= max_length:
return text
return text[:max_length - len(suffix)] + suffix
def word_count(text):
"""Count words in text.
>>> word_count("Hello beautiful world")
3
"""
return len(text.split())
def title_case(text):
"""Convert text to title case, handling small words.
>>> title_case("the quick brown fox")
'The Quick Brown Fox'
"""
small_words = {"a", "an", "the", "and", "but", "or", "for", "in", "on", "at", "to"}
words = text.split()
result = []
for i, word in enumerate(words):
if i == 0 or word.lower() not in small_words:
result.append(word.capitalize())
else:
result.append(word.lower())
return " ".join(result)
math_helpers.py:
# my_utils/math_helpers.py
def clamp(value, min_val, max_val):
"""Restrict value to the range [min_val, max_val].
>>> clamp(15, 0, 10)
10
>>> clamp(-5, 0, 10)
0
"""
return max(min_val, min(value, max_val))
def percentage(part, whole):
"""Calculate percentage.
>>> percentage(25, 200)
12.5
"""
if whole == 0:
return 0.0
return (part / whole) * 100
def average(numbers):
"""Calculate the arithmetic mean.
>>> average([10, 20, 30])
20.0
"""
if not numbers:
return 0.0
return sum(numbers) / len(numbers)
def is_prime(n):
"""Check if a number is prime.
>>> is_prime(17)
True
>>> is_prime(4)
False
"""
if n < 2:
return False
for i in range(2, int(n ** 0.5) + 1):
if n % i == 0:
return False
return True
file_helpers.py:
# my_utils/file_helpers.py
from pathlib import Path
def read_lines(filepath):
"""Read a file and return a list of stripped lines."""
return Path(filepath).read_text().strip().splitlines()
def write_lines(filepath, lines):
"""Write a list of strings to a file, one per line."""
Path(filepath).write_text("\n".join(lines) + "\n")
def file_size_human(filepath):
"""Return human-readable file size.
>>> file_size_human("small_file.txt") # If file is 1536 bytes
'1.50 KB'
"""
size = Path(filepath).stat().st_size
for unit in ["B", "KB", "MB", "GB", "TB"]:
if size < 1024:
return f"{size:.2f} {unit}"
size /= 1024
return f"{size:.2f} PB"
def ensure_directory(path):
"""Create a directory (and parents) if it does not exist."""
Path(path).mkdir(parents=True, exist_ok=True)
__init__.py:
# my_utils/__init__.py
from .string_helpers import slugify, truncate, word_count, title_case
from .math_helpers import clamp, percentage, average, is_prime
from .file_helpers import read_lines, write_lines, file_size_human, ensure_directory
__version__ = "1.0.0"
__all__ = [
"slugify", "truncate", "word_count", "title_case",
"clamp", "percentage", "average", "is_prime",
"read_lines", "write_lines", "file_size_human", "ensure_directory",
]
Using the Package:
# main.py
from my_utils import slugify, is_prime, average, ensure_directory
print(slugify("Hello World! This is Great")) # hello-world-this-is-great
print(is_prime(17)) # True
print(average([85, 90, 78, 92, 88])) # 86.6
ensure_directory("output/reports")
Example 2: Password Generator (Using Standard Library)
# password_generator.py
import math
import secrets
import string
def generate_password(
    length=16,
    use_uppercase=True,
    use_digits=True,
    use_symbols=True,
    exclude_chars="",
):
    """Generate a secure random password."""
    chars = string.ascii_lowercase
    if use_uppercase:
        chars += string.ascii_uppercase
    if use_digits:
        chars += string.digits
    if use_symbols:
        chars += string.punctuation
    # Remove excluded characters
    for ch in exclude_chars:
        chars = chars.replace(ch, "")
    if not chars:
        raise ValueError("No characters available for password generation")
    # secrets uses a cryptographically secure source; random does not
    password = "".join(secrets.choice(chars) for _ in range(length))
    return password
def password_strength(password):
"""Estimate password strength."""
charset_size = 0
if any(c in string.ascii_lowercase for c in password):
charset_size += 26
if any(c in string.ascii_uppercase for c in password):
charset_size += 26
if any(c in string.digits for c in password):
charset_size += 10
if any(c in string.punctuation for c in password):
charset_size += 32
if charset_size == 0:
return "Empty", 0
entropy = len(password) * math.log2(charset_size)
if entropy < 28:
strength = "Very Weak"
elif entropy < 36:
strength = "Weak"
elif entropy < 60:
strength = "Moderate"
elif entropy < 80:
strength = "Strong"
else:
strength = "Very Strong"
return strength, round(entropy, 1)
if __name__ == "__main__":
print("=== Password Generator ===\n")
for length in [8, 12, 16, 24]:
pw = generate_password(length=length)
strength, entropy = password_strength(pw)
print(f"Length {length:2d}: {pw}")
print(f" Strength: {strength} ({entropy} bits of entropy)\n")
# Generate passwords without confusing characters
pw = generate_password(length=16, exclude_chars="0OIl1|")
print(f"Easy-to-read: {pw}")
Example 3: Date Calculator
# date_calculator.py
from datetime import datetime, date, timedelta
import calendar
def days_between(date1_str, date2_str, fmt="%Y-%m-%d"):
"""Calculate the number of days between two date strings."""
d1 = datetime.strptime(date1_str, fmt)
d2 = datetime.strptime(date2_str, fmt)
diff = abs((d2 - d1).days)
return diff
def days_until(target_str, fmt="%Y-%m-%d"):
"""Calculate days from today until a target date."""
target = datetime.strptime(target_str, fmt).date()
today = date.today()
diff = (target - today).days
return diff
def add_business_days(start_date, num_days):
"""Add business days (skipping weekends) to a date."""
current = start_date
added = 0
while added < num_days:
current += timedelta(days=1)
if current.weekday() < 5: # Monday=0 to Friday=4
added += 1
return current
def age_calculator(birth_date_str, fmt="%Y-%m-%d"):
"""Calculate age in years, months, and days."""
birth = datetime.strptime(birth_date_str, fmt).date()
today = date.today()
years = today.year - birth.year
months = today.month - birth.month
days = today.day - birth.day
if days < 0:
months -= 1
# Get days in the previous month
prev_month = today.month - 1 if today.month > 1 else 12
prev_year = today.year if today.month > 1 else today.year - 1
days += calendar.monthrange(prev_year, prev_month)[1]
if months < 0:
years -= 1
months += 12
return years, months, days
if __name__ == "__main__":
    print("=== Date Calculator ===\n")
    # Days between two dates (the difference, so the end date is not counted)
    d = days_between("2026-01-01", "2026-12-31")
    print(f"Days from 2026-01-01 to 2026-12-31: {d}")
    # Days until New Year's Eve
    until_nye = days_until("2026-12-31")
    print(f"Days until Dec 31, 2026: {until_nye}")
    # Business days
    start = date.today()
    deadline = add_business_days(start, 10)
    print(f"10 business days from today: {deadline.strftime('%Y-%m-%d (%A)')}")
    # Age calculator
    years, months, days = age_calculator("1995-08-15")
    print(f"Age (born 1995-08-15): {years} years, {months} months, {days} days")
Example 4: File Organiser Script
# file_organiser.py
"""Organise files in a directory by extension."""
import os
import shutil
from pathlib import Path
from collections import defaultdict, Counter
from datetime import datetime
# Map extensions to folder names
EXTENSION_MAP = {
# Images
".jpg": "Images", ".jpeg": "Images", ".png": "Images",
".gif": "Images", ".bmp": "Images", ".svg": "Images", ".webp": "Images",
# Documents
".pdf": "Documents", ".doc": "Documents", ".docx": "Documents",
".txt": "Documents", ".rtf": "Documents", ".odt": "Documents",
# Spreadsheets
".xls": "Spreadsheets", ".xlsx": "Spreadsheets", ".csv": "Spreadsheets",
# Videos
".mp4": "Videos", ".avi": "Videos", ".mkv": "Videos", ".mov": "Videos",
# Audio
".mp3": "Audio", ".wav": "Audio", ".flac": "Audio", ".aac": "Audio",
# Archives
".zip": "Archives", ".tar": "Archives", ".gz": "Archives", ".rar": "Archives",
# Code
".py": "Code", ".js": "Code", ".html": "Code", ".css": "Code",
".java": "Code", ".cpp": "Code", ".c": "Code", ".rs": "Code",
}
def organise_directory(source_dir, dry_run=True):
"""Organise files in a directory by their extension.
Args:
source_dir: Path to the directory to organise.
dry_run: If True, only prints what would happen without moving files.
Returns:
Dictionary mapping category to list of moved files.
"""
source = Path(source_dir)
if not source.is_dir():
print(f"Error: '{source_dir}' is not a valid directory")
return {}
moved = defaultdict(list)
stats = Counter()
for filepath in source.iterdir():
# Skip directories and hidden files
if filepath.is_dir() or filepath.name.startswith("."):
continue
ext = filepath.suffix.lower()
category = EXTENSION_MAP.get(ext, "Other")
dest_dir = source / category
dest_file = dest_dir / filepath.name
# Handle name conflicts
if dest_file.exists():
stem = filepath.stem
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
dest_file = dest_dir / f"{stem}_{timestamp}{ext}"
if dry_run:
print(f" [DRY RUN] {filepath.name} -> {category}/")
else:
dest_dir.mkdir(exist_ok=True)
shutil.move(str(filepath), str(dest_file))
print(f" Moved: {filepath.name} -> {category}/")
moved[category].append(filepath.name)
stats[category] += 1
return dict(moved)
if __name__ == "__main__":
import sys
target = sys.argv[1] if len(sys.argv) > 1 else "."
dry = "--execute" not in sys.argv
print(f"Organising: {os.path.abspath(target)}")
if dry:
print("(Dry run — add --execute to actually move files)\n")
else:
print("(EXECUTING — files will be moved)\n")
results = organise_directory(target, dry_run=dry)
print(f"\nSummary:")
total = 0
for category, files in sorted(results.items()):
print(f" {category}: {len(files)} files")
total += len(files)
print(f" Total: {total} files")
Best Practices
1. Organise Imports Properly
Follow the standard import ordering convention (also enforced by tools like isort):
# 1. Standard library imports
import os
import sys
from datetime import datetime
from pathlib import Path
# 2. Third-party imports
import pandas as pd
import requests
from flask import Flask, jsonify
# 3. Local / project imports
from myapp.models import User
from myapp.utils import format_date
Each group should be separated by a blank line, and imports within each group should be alphabetically sorted.
2. Use __all__ to Control Exports
Define __all__ to explicitly declare what your module exports:
# mymodule.py
__all__ = ["public_function", "PublicClass"]
def public_function():
"""This will be exported."""
pass
def _private_helper():
"""This will NOT be exported (leading underscore convention)."""
pass
class PublicClass:
"""This will be exported."""
pass
When someone does from mymodule import *, only names listed in __all__ are imported.
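The effect of __all__ can be observed without creating files. The sketch below builds a throwaway module in memory (the module name all_demo and its functions are hypothetical) and runs a star import against it.

```python
import sys
import types

# Build a small module in memory and register it so it can be imported
demo = types.ModuleType("all_demo")
exec(
    '__all__ = ["public_function"]\n'
    "def public_function():\n"
    "    return 'public'\n"
    "def other_function():\n"
    "    return 'not listed'\n",
    demo.__dict__,
)
sys.modules["all_demo"] = demo

# Run a star import into a fresh namespace and see what arrived
namespace = {}
exec("from all_demo import *", namespace)
imported_names = sorted(n for n in namespace if not n.startswith("__"))

print(imported_names)          # ['public_function']
print(demo.other_function())   # Still reachable through the module itself
```

Note that __all__ only affects star imports: other_function has no leading underscore, yet it is skipped because it is not listed, and it remains accessible as all_demo.other_function.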
3. Avoid Circular Imports
Circular imports happen when two modules import each other:
# BAD: Circular import
# module_a.py
from module_b import function_b # module_b imports module_a!
def function_a():
return function_b()
# module_b.py
from module_a import function_a # module_a imports module_b!
def function_b():
return function_a()
Solutions:
- Move the shared code into a third module
- Use late imports (import inside the function)
- Restructure your code
# Solution: late import
# module_a.py
def function_a():
from module_b import function_b # Import only when needed
return function_b()
4. Prefer Absolute Imports
# GOOD: Absolute imports — clear and unambiguous
from myproject.utils.math_helpers import add
from myproject.models.user import User
# OK: Relative imports within a package (keep them short)
from .math_helpers import add # Same package
from ..models.user import User # Parent package
# AVOID: Deep relative imports
from ...core.base.mixins import LogMixin # Hard to follow
Common Mistakes
1. Naming Files the Same as Standard Library Modules
This is one of the most common beginner mistakes:
# You create a file called "random.py"
# random.py
import random # This imports YOUR file, not the standard library!
print(random.randint(1, 10)) # AttributeError!
Solution: Never name your files random.py, math.py, os.py, json.py, string.py, email.py, test.py, or any other standard library module name.
2. Circular Imports
As covered above, this happens when module A imports module B, and module B imports module A. The fix is to restructure your code or use late (inside-function) imports.
3. Forgetting __init__.py
While Python 3.3+ supports "namespace packages" without __init__.py, it is best practice to always include it:
# Without __init__.py, imports may behave unexpectedly
my_package/
├── module_a.py
└── module_b.py
# With __init__.py — explicit, clear, reliable
my_package/
├── __init__.py
├── module_a.py
└── module_b.py
4. Import Side Effects
Avoid running significant code at module level. It executes the first time the module is imported in each process, whether or not the caller needs the result:
# BAD: Side effects on import
# config.py
import requests
# This HTTP request runs as soon as anything imports config!
response = requests.get("https://api.example.com/config")
CONFIG = response.json()
# GOOD: Wrap side effects in functions
# config.py
import requests
_config_cache = None
def get_config():
"""Load config on first call, then cache it."""
global _config_cache
if _config_cache is None:
response = requests.get("https://api.example.com/config")
_config_cache = response.json()
return _config_cache
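Python caches imported modules in sys.modules, so module-level code runs once per process, on the first import, and again only if the module is explicitly reloaded. The sketch below demonstrates this with a temporary module file; the module name side_effect_demo is made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A module whose top-level code creates a fresh object each time it runs
    Path(tmp, "side_effect_demo.py").write_text(
        "RUN_MARKER = object()\n", encoding="utf-8"
    )
    sys.path.insert(0, tmp)
    try:
        first = importlib.import_module("side_effect_demo")
        second = importlib.import_module("side_effect_demo")

        # The second import returns the cached module without re-running it
        same_module = first is second
        marker_before = first.RUN_MARKER

        # reload() re-executes the module body, producing a new marker
        importlib.reload(first)
        reran_on_reload = first.RUN_MARKER is not marker_before
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("side_effect_demo", None)

print(same_module)       # True
print(reran_on_reload)   # True
```

Even though the code runs only once, a slow or failing operation at module level still delays or breaks that first import, which is why wrapping it in a function is preferable.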
5. Using from module import * in Production Code
# BAD: Unclear where names come from, risk of collisions
from os import *
from sys import *
from json import *
# GOOD: Explicit imports
from os import path, getcwd, listdir
from sys import argv, exit
from json import loads, dumps
Practice Exercises
Exercise 1: Module Creation
Create a module called temperature.py with functions:
- celsius_to_fahrenheit(c): convert Celsius to Fahrenheit
- fahrenheit_to_celsius(f): convert Fahrenheit to Celsius
- celsius_to_kelvin(c): convert Celsius to Kelvin
- is_boiling(celsius): return True if water boils at this temperature
Add a __name__ == "__main__" guard that tests each function.
# temperature.py
def celsius_to_fahrenheit(c):
    return (c * 9 / 5) + 32
def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9
def celsius_to_kelvin(c):
    return c + 273.15
def is_boiling(celsius):
    return celsius >= 100
if __name__ == "__main__":
    print(celsius_to_fahrenheit(100)) # 212.0
    print(fahrenheit_to_celsius(32)) # 0.0
    print(celsius_to_kelvin(0)) # 273.15
    print(is_boiling(100)) # True
    print(is_boiling(99)) # False
Exercise 2: Standard Library Exploration
Write a script that uses at least 5 standard library modules to:
- Generate 10 random integers between 1 and 100 (random)
- Calculate their mean and standard deviation (math)
- Save the results with a timestamp to a JSON file (json, datetime)
- Print the file size (os)
import random
import math
import json
from datetime import datetime
import os
# 1. Generate random numbers
numbers = [random.randint(1, 100) for _ in range(10)]
# 2. Calculate statistics
mean = sum(numbers) / len(numbers)
variance = sum((x - mean) ** 2 for x in numbers) / len(numbers)
std_dev = math.sqrt(variance)
# 3. Save to JSON with timestamp
result = {
    "timestamp": datetime.now().isoformat(),
    "numbers": numbers,
    "mean": round(mean, 2),
    "std_dev": round(std_dev, 2),
}
filename = "stats_output.json"
with open(filename, "w") as f:
    json.dump(result, f, indent=2)
# 4. Print file size
size = os.path.getsize(filename)
print(f"Numbers: {numbers}")
print(f"Mean: {mean:.2f}, Std Dev: {std_dev:.2f}")
print(f"Saved to {filename} ({size} bytes)")
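The solution divides by len(numbers), so it computes the population standard deviation. As a cross-check, the standard library's statistics module offers the same calculation as pstdev, alongside the sample version stdev (which divides by n - 1). The fixed list below is an arbitrary sample chosen so the results are repeatable.

```python
import math
import statistics

numbers = [4, 8, 15, 16, 23, 42]  # arbitrary fixed sample

# Manual calculation, as in the solution above
mean = sum(numbers) / len(numbers)
variance = sum((x - mean) ** 2 for x in numbers) / len(numbers)
manual_std = math.sqrt(variance)

print(math.isclose(statistics.mean(numbers), mean))            # True
print(math.isclose(statistics.pstdev(numbers), manual_std))    # True
print(statistics.stdev(numbers) > statistics.pstdev(numbers))  # True
```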
Exercise 3: Package Builder
Create a package called texttools with the following structure:
texttools/
├── __init__.py
├── analysis.py (word_count, char_count, sentence_count)
├── transform.py (reverse, to_snake_case, to_camel_case)
└── validate.py (is_email, is_url, is_phone)
Write the __init__.py to export all functions, then write a main.py that uses them.
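One possible starting point is sketched below, cut down to one function per module to stay short. So that it runs as a single script, it writes the package files into a temporary directory and then imports them. In a real project you would create these files by hand. The function bodies are simple illustrations, not complete solutions (is_email, for instance, is a deliberately loose check).

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of each file in a cut-down texttools package
FILES = {
    "__init__.py": (
        "from .analysis import word_count\n"
        "from .transform import to_snake_case\n"
        "from .validate import is_email\n"
        "\n"
        "__all__ = ['word_count', 'to_snake_case', 'is_email']\n"
    ),
    "analysis.py": (
        "def word_count(text):\n"
        "    return len(text.split())\n"
    ),
    "transform.py": (
        "import re\n"
        "\n"
        "def to_snake_case(text):\n"
        "    words = re.findall(r'[A-Za-z0-9]+', text)\n"
        "    return '_'.join(word.lower() for word in words)\n"
    ),
    "validate.py": (
        "def is_email(text):\n"
        "    # Loose check only: something before '@' and a dot after it\n"
        "    local, _, domain = text.partition('@')\n"
        "    return bool(local) and '.' in domain\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "texttools"
    package_dir.mkdir()
    for name, source in FILES.items():
        (package_dir / name).write_text(source)

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        # This part plays the role of main.py
        texttools = importlib.import_module("texttools")
        words = texttools.word_count("modules keep code tidy")
        snake = texttools.to_snake_case("Hello World Again")
        valid = texttools.is_email("user@example.com")
    finally:
        sys.path.remove(tmp)

print(words)  # 4
print(snake)  # hello_world_again
print(valid)  # True
```

Because __init__.py re-exports the functions, main.py can call texttools.word_count(...) directly instead of reaching into texttools.analysis.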
Exercise 4: Collections Challenge
Given a list of student records, use collections to:
- Count how many students got each grade (use Counter)
- Group students by grade (use defaultdict)
- Find the top 3 most common grades (use Counter.most_common)
from collections import Counter, defaultdict
students = [
    ("Alice", "A"), ("Bob", "B"), ("Charlie", "A"),
    ("Diana", "C"), ("Eve", "A"), ("Frank", "B"),
    ("Grace", "A"), ("Hank", "B"), ("Ivy", "C"),
    ("Jack", "A"), ("Kate", "D"), ("Leo", "B"),
]
# 1. Count grades
grade_counts = Counter(grade for _, grade in students)
print(grade_counts) # Counter({'A': 5, 'B': 4, 'C': 2, 'D': 1})
# 2. Group by grade
by_grade = defaultdict(list)
for name, grade in students:
    by_grade[grade].append(name)
for grade in sorted(by_grade):
    print(f"Grade {grade}: {by_grade[grade]}")
# 3. Top 3 grades
print(grade_counts.most_common(3))
# [('A', 5), ('B', 4), ('C', 2)]
Exercise 5: File Organiser Enhancement
Extend the file organiser example from this chapter:
- Add a --log flag that writes all moves to a log file
- Add support for organising by date (files modified this month, last month, older)
- Add a summary that shows total size moved per category
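For the date-based grouping, one way to start is a small helper that classifies a modification date into the three buckets. This is only a sketch: the function name age_bucket and the bucket labels are invented here, and a date in the future is simply treated as "this_month".

```python
from datetime import date


def age_bucket(modified, today):
    """Classify a modification date as this month, last month, or older."""
    months_ago = (today.year - modified.year) * 12 + (today.month - modified.month)
    if months_ago <= 0:
        return "this_month"
    if months_ago == 1:
        return "last_month"
    return "older"


today = date(2024, 3, 15)
print(age_bucket(date(2024, 3, 1), today))              # this_month
print(age_bucket(date(2024, 2, 29), today))             # last_month
print(age_bucket(date(2023, 12, 31), today))            # older
print(age_bucket(date(2024, 1, 31), date(2024, 2, 1)))  # last_month
```

For a real file, the modification date can be obtained with date.fromtimestamp(path.stat().st_mtime), where path is a pathlib.Path.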
Exercise 6: Build a CLI Tool
Use sys.argv (or the argparse standard library module) to build a command-line tool that:
- Accepts a filename and an operation (--count-words, --count-lines, --find PATTERN)
- Reads the file and performs the operation
- Prints the result
# text_tool.py
import sys
import re
def count_words(text):
    return len(text.split())
def count_lines(text):
    return len(text.splitlines())
def find_pattern(text, pattern):
    matches = re.findall(pattern, text, re.IGNORECASE)
    return matches
if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python text_tool.py <file> <--count-words|--count-lines|--find PATTERN>")
        sys.exit(1)
    filename = sys.argv[1]
    operation = sys.argv[2]
    with open(filename, "r") as f:
        content = f.read()
    if operation == "--count-words":
        print(f"Words: {count_words(content)}")
    elif operation == "--count-lines":
        print(f"Lines: {count_lines(content)}")
    elif operation == "--find":
        if len(sys.argv) < 4:
            print("Error: --find requires a pattern")
            sys.exit(1)
        pattern = sys.argv[3]
        matches = find_pattern(content, pattern)
        print(f"Found {len(matches)} matches for '{pattern}':")
        for m in matches:
            print(f" {m}")
    else:
        print(f"Unknown operation: {operation}")
        sys.exit(1)
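The solution above parses sys.argv by hand. As a rough sketch of the argparse alternative, the version below lets argparse handle the usage message, the missing-argument errors, and the rule that exactly one operation must be given. The helper names build_parser and run are made up for this example, and the arguments are passed as a list so the snippet runs without a real command line or file.

```python
import argparse
import re


def build_parser():
    parser = argparse.ArgumentParser(description="Simple text file tool")
    parser.add_argument("filename", help="file to read")
    # Exactly one of the three operations must be supplied
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--count-words", action="store_true")
    group.add_argument("--count-lines", action="store_true")
    group.add_argument("--find", metavar="PATTERN")
    return parser


def run(args, text):
    """Return the output line for the chosen operation."""
    if args.count_words:
        return f"Words: {len(text.split())}"
    if args.count_lines:
        return f"Lines: {len(text.splitlines())}"
    matches = re.findall(args.find, text, re.IGNORECASE)
    return f"Found {len(matches)} matches for '{args.find}'"


parser = build_parser()
# In a real script, call parser.parse_args() with no arguments to read sys.argv
args = parser.parse_args(["notes.txt", "--find", "py"])
sample_text = "Python modules\npython packages\n"
print(run(args, sample_text))  # Found 2 matches for 'py'
```

In a full script, args.filename would be opened and its contents passed to run in place of sample_text.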
Summary
In this final chapter, you learned how Python organises and distributes code:
- Modules are .py files that group related functions, classes, and variables
- Importing gives you access to code from other modules using import, from ... import, and aliases
- The __name__ guard lets you write code that runs only when a file is executed directly
- Packages are directories of modules with an __init__.py file, supporting nested sub-packages
- The standard library provides a vast collection of ready-to-use modules, from math and random to json, collections, itertools, pathlib, and many more
- pip installs third-party packages from PyPI, and virtual environments keep project dependencies isolated
- Best practices include organising imports, using __all__, and avoiding circular imports
Congratulations!
You have completed the entire Python tutorial series. You now have a solid foundation covering:
- Variables, data types, and operators
- Control flow (if/elif/else, loops)
- Data structures (lists, tuples, dictionaries, sets)
- Functions and scope
- String manipulation
- File handling
- Error handling and exceptions
- Object-oriented programming (classes, inheritance)
- List comprehensions and generators
- Decorators and closures
- Modules, packages, and the standard library
What to Build Next
The best way to solidify your knowledge is to build projects. Here are some ideas:
| Project | Skills Practised |
|---|---|
| To-do list CLI app | File I/O, JSON, argparse, OOP |
| Web scraper | requests, beautifulsoup4, csv, file handling |
| Personal budget tracker | Classes, file I/O, datetime, collections |
| Quiz game | random, dictionaries, loops, file I/O |
| Weather app | requests, json, API interaction |
| URL shortener | flask/fastapi, hashlib, databases |
| Markdown to HTML converter | re, file I/O, pathlib |
| Chat bot | String processing, random, json |
Advanced Topics to Explore
Once you are comfortable building projects, dive deeper into:
- Asynchronous programming: asyncio, async/await
- Testing: pytest, test-driven development (TDD)
- Web development: Flask, FastAPI, Django
- Data science: pandas, NumPy, matplotlib
- Databases: SQLAlchemy, SQLite, PostgreSQL
- APIs: Building and consuming REST APIs
- Type checking: mypy for static analysis
- Packaging: Publishing your own packages on PyPI
- Design patterns: Singleton, Factory, Observer, Strategy
- Concurrency: threading, multiprocessing, asyncio
Happy coding!