Chapter 14 of 20

ChatGPT API — Getting Started

Understand what the ChatGPT API is, how to get an API key, how pricing works, and how to make your first API call with curl and the Python openai library.

Meritshot | 13 min read

The ChatGPT web interface at chat.openai.com is a polished product built on top of an API. That same API is available to developers, which means you can embed GPT-4o's capabilities directly into your own applications — a customer support bot for your e-commerce site, an automated document summariser, a coding assistant inside your IDE, or a backend service that processes thousands of pieces of text overnight.

This chapter demystifies the API: what it is, how to get access, how pricing works, and how to make your first calls — both with raw curl and with the Python openai library.


1. API vs Chat Interface — When to Use Which

The chat interface at chat.openai.com and the API both use the same underlying model, but they serve fundamentally different purposes.

Dimension | Chat Interface | API
Target user | Non-technical end users | Developers and businesses
Interaction | Manual, one at a time | Programmatic, can be automated
Customisation | Limited (system prompt via Custom GPTs) | Full control over every parameter
Volume | Limited to rate-of-typing | Thousands of requests per hour
Integration | Standalone product | Embedded in your application
Cost model | Flat subscription (Plus = $20/month) | Pay per token consumed
Memory | Built-in conversation history UI | Stateless; you manage history

Use the chat interface when you are exploring ideas, drafting content, or doing research manually.

Use the API when you want to:

  • Automate repetitive text processing tasks
  • Build products or internal tools powered by AI
  • Process data in bulk (summarising 500 customer reviews, classifying support tickets)
  • Integrate AI into an existing codebase
  • Control the model's behaviour precisely with system prompts and parameters

2. Getting an API Key

Create an OpenAI Account

If you do not already have one, create an account at platform.openai.com. You can sign in with the same email you use for ChatGPT, but API billing is separate from a ChatGPT subscription: paying for ChatGPT Plus does not include any API usage.

Add a Payment Method

The API is billed on usage — you pay for what you use, not a flat fee. You must add a credit or debit card (international cards work; Visa and Mastercard issued by Indian banks that support international transactions are typically accepted). You can also prepay a specific amount.

OpenAI has at times offered a small amount of free credit to new API accounts, but the offer changes periodically, so check the current terms when you sign up.

Generate Your API Key

  1. Go to platform.openai.com/api-keys
  2. Click "Create new secret key"
  3. Give it a descriptive name (e.g., "meritshot-tutorial-key")
  4. Copy the key immediately — OpenAI shows it only once

Your key will look like this:

sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Protect Your Key

An API key is a credential with billing attached. Treat it like a password:

  • Never paste it into code you commit to a public GitHub repository
  • Never share it in a chat message, email, or screenshot
  • Store it in an environment variable, not hardcoded in source files
  • If you suspect a key is compromised, revoke it immediately from the dashboard and generate a new one

The standard practice is to store keys in a local .env file, load that file with a helper such as python-dotenv, and read the value from os.environ in Python:

# .env file (never commit this to version control)
OPENAI_API_KEY=sk-proj-your-key-here
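
Reading the key back in Python can be sketched as below. This is a minimal illustration rather than an official pattern: the helper names get_api_key and mask_key are hypothetical, and the only assumption about your setup is that the variable is called OPENAI_API_KEY as in the .env example above. Failing early with a clear message is usually easier to debug than an authentication error returned by the API.

```python
import os


def get_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return key


def mask_key(key, visible=4):
    """Return a log-safe version of the key showing only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


if __name__ == "__main__":
    try:
        print("Loaded key:", mask_key(get_api_key()))
    except RuntimeError as error:
        print(error)
```

If you ever need to log which key a service is using, log the masked form, never the key itself.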

3. Understanding the Pricing Model

The API charges you per token. A token is roughly 4 characters of English text, or about 0.75 words. "Meritshot is a great learning platform" is approximately 8 tokens.

Pricing has two components:

  • Input tokens — everything you send to the model (system prompt + conversation history + your current message)
  • Output tokens — everything the model generates in response

Output tokens are typically more expensive than input tokens. Prices are listed per million tokens (per 1M).
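
The two rules of thumb above (about 4 characters or about 0.75 words per token) can be turned into a rough estimator. This is only a back-of-the-envelope sketch: both function names are hypothetical, the heuristics apply to English text, and the real count is determined by the model's tokenizer and reported in the usage field of each response.

```python
import math


def estimate_tokens_by_chars(text):
    """Rough estimate: about 4 characters of English text per token."""
    return math.ceil(len(text) / 4)


def estimate_tokens_by_words(text):
    """Rough estimate: one token is about 0.75 words."""
    return round(len(text.split()) / 0.75)


sample = "Meritshot is a great learning platform"
print(estimate_tokens_by_chars(sample))  # 38 characters -> 10
print(estimate_tokens_by_words(sample))  # 6 words -> 8
```

The two heuristics disagree slightly, which is a useful reminder that they are for budgeting, not billing.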

Current Model Pricing Comparison

OpenAI offers several models at different price points. Four that are commonly used for applications are:

Model | Input (per 1M tokens) | Output (per 1M tokens) | Best for
gpt-4o-mini | ~$0.15 | ~$0.60 | High-volume tasks, classification, simple Q&A
gpt-4o | ~$2.50 | ~$10.00 | Complex reasoning, nuanced generation, coding
gpt-4.1-nano | ~$0.10 | ~$0.40 | Very high volume, simple tasks
gpt-4.1 | ~$2.00 | ~$8.00 | Balanced performance and cost

Note: Prices change and new models are released frequently. Always check platform.openai.com/docs/models for the current pricing before budgeting.

Practical Cost Calculation

Imagine you are building a product description generator for an Indian e-commerce seller. Each product description request involves:

  • System prompt: 200 tokens
  • Product details input: 100 tokens
  • Generated description output: 300 tokens
  • Total per request: 600 tokens (300 input + 300 output)

Using gpt-4o-mini:

  • Input cost: 300 tokens x ($0.15 / 1,000,000) = $0.000045
  • Output cost: 300 tokens x ($0.60 / 1,000,000) = $0.00018
  • Total per request: approximately $0.000225

At this rate, processing 10,000 product descriptions costs about $2.25, or roughly ₹190 assuming an exchange rate of about ₹84 to the dollar. That is dramatically cheaper than hiring a copywriter.

Using gpt-4o for the same task:

  • Total per request: approximately $0.00375
  • 10,000 descriptions: ~$37.50 (roughly ₹3,150 at the same exchange rate)

The choice between models depends on how much quality you need. For simple, structured tasks, gpt-4o-mini is usually sufficient.
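
The arithmetic above is easy to script so you can rerun it when prices or prompt sizes change. The sketch below uses the approximate prices from the table in this section; the function name request_cost_usd and the PRICES dictionary are illustrative, and the figures should be replaced with current prices before you budget.

```python
def request_cost_usd(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Cost of one request in USD, given per-million-token prices."""
    input_cost = input_tokens * input_price_per_1m / 1_000_000
    output_cost = output_tokens * output_price_per_1m / 1_000_000
    return input_cost + output_cost


# Approximate prices (input, output) per 1M tokens, copied from the table above.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
}

for model, (input_price, output_price) in PRICES.items():
    per_request = request_cost_usd(300, 300, input_price, output_price)
    print(f"{model}: ${per_request:.6f} per request, ${per_request * 10_000:.2f} per 10,000 requests")
```

With 300 input and 300 output tokens per request, this reproduces the $2.25 and $37.50 totals worked out by hand.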


4. Your First API Call with curl

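curl is a command-line tool for making HTTP requests. It ships with macOS and most Linux distributions, and recent versions of Windows 10 and 11 include it as well, although the quoting in the command below assumes a Unix-style shell such as WSL or Git Bash. This is the most direct way to see the API in action without any code setup.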

Open your terminal and run:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of Rajasthan?"
      }
    ],
    "max_tokens": 50
  }'

The $OPENAI_API_KEY refers to your environment variable. If you have not set it yet, you can set it temporarily in your terminal session:

export OPENAI_API_KEY="sk-proj-your-key-here"
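
To make the anatomy of that curl command explicit, here is the same request assembled with Python's standard library. It is a sketch for understanding the HTTP call (a POST with two headers and a JSON body), not a replacement for the official library; build_request and send are hypothetical helper names, and send needs network access and a valid key, so it is defined but not called here.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(api_key, prompt, model="gpt-4o-mini", max_tokens=50):
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),  # equivalent of curl's -d
        headers={
            "Content-Type": "application/json",     # first -H flag
            "Authorization": f"Bearer {api_key}",   # second -H flag
        },
        method="POST",
    )


def send(request):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_request("sk-proj-your-key-here", "What is the capital of Rajasthan?")
print(request.get_method(), request.full_url)
print(json.loads(request.data)["messages"])
```

Seeing the pieces laid out this way makes it clear that the API key travels in a header, while the model, messages, and max_tokens travel in the JSON body.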

Understanding the Response

The API returns a JSON object:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1720000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Rajasthan is Jaipur."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 17,
    "completion_tokens": 9,
    "total_tokens": 26
  }
}

The fields you care about most:

  • choices[0].message.content — the model's response text
  • choices[0].finish_reason — why it stopped (stop = natural end, length = hit max_tokens limit)
  • usage — token counts for billing purposes
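
Pulling those fields out of the raw JSON can be sketched as follows. The sample string is a shortened copy of the response shown above, and summarise_response is a hypothetical helper name used only for illustration.

```python
import json

# Shortened copy of the sample response shown above.
RAW_RESPONSE = """
{
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of Rajasthan is Jaipur."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 17, "completion_tokens": 9, "total_tokens": 26}
}
"""


def summarise_response(data):
    """Extract the fields most applications need from a parsed response."""
    choice = data["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": data["usage"]["total_tokens"],
    }


summary = summarise_response(json.loads(RAW_RESPONSE))
print(summary["text"])
print(summary["finish_reason"], summary["total_tokens"])
```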

5. Your First API Call with Python

Python is the most common language for working with the OpenAI API. Install the official library:

pip install openai python-dotenv

Then create a file called first_call.py:

import os
from dotenv import load_dotenv
from openai import OpenAI

# Load API key from .env file
load_dotenv()

# Initialise the client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Make the API call
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Explain compound interest in two sentences, using a ₹10,000 SIP example."
        }
    ],
    max_tokens=150
)

# Extract and print the response
print(response.choices[0].message.content)
print(f"\nTokens used: {response.usage.total_tokens}")

Run it:

python first_call.py

You should see the model's explanation printed to the terminal, followed by the token count.


6. The Messages Array — Understanding Roles

The core data structure of the Chat Completions API is the messages array. Every call to the API takes a list of messages, each with a role and content. There are three roles:

system

The system message sets the context, persona, and rules for the model. It comes first in the array and shapes how the model interprets everything that follows. If you do not include a system message, the model uses its default helpful-assistant behaviour.

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant for an Indian tax filing platform. Answer questions about ITR filing clearly and accurately. Always recommend consulting a CA for complex cases."
    }
]

user

The user message is the input from the person (or from your application on the user's behalf). This is the question or instruction.

messages.append({
    "role": "user",
    "content": "I am a freelancer earning ₹8 lakh per year. Which ITR form should I use?"
})

assistant

The assistant message is the model's response. When you are building a multi-turn conversation, you include previous assistant responses in the messages array so the model has context about what it already said.

messages.append({
    "role": "assistant",
    "content": "As a freelancer, you would typically file using ITR-3 or ITR-4 (Sugam), depending on whether you opt for the presumptive taxation scheme under Section 44ADA."
})

A Complete Multi-Turn Example

messages = [
    {"role": "system", "content": "You are a knowledgeable assistant for Indian tax questions."},
    {"role": "user", "content": "Which ITR form should a freelancer use?"},
    {"role": "assistant", "content": "Freelancers typically use ITR-4 (Sugam) under Section 44ADA if their income is under ₹75 lakh."},
    {"role": "user", "content": "What if my income is above ₹75 lakh?"},
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)
print(response.choices[0].message.content)

By including the prior exchange in the messages array, the model understands that "above ₹75 lakh" refers to the freelancer's income from the previous turn. Without that history, it would have no context.

Why the API Is Stateless

The API has no memory between calls. Every request must include the full conversation history you want the model to be aware of. This is different from the chat interface, which maintains history automatically. The implication for application developers is that you are responsible for storing and managing conversation history — which we cover in depth in the next chapter.
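
One simple way to manage that history yourself is a small wrapper that appends every user message and every reply to a single list. The sketch below is an assumption-laden illustration, not the approach this series prescribes: the Conversation class is hypothetical, and fake_complete stands in for the real API call so the example runs without a key or network access.

```python
class Conversation:
    """Keeps the full message history and resends it on every turn."""

    def __init__(self, system_prompt, complete):
        self.messages = [{"role": "system", "content": system_prompt}]
        self._complete = complete  # any function: list of messages -> reply text

    def ask(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        reply = self._complete(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_complete(messages):
    """Stand-in for the API call; reports how much history it received."""
    user_turns = sum(1 for m in messages if m["role"] == "user")
    return f"(reply to user turn {user_turns}, saw {len(messages)} messages)"


chat = Conversation("You are a knowledgeable assistant for Indian tax questions.", fake_complete)
print(chat.ask("Which ITR form should a freelancer use?"))
print(chat.ask("What if my income is above ₹75 lakh?"))

# With the real client, the stand-in could be swapped for something like:
# lambda msgs: client.chat.completions.create(
#     model="gpt-4o-mini", messages=msgs
# ).choices[0].message.content
```

The second call receives four messages (system, user, assistant, user), which is exactly the history the model needs to resolve the follow-up question. Note that the list grows with every turn, and so do the input tokens you pay for.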


7. Key Parameters

Beyond model and messages, the API accepts several important parameters:

Parameter | Type | What it does
max_tokens | integer | Maximum tokens in the response. Prevents runaway long outputs.
temperature | float 0.0–2.0 | Controls randomness. 0 = nearly deterministic, 1 = default, 2 = highly random.
top_p | float 0.0–1.0 | Alternative to temperature for sampling control. Usually leave at 1.0 if using temperature.
n | integer | Number of response choices to generate. Default 1.
stop | string or list | Sequences at which the model stops generating.
stream | boolean | If true, streams tokens as they are generated. Covered in the next chapter.
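
A small helper can check these ranges before a request is sent, which turns a remote API error into an immediate local one. This is a sketch under the ranges listed in the table above; build_params is a hypothetical name, and the API performs its own validation regardless.

```python
def build_params(model, messages, max_tokens=None, temperature=None,
                 top_p=None, n=None, stop=None):
    """Validate optional parameters and return keyword arguments for the API call."""
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0.0 and 1.0")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")
    if n is not None and n < 1:
        raise ValueError("n must be a positive integer")

    params = {"model": model, "messages": messages}
    optional = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "n": n,
        "stop": stop,
    }
    # Only send parameters that were set, so the API defaults apply to the rest.
    params.update({key: value for key, value in optional.items() if value is not None})
    return params


params = build_params(
    "gpt-4o-mini",
    [{"role": "user", "content": "Name one Indian classical dance form."}],
    max_tokens=20,
    temperature=0.0,
)
print(sorted(params))
# Usage with the client from the previous section:
# response = client.chat.completions.create(**params)
```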

Temperature in Practice

Temperature is the parameter you will tune most often:

# For a classification task (always want consistent output)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Classify this review as Positive, Negative, or Neutral: 'Great product, fast delivery!'"}],
    temperature=0.0
)

# For creative writing (want variety)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a tagline for a premium Darjeeling tea brand."}],
    temperature=1.2
)
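
To build intuition for why low temperatures give consistent output and high temperatures give variety, the sketch below applies temperature to a toy set of scores using the standard softmax-with-temperature formula. It is a simplified illustration of the general sampling idea, not a description of OpenAI's internal implementation; the scores and the function name apply_temperature are made up for the example.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature == 0:
        # Limit case: all probability goes to the highest-scoring option.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
scores = [2.0, 1.0, 0.5]
for t in (0, 0.5, 1.0, 2.0):
    probs = apply_temperature(scores, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the probabilities move closer together, so less likely words are picked more often. That is the variety you want for taglines and the inconsistency you do not want for classification.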

Common Pitfalls

Pitfall 1 — Hardcoding the API key in source code. If you commit a file with your API key to GitHub, bots scan public repositories and will find it within minutes, generating charges on your account. Always use environment variables.

Pitfall 2 — Not setting max_tokens. Without a limit, the model will generate until it naturally stops. For classification tasks or short-answer queries, this wastes tokens and adds latency. Set a reasonable upper bound.

Pitfall 3 — Choosing the wrong model for the task. Using gpt-4o for simple classification tasks is like using a sledgehammer to crack a nut — expensive and slow. Use gpt-4o-mini for high-volume simple tasks; reserve gpt-4o for tasks requiring nuanced reasoning.

Pitfall 4 — Not checking finish_reason. If finish_reason is length rather than stop, the model hit your max_tokens limit and the response was cut off. If the truncated part matters, increase max_tokens or ask the model for a shorter answer.

Pitfall 5 — Ignoring the usage field. Track token consumption from the start. In a production application, log usage for every call so you can monitor costs, spot unexpected spikes, and optimise your prompts.

Pitfall 6 — Using temperature=0 for creative tasks. At temperature 0, the model is nearly deterministic — it will generate the same output for the same input almost every time. This is ideal for classification or data extraction, but produces repetitive, uncreative output for marketing copy or storytelling.
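
Pitfalls 4 and 5 can both be addressed with a small amount of bookkeeping around each call. The sketch below is one possible shape for it, assuming responses have been parsed into dictionaries with the structure shown in the curl section; UsageTracker is a hypothetical class, and with the Python library you would read the same values from attributes such as response.usage.total_tokens.

```python
class UsageTracker:
    """Accumulates token usage and counts truncated responses across calls."""

    def __init__(self):
        self.calls = 0
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.truncated = 0

    def record(self, response):
        usage = response["usage"]
        self.calls += 1
        self.prompt_tokens += usage["prompt_tokens"]
        self.completion_tokens += usage["completion_tokens"]
        # finish_reason == "length" means the reply hit the max_tokens limit.
        if response["choices"][0]["finish_reason"] == "length":
            self.truncated += 1

    @property
    def total_tokens(self):
        return self.prompt_tokens + self.completion_tokens


# Two made-up responses in the same shape as the real API output.
responses = [
    {"choices": [{"finish_reason": "stop"}],
     "usage": {"prompt_tokens": 17, "completion_tokens": 9}},
    {"choices": [{"finish_reason": "length"}],
     "usage": {"prompt_tokens": 40, "completion_tokens": 50}},
]

tracker = UsageTracker()
for response in responses:
    tracker.record(response)

print(f"{tracker.calls} calls, {tracker.total_tokens} tokens, {tracker.truncated} truncated")
```

In a production application the same numbers would typically go to your logging or metrics system instead of being printed.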


Practice Exercises

  1. Set up your API key as an environment variable and make your first curl call to the Chat Completions endpoint. Request a 3-sentence explanation of how UPI works. Print the response and the token usage.

  2. Write a Python script that reads a product name and price from user input on the command line and generates a 50-word product description for an Indian e-commerce listing. Use gpt-4o-mini and print the token cost alongside the description.

  3. Build a multi-turn conversation in Python: start with a system prompt setting the model as an Indian cooking assistant. Send at least 3 user messages and include the full conversation history in each request. Print each response.

  4. Experiment with temperature: send the prompt "Write a slogan for a vegetarian fast-food chain in India" with temperatures 0.0, 0.7, 1.0, and 1.5. Run each 3 times and observe how the outputs vary across temperatures.

  5. Write a script that processes a list of 10 customer reviews (you can make them up) and classifies each as Positive, Negative, or Neutral using the API with temperature=0. Print the results in a table alongside each review and calculate the total tokens used.


Summary

  • The ChatGPT API exposes the same underlying model as the chat interface but gives developers programmatic, automated, and high-volume access with full parameter control.
  • An API key is generated at platform.openai.com/api-keys and should always be stored in environment variables, never hardcoded or committed to version control.
  • Pricing is per token, with separate rates for input and output tokens. gpt-4o-mini is significantly cheaper than gpt-4o and is the right default for high-volume or simple tasks.
  • The curl command lets you test API calls directly from the terminal without writing any Python.
  • The Python openai library wraps the API in a clean interface: client.chat.completions.create() is the primary method you will use.
  • The messages array is the core data structure: system sets context and rules, user provides input, and assistant holds the model's previous responses for multi-turn conversations.
  • The API is stateless — you must include the full conversation history in every request; there is no built-in memory between calls.
  • Key tuning parameters include max_tokens (limits response length), temperature (controls randomness), and model (balances cost vs. capability).