Now that you understand the fundamentals of LLM APIs and how authentication works, let's get practical. Sending prompts and receiving completions programmatically involves making HTTP requests from your application to the LLM provider's API endpoint. Python, with its rich ecosystem of libraries, provides effective ways to handle these interactions.
There are two primary approaches for making API calls in Python:
- **Direct HTTP requests:** Libraries such as `requests` allow you to construct and send HTTP requests manually. This gives you fine-grained control over the request details.
- **Official provider SDKs:** Client libraries that wrap these HTTP details behind a higher-level, more Pythonic interface.

## The `requests` Library

The `requests` library is a standard choice for making HTTP requests in Python due to its simplicity and power. If you haven't installed it yet, you can do so using pip:
```bash
pip install requests
```
To interact with an LLM API using `requests`, you typically need to send an HTTP POST request. This request includes:

- **The endpoint URL:** The address the provider exposes for generation (e.g., `https://api.openai.com/v1/chat/completions`).
- **Headers:**
  - `Authorization`: Contains your API key for authentication (usually prefixed with `Bearer`).
  - `Content-Type`: Specifies the format of the data you are sending, almost always `application/json`.
- **The request body (payload):** A JSON object containing the prompt and generation parameters such as `temperature` and `max_tokens`.

Let's illustrate with an example targeting a hypothetical LLM API endpoint. Assume you have your API key stored in an environment variable `LLM_API_KEY`.
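Before running the script, make sure that variable is actually set in your environment; for example, in a Unix-like shell (the value shown is a placeholder):

```bash
export LLM_API_KEY="your-secret-key"
```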
```python
import requests
import os
import json

# Retrieve the API key from environment variables
api_key = os.getenv("LLM_API_KEY")
if not api_key:
    raise ValueError("API key not found. Set the LLM_API_KEY environment variable.")

# Define the API endpoint
api_url = "https://api.example-llm-provider.com/v1/generate"  # Replace with actual endpoint

# Set up the request headers
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

# Construct the request payload (body)
payload = {
    "model": "example-model-v2",  # Specify the desired LLM
    "prompt": "Explain the difference between HTTP GET and POST requests.",
    "max_tokens": 150,
    "temperature": 0.7,
    "stop_sequences": ["\n"],  # Optional: sequences where generation should stop
}

try:
    # Send the POST request
    response = requests.post(api_url, headers=headers, data=json.dumps(payload))

    # Check if the request was successful (status code 200 OK)
    response.raise_for_status()  # Raises an HTTPError for bad responses (4xx or 5xx)

    # Parse the JSON response
    result = response.json()
    print("API Response:")
    print(result)

    # Extract the generated text (structure depends on the API)
    if "choices" in result and len(result["choices"]) > 0:
        generated_text = result["choices"][0].get("text", "No text found")
        print("\nGenerated Text:")
        print(generated_text.strip())
    else:
        print("\nCould not extract generated text from response.")

except requests.exceptions.RequestException as e:
    print(f"An error occurred during the API request: {e}")
except json.JSONDecodeError:
    print(f"Failed to decode JSON response: {response.text}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
```
In this example:

- We import the necessary libraries (`requests`, `os`, `json`).
- We construct the `headers` dictionary, including the `Authorization` token and the `Content-Type`.
- We build the `payload` dictionary containing the prompt and model parameters. The exact structure and parameter names vary by LLM API provider, so consult their documentation.
- `requests.post()` sends the request. We pass the URL, the headers, and the JSON-serialized payload (`json.dumps(payload)`).
- `response.raise_for_status()` provides basic error checking. It raises an exception if the API returns an error status code (such as 401 Unauthorized, 429 Rate Limit Exceeded, or 500 Internal Server Error).
- `response.json()` parses the JSON response body into a Python dictionary.
- Error handling (`try...except`) is included to catch potential network issues, JSON parsing problems, or other exceptions.

Using `requests` provides transparency and works with any HTTP-based API, but it requires you to handle details like JSON serialization and header construction manually.
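One small convenience worth knowing: `requests` can perform the JSON serialization itself. Passing the payload via the `json=` keyword argument serializes it and sets the `Content-Type: application/json` header automatically, so the call above could be shortened to:

```python
# Equivalent to data=json.dumps(payload); Content-Type is set for you
response = requests.post(
    api_url,
    headers={"Authorization": f"Bearer {api_key}"},
    json=payload,
)
```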
## Using Provider SDKs

LLM providers often supply official Python SDKs to simplify interaction with their APIs. These libraries abstract away the lower-level HTTP request details, offering a more Pythonic interface.

For example, using the OpenAI Python library (install with `pip install openai`), a chat completion request looks like this:
```python
import os
from openai import OpenAI, OpenAIError

# SDKs often automatically look for the API key in environment variables
# (e.g., OPENAI_API_KEY), or you can pass it during client initialization.
# Ensure OPENAI_API_KEY is set in your environment.
try:
    client = OpenAI()  # API key is implicitly read from the OPENAI_API_KEY env var

    # Make the API call using the SDK's methods
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Explain the concept of recursion in programming.",
            }
        ],
        model="gpt-3.5-turbo",  # Specify the model
        max_tokens=100,
        temperature=0.8,
    )

    # Access the response content through SDK objects
    print("API Response Object:")
    # print(chat_completion)  # The raw SDK object can be verbose

    if chat_completion.choices:
        response_content = chat_completion.choices[0].message.content
        print("\nGenerated Text:")
        print(response_content.strip())
    else:
        print("No response choices found.")

    # Print usage information if available
    if chat_completion.usage:
        print("\nToken Usage:")
        print(f"Prompt Tokens: {chat_completion.usage.prompt_tokens}")
        print(f"Completion Tokens: {chat_completion.usage.completion_tokens}")
        print(f"Total Tokens: {chat_completion.usage.total_tokens}")

except OpenAIError as e:
    print(f"An OpenAI API error occurred: {e}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
```
Notice the differences compared to using `requests`:

- **Client initialization:** We create a `client` object from the `openai` library. Authentication is often handled implicitly by the SDK (e.g., reading `OPENAI_API_KEY` from environment variables).
- **Dedicated methods:** We call a method (`client.chat.completions.create`) designed specifically for the chat completions endpoint.
- **Function arguments:** Parameters like `model`, `messages`, `max_tokens`, and `temperature` are passed as function arguments rather than assembled into a JSON body by hand.
- **Structured responses:** The SDK returns typed objects, so you access fields such as `choices[0].message.content` or `usage` statistics as attributes.
- **Specific exceptions:** The library provides its own exception types (e.g., `OpenAIError`) for better error handling related to the API.

You might still prefer `requests` when:

- You are working with a less common API, or the provider offers no official SDK.
- You need direct control over the network interaction (timeouts, proxies, retry behavior).
- You are debugging and want full visibility into the raw HTTP requests and responses.
For most application development, using the official provider SDK is recommended, as it simplifies development and maintenance, while the `requests` approach remains a valuable fallback in the situations above. Regardless of the method, secure handling of API keys and robust error handling are essential components of reliable application development.
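To make that error handling concrete, here is a minimal sketch of one common robustness pattern: retrying with exponential backoff when the server signals rate limiting (429) or a transient failure (5xx). It reuses the hypothetical endpoint and payload names from the `requests` example above:

```python
import os
import time

import requests

api_url = "https://api.example-llm-provider.com/v1/generate"  # Hypothetical endpoint
headers = {"Authorization": f"Bearer {os.getenv('LLM_API_KEY')}"}
payload = {"model": "example-model-v2", "prompt": "Hello!", "max_tokens": 50}

max_retries = 3
for attempt in range(max_retries):
    response = requests.post(api_url, headers=headers, json=payload, timeout=30)
    if response.status_code == 429 or response.status_code >= 500:
        # Back off exponentially: 1s, 2s, 4s, ...
        time.sleep(2 ** attempt)
        continue
    response.raise_for_status()  # Surface any other 4xx error immediately
    break
else:
    raise RuntimeError(f"Request still failing after {max_retries} attempts")

print(response.json())
```

Production code would typically also honor the `Retry-After` header when the provider sends one, but the loop above captures the basic idea.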