Before we build our first FastAPI application, it's important to understand the underlying mechanisms that make web interactions possible: APIs and HTTP methods.
An API, or Application Programming Interface, acts as an intermediary that allows two different software applications to communicate with each other. Think of it like a menu in a restaurant. The menu (API) provides a list of dishes (operations) you can order, along with a description (data format) for each dish. You (the client application) make a request based on the menu, and the kitchen (the server application) prepares the dish and sends it back to you. You don't need to know the exact recipe or how the kitchen operates; you just need to know how to order from the menu.
In the context of web development and model deployment, we typically work with Web APIs. These APIs use the Hypertext Transfer Protocol (HTTP) to enable communication between a client (like a web browser, a mobile app, or another backend service) and a server (where our FastAPI application and ML model reside). The client sends an HTTP request to the server, and the server sends back an HTTP response.
This interaction follows a standard client-server model:
Figure: A simplified view of the client-server interaction using HTTP.
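To make the request/response cycle concrete, here is a minimal sketch that uses only Python's standard library rather than FastAPI. The handler class `InfoHandler`, the helper `round_trip`, and the JSON payload are illustrative names and values, not part of any real service. The script starts a throwaway server on a local port, sends one GET request to it, and returns the status code, the `Content-Type` header, and the decoded body.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class InfoHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with a small JSON document."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "hello from the server"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def round_trip(path: str) -> tuple[int, str, dict]:
    """Start a local server, send one GET request, and return the response parts."""
    server = HTTPServer(("127.0.0.1", 0), InfoHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: open a connection, send the request, read the response.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (
            response.status,
            response.getheader("Content-Type"),
            json.loads(response.read()),
        )
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, content_type, payload = round_trip("/models/iris-classifier")
    print(status, content_type, payload)
```

In a real deployment the client and server run on different machines and the server side would be the FastAPI application, but the exchange has the same shape: one request in, one response out.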
Many web APIs, including those built with FastAPI, adhere to the principles of REST (Representational State Transfer). REST isn't a strict protocol but rather an architectural style that defines a set of constraints for building scalable, stateless, and maintainable web services. Key ideas include:
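- Resources: everything the API exposes (a model, a prediction, a configuration) is treated as a resource identified by a URL, such as `/models/iris-classifier` or `/predict`.
- Statelessness: each request carries all the information the server needs to handle it, so the server keeps no client session state between requests.
- Standard methods: clients act on resources through a small, uniform set of HTTP methods.

HTTP defines several request methods (often called "verbs") that indicate the desired action to be performed on a resource identified by the URL. FastAPI uses these methods to route incoming requests to the correct Python functions in your code. The most common methods you'll encounter are: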
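
- GET: retrieves data from the server without modifying it. Examples: fetching information about a model (`GET /models/info/resnet50`). Retrieving past prediction results (`GET /predictions/123`).
- POST: sends data to the server to be processed or to create a new resource. Examples: submitting input data for a prediction (`POST /predict/image`). Submitting data to train or fine-tune a model (though often done offline).
- PUT: replaces the resource at the given URL with the data supplied in the request. Examples: updating a model's configuration (`PUT /models/config/iris-classifier`). Completely replacing a model file (less common via direct API, but possible).
- DELETE: removes the resource at the given URL. Examples: removing a model version (`DELETE /models/version/spam-filter-v1`). Deleting a stored prediction result (`DELETE /predictions/456`).

Other methods like `PATCH` (for partial updates), `HEAD` (like GET but without the response body), and `OPTIONS` (to get communication options for a resource) also exist, but basic ML deployment APIs use them far less often than GET and POST.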
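The idea of routing a request to a Python function based on its method and path can be sketched in plain Python. This is a simplified illustration of the concept, not FastAPI's actual implementation; the `route` decorator, the `ROUTES` table, the `dispatch` function, and the returned values are all hypothetical.

```python
from typing import Callable

# Hypothetical route table: (HTTP method, path) -> handler function.
ROUTES: dict[tuple[str, str], Callable[[], dict]] = {}


def route(method: str, path: str):
    """Register the decorated function as the handler for one method/path pair."""
    def decorator(func: Callable[[], dict]) -> Callable[[], dict]:
        ROUTES[(method.upper(), path)] = func
        return func
    return decorator


@route("GET", "/models/info/resnet50")
def model_info() -> dict:
    # Placeholder metadata; a real service would look this up.
    return {"name": "resnet50", "task": "image-classification"}


@route("POST", "/predict/image")
def predict_image() -> dict:
    # Placeholder result; no real model is called here.
    return {"label": "cat"}


def dispatch(method: str, path: str) -> tuple[int, dict]:
    """Return (status code, body) for an incoming request."""
    handler = ROUTES.get((method.upper(), path))
    if handler is not None:
        return 200, handler()
    # The path is known but not for this method: 405. Unknown path: 404.
    if any(known_path == path for _, known_path in ROUTES):
        return 405, {"detail": "Method Not Allowed"}
    return 404, {"detail": "Not Found"}


if __name__ == "__main__":
    print(dispatch("GET", "/models/info/resnet50"))
    print(dispatch("DELETE", "/predict/image"))
    print(dispatch("GET", "/unknown"))
```

The same path can therefore mean different things under different methods, which is why both parts are needed to pick a handler.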
An HTTP request typically consists of:
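
- A method: the HTTP verb, such as GET or POST.
- A URL (path): identifies the resource the request targets.
- Headers: metadata about the request (e.g., `Content-Type: application/json` indicating the format of the body, `Authorization` for credentials).
- A body (optional): the data sent with the request, typically with POST or PUT and often formatted as JSON.

An HTTP response typically includes: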

- A status code: a three-digit number indicating the outcome of the request (e.g., `200 OK`, `201 Created`, `400 Bad Request`, `404 Not Found`, `500 Internal Server Error`).
- Headers: metadata about the response (e.g., `Content-Type: application/json`, `Content-Length`).
- A body (optional): the data returned by the server, such as a prediction result in JSON.

FastAPI provides convenient ways to define endpoints that correspond to specific URL paths and HTTP methods. It automatically handles parsing request data (like JSON bodies) and formatting response data based on the Python types and Pydantic models you define, which we will cover in the next chapter. Understanding these HTTP concepts is foundational for building effective web APIs for your machine learning models.
© 2025 ApX Machine Learning