Let's put the concepts from this chapter into practice. We'll create a Pydantic model to validate the input data for a hypothetical machine learning model, specifically one designed to predict house prices based on certain features. As discussed, ensuring the input data conforms to the expected structure and constraints is fundamental before feeding it into any model for inference.
Imagine our house price prediction model requires the following input features:
- area_sqft: The living area in square feet (must be a positive number).
- bedrooms: The number of bedrooms (must be a non-negative integer).
- bathrooms: The number of bathrooms (can be half-bathrooms, so a positive float, e.g., 1.5).
- region: The region where the house is located (must be one of 'North', 'South', 'East', 'West').

We can define a Pydantic model to represent and validate this structure.
First, let's define an Enum for the allowed regions and then create our Pydantic BaseModel.
# models.py
# Note: written against the Pydantic v1 API (validator, Config.schema_extra).
from pydantic import BaseModel, Field, validator
from enum import Enum


class HouseRegion(str, Enum):
    """Permitted regions for house location."""
    NORTH = "North"
    SOUTH = "South"
    EAST = "East"
    WEST = "West"


class HouseFeatures(BaseModel):
    """Input features for the house price prediction model."""
    area_sqft: float = Field(..., gt=0, description="Living area in square feet, must be positive.")
    bedrooms: int = Field(..., ge=0, description="Number of bedrooms, must be non-negative.")
    bathrooms: float = Field(..., gt=0, description="Number of bathrooms, must be positive (e.g., 1.5 for 1 full, 1 half).")
    region: HouseRegion = Field(..., description="Region where the house is located.")

    # Example of a custom validator if needed, though Field constraints cover this case
    # @validator('area_sqft')
    # def area_must_be_positive(cls, v):
    #     if v <= 0:
    #         raise ValueError('Area must be positive')
    #     return v

    class Config:
        # Provides example data for documentation
        schema_extra = {
            "example": {
                "area_sqft": 1500.5,
                "bedrooms": 3,
                "bathrooms": 2.5,
                "region": "North"
            }
        }
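One detail of the enum is worth seeing in isolation: because HouseRegion subclasses str, its members compare equal to plain strings and can be looked up by value. The sketch below uses only the standard library and repeats the enum so it runs on its own; parse_region is a hypothetical helper added for illustration, not part of the application.

```python
from enum import Enum


class HouseRegion(str, Enum):
    """Same enum as in models.py, repeated so the sketch is self-contained."""
    NORTH = "North"
    SOUTH = "South"
    EAST = "East"
    WEST = "West"


def parse_region(raw: str):
    """Hypothetical helper: return the matching member, or None if not permitted."""
    try:
        return HouseRegion(raw)  # lookup by value; matching is case-sensitive
    except ValueError:
        return None


# Members compare equal to their string values because the enum subclasses str.
print(HouseRegion.NORTH == "North")
print(parse_region("West") is HouseRegion.WEST)
print(parse_region("Central"))
print([member.value for member in HouseRegion])
```

Pydantic performs an equivalent by-value lookup when it validates the region field, which is why a value outside the four listed strings is rejected.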
In this HouseFeatures model:

- We use standard Python type hints (float, int, HouseRegion) for basic type validation.
- Field is imported from Pydantic and used to add constraints:
  - ... indicates the field is required.
  - gt=0 means "greater than 0".
  - ge=0 means "greater than or equal to 0".
  - description helps document the field in the API schema.
- We use the HouseRegion enum to restrict the region field to specific string values.
- Config.schema_extra provides an example payload that will appear in the automatically generated API documentation.
- A commented-out @validator is shown as an example, although Pydantic's built-in Field constraints often suffice for simple checks like positivity.

Now, let's create a simple FastAPI application that uses this model to validate incoming request data for a prediction endpoint.
# main.py
from fastapi import FastAPI, HTTPException
from models import HouseFeatures  # Assuming models.py is in the same directory

app = FastAPI(
    title="House Price Prediction API",
    description="API to predict house prices based on features.",
    version="0.1.0",
)


@app.post("/predict/house_price")
async def predict_house_price(features: HouseFeatures):
    """
    Predicts the price of a house based on its features.

    This endpoint accepts house features and returns a placeholder prediction.
    Input validation is performed using the `HouseFeatures` model.
    """
    # In a real application, you would load your ML model here
    # and use the validated features: features.area_sqft, features.bedrooms, etc.
    # Example: prediction = model.predict([[features.area_sqft, features.bedrooms, ...]])

    # For this practice, we just return the validated data and a dummy prediction
    print(f"Received valid features: {features.dict()}")

    # Dummy prediction logic
    estimated_price = (features.area_sqft * 100) + (features.bedrooms * 5000) + (features.bathrooms * 3000)
    if features.region == "North":
        estimated_price *= 1.2
    elif features.region == "West":
        estimated_price *= 1.1

    return {"validated_features": features, "estimated_price": round(estimated_price, 2)}

# To run this app: uvicorn main:app --reload
Here, the /predict/house_price endpoint expects a POST request. By type-hinting the features parameter with our HouseFeatures model (features: HouseFeatures), FastAPI automatically:

- Parses the JSON request body and validates it against the HouseFeatures model.
- If validation succeeds, passes the result to the function: the features variable within the function will be an instance of HouseFeatures, containing the validated data.
- If validation fails, returns a 422 Unprocessable Entity HTTP error response detailing the validation errors.

Run the application using Uvicorn:
uvicorn main:app --reload
Now, you can test the endpoint using tools like curl or FastAPI's interactive documentation available at http://127.0.0.1:8000/docs.
Valid Request:
curl -X POST "http://127.0.0.1:8000/predict/house_price" \
  -H "Content-Type: application/json" \
  -d '{
    "area_sqft": 2150.75,
    "bedrooms": 4,
    "bathrooms": 3,
    "region": "West"
  }'
Expected Response (Status Code 200):
{
  "validated_features": {
    "area_sqft": 2150.75,
    "bedrooms": 4,
    "bathrooms": 3.0,
    "region": "West"
  },
  "estimated_price": 268482.5
}
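The estimated_price for this request can be reproduced by hand from the dummy formula in the endpoint. The sketch below restates that formula as a standalone function (estimate_price is a name chosen here for illustration) so the arithmetic can be checked without starting the server.

```python
def estimate_price(area_sqft: float, bedrooms: int, bathrooms: float, region: str) -> float:
    """Mirror of the endpoint's placeholder formula; not a real pricing model."""
    price = (area_sqft * 100) + (bedrooms * 5000) + (bathrooms * 3000)
    if region == "North":
        price *= 1.2
    elif region == "West":
        price *= 1.1
    return round(price, 2)


# Valid request: 215075 + 20000 + 9000 = 244075, then multiplied by 1.1 for "West".
print(estimate_price(2150.75, 4, 3, "West"))  # 268482.5
```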
Invalid Request (Negative Area):
curl -X POST "http://127.0.0.1:8000/predict/house_price" \
  -H "Content-Type: application/json" \
  -d '{
    "area_sqft": -100,
    "bedrooms": 2,
    "bathrooms": 1,
    "region": "South"
  }'
Expected Response (Status Code 422):
{
  "detail": [
    {
      "loc": [
        "body",
        "area_sqft"
      ],
      "msg": "ensure this value is greater than 0",
      "type": "value_error.number.not_gt",
      "ctx": {
        "limit_value": 0
      }
    }
  ]
}
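On the client side, a 422 body like this is straightforward to turn into readable messages. The following is a minimal sketch using only the standard library; summarize_validation_errors is a hypothetical helper, and it assumes the detail entries carry loc and msg keys as in the response shown here.

```python
import json


def summarize_validation_errors(body: str) -> list:
    """Turn a 422 response body into 'field: message' strings.

    Hypothetical client-side helper; assumes 'detail' is a list of error objects.
    """
    detail = json.loads(body).get("detail", [])
    if not isinstance(detail, list):
        # Other error responses may carry a plain string in 'detail'.
        return [str(detail)]
    summaries = []
    for err in detail:
        loc = [str(part) for part in err.get("loc", [])]
        if loc and loc[0] == "body":
            loc = loc[1:]  # drop the leading "body" marker
        summaries.append(f"{'.'.join(loc)}: {err.get('msg', '')}")
    return summaries


response_body = """
{"detail": [{"loc": ["body", "area_sqft"],
             "msg": "ensure this value is greater than 0",
             "type": "value_error.number.not_gt",
             "ctx": {"limit_value": 0}}]}
"""
print(summarize_validation_errors(response_body))
```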
Invalid Request (Incorrect Region):
curl -X POST "http://127.0.0.1:8000/predict/house_price" \
  -H "Content-Type: application/json" \
  -d '{
    "area_sqft": 1200,
    "bedrooms": 2,
    "bathrooms": 1.5,
    "region": "Central"
  }'
Expected Response (Status Code 422):
{
  "detail": [
    {
      "loc": [
        "body",
        "region"
      ],
      "msg": "value is not a valid enumeration member; permitted: 'North', 'South', 'East', 'West'",
      "type": "type_error.enum",
      "ctx": {
        "enum_values": [
          "North",
          "South",
          "East",
          "West"
        ]
      }
    }
  ]
}
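The same failures can be reproduced without HTTP by instantiating the model directly and catching Pydantic's ValidationError. The sketch below is self-contained, so it repeats a trimmed version of the model (descriptions and Config omitted); failing_fields is a hypothetical helper. It inspects only the loc of each error, since the exact msg and type strings differ between Pydantic major versions.

```python
from enum import Enum

from pydantic import BaseModel, Field, ValidationError


class HouseRegion(str, Enum):
    NORTH = "North"
    SOUTH = "South"
    EAST = "East"
    WEST = "West"


class HouseFeatures(BaseModel):
    """Trimmed copy of the model from models.py (same types and constraints)."""
    area_sqft: float = Field(..., gt=0)
    bedrooms: int = Field(..., ge=0)
    bathrooms: float = Field(..., gt=0)
    region: HouseRegion


def failing_fields(payload: dict) -> list:
    """Hypothetical helper: names of fields that fail validation (empty if valid)."""
    try:
        HouseFeatures(**payload)
    except ValidationError as exc:
        # Each error dict has a 'loc' tuple whose first element is the field name.
        return sorted(str(err["loc"][0]) for err in exc.errors())
    return []


valid = {"area_sqft": 2150.75, "bedrooms": 4, "bathrooms": 3, "region": "West"}
negative_area = {"area_sqft": -100, "bedrooms": 2, "bathrooms": 1, "region": "South"}
bad_region = {"area_sqft": 1200, "bedrooms": 2, "bathrooms": 1.5, "region": "Central"}

print(failing_fields(valid))
print(failing_fields(negative_area))
print(failing_fields(bad_region))
```

Note that loc here has no leading "body" element; FastAPI adds that prefix when it reports errors for a request body.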
This hands-on exercise demonstrates how Pydantic models, when integrated with FastAPI endpoints, provide a powerful and declarative way to enforce data contracts. This automatic validation shields your downstream logic, including ML model inference, from handling malformed or nonsensical input data, leading to more reliable and maintainable applications. The clear error messages also significantly aid API consumers in debugging their requests.
Request validation flow using Pydantic within a FastAPI endpoint. Valid data proceeds to the endpoint logic, while invalid data triggers an automatic error response.
© 2025 ApX Machine Learning