While organizing your application with `APIRouter` helps group related endpoints, achieving truly maintainable and testable code requires thinking about how you structure the logic inside those endpoints and the code that supports them. This is where the principle of Separation of Concerns (SoC) becomes essential. SoC advocates breaking your application into distinct sections, each addressing a specific responsibility.
In the context of a FastAPI application serving machine learning models, applying SoC typically involves dividing the codebase into logical layers:
- API Layer (Presentation/Routing): Handles HTTP requests and responses. Its concerns include:
  - Defining endpoints and their HTTP methods (`@router.post`, `@router.get`, etc.).
  - Organizing those endpoints into a `routers/` directory (as discussed in the previous section).
- Business Logic Layer (Service/Domain Layer): Encapsulates the core functionality of your application, independent of how it is exposed via an API. For an ML API, this involves:
  - Preprocessing inputs, invoking the model (e.g., `model.predict()`), and post-processing the results.
  - Keeping this logic in modules under `services/` or `core_logic/`. These modules should ideally have no knowledge of FastAPI or HTTP specifics.
- Data Layer (Models/Schemas): Defines the structure and validation rules for the data your application handles. These are typically Pydantic models placed in `schemas.py` or within a `models/` directory (distinct from the directory containing serialized ML model files).

Consider a typical project layout adhering to SoC:
```
your_ml_api/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI app creation, include routers
│   ├── routers/
│   │   ├── __init__.py
│   │   └── inference.py     # API Layer: Routes for model inference
│   ├── services/
│   │   ├── __init__.py
│   │   └── prediction.py    # Business Logic Layer: Prediction service
│   ├── schemas/
│   │   ├── __init__.py
│   │   └── prediction.py    # Data Layer: Pydantic input/output models
│   ├── core/                # Optional: Shared components
│   │   ├── __init__.py
│   │   └── model_loader.py  # Logic for loading the ML model
│   └── ml_models/           # Directory for serialized model files
│       └── sentiment_model.joblib
├── tests/
│   └── ...                  # Application tests
└── requirements.txt
```
In this structure:

- `app/routers/inference.py` handles the HTTP POST request for predictions. It validates the input using a Pydantic model from `app/schemas/prediction.py` and calls a function or method in `app/services/prediction.py`.
- `app/services/prediction.py` contains the `predict_sentiment` function (or class). It takes the validated data, potentially performs preprocessing, uses the loaded model (perhaps obtained from `app/core/model_loader.py`) to make a prediction, performs post-processing, and returns the result. It knows nothing about HTTP requests or FastAPI decorators.
- `app/schemas/prediction.py` defines the `PredictionInput` and `PredictionOutput` Pydantic models.

The sketches below show what each of these files might contain.
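As a concrete illustration, here is a minimal sketch of the Data Layer. The `PredictionInput` and `PredictionOutput` names come from the structure above; the fields and constraints are assumptions for a sentiment task:

```python
# app/schemas/prediction.py -- Data Layer: request/response shapes only.
from pydantic import BaseModel, Field


class PredictionInput(BaseModel):
    # Assumed field: a single text to run sentiment analysis on.
    text: str = Field(..., min_length=1)


class PredictionOutput(BaseModel):
    label: str    # e.g. "positive" or "negative"
    score: float  # assumed: model confidence in [0, 1]
```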
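The Business Logic Layer then works only with those validated objects. This sketch assumes a scikit-learn-style classifier exposing `predict` and `predict_proba`; adapt the inference calls to your model's actual interface:

```python
# app/services/prediction.py -- Business Logic Layer: no FastAPI imports here.
from app.schemas.prediction import PredictionInput, PredictionOutput


def predict_sentiment(data: PredictionInput, model) -> PredictionOutput:
    """Run the loaded model on validated input and shape the result."""
    # Preprocessing: the real steps depend on how the model was trained.
    features = [data.text]

    # Inference: assumes a scikit-learn-like estimator interface.
    label = model.predict(features)[0]
    score = float(max(model.predict_proba(features)[0]))

    # Post-processing: map raw model output onto the response schema.
    return PredictionOutput(label=str(label), score=score)
```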
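A small shared component can load the serialized model once and cache it. The `get_model` function name is an assumption; only the file paths are fixed by the layout above:

```python
# app/core/model_loader.py -- load the serialized model once and reuse it.
from functools import lru_cache

import joblib


@lru_cache(maxsize=1)
def get_model():
    # Deserializing is relatively slow, so cache the model across requests.
    return joblib.load("app/ml_models/sentiment_model.joblib")
```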
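Finally, the API Layer stays limited to HTTP concerns and delegates the real work to the service. The route path and the use of `Depends` here are illustrative choices:

```python
# app/routers/inference.py -- API Layer: HTTP concerns only.
from fastapi import APIRouter, Depends

from app.core.model_loader import get_model
from app.schemas.prediction import PredictionInput, PredictionOutput
from app.services.prediction import predict_sentiment

router = APIRouter(prefix="/predict", tags=["inference"])


@router.post("/", response_model=PredictionOutput)
def predict(payload: PredictionInput, model=Depends(get_model)) -> PredictionOutput:
    # FastAPI has already validated `payload`; delegate to the service layer.
    return predict_sentiment(payload, model)
```

With this in place, `app/main.py` only needs to create the FastAPI app and call `app.include_router(...)`, as noted in the layout above.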
Adhering to this separation offers significant advantages, chief among them testability: you can test the business logic (`services/prediction.py`) independently of the API layer, writing unit tests that call the prediction functions directly with various inputs, without needing to simulate HTTP requests using `TestClient`. Similarly, the API layer can be tested by mocking the Business Logic layer it calls.

FastAPI's Dependency Injection system, which we'll touch on later, further facilitates this separation by letting you cleanly provide instances of your service classes or other dependencies (like a loaded model) to your API route functions without tightly coupling them.
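To make the testability point concrete, here is a sketch of a unit test that drives the service sketched earlier with a stub model, entirely without HTTP. The stub mirrors the scikit-learn-style interface assumed above:

```python
# tests/test_prediction_service.py -- unit test, no HTTP or TestClient needed.
from app.schemas.prediction import PredictionInput
from app.services.prediction import predict_sentiment


class StubModel:
    """Stands in for the real model; no joblib file required."""

    def predict(self, features):
        return ["positive"]

    def predict_proba(self, features):
        return [[0.1, 0.9]]


def test_predict_sentiment_returns_schema():
    result = predict_sentiment(PredictionInput(text="great product"), StubModel())
    assert result.label == "positive"
    assert result.score == 0.9
```

Conversely, to test the API layer in isolation, FastAPI's `app.dependency_overrides` can swap `get_model` for a stub while exercising the routes through `TestClient`.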
*Diagram: the flow of a request through the separated layers of a FastAPI ML application. The client interacts only with the API layer, which orchestrates calls to the business logic and data layers.*