Okay, your Flask application now has a dedicated endpoint ready to serve predictions. But how does it actually get the data it needs to make a prediction? When a user or another application wants a prediction, they need to send the input features (like measurements, text, or image data) to your API. This section explains how your Flask service can receive and handle this incoming data, typically formatted as JSON.
Imagine you need to send structured information over the internet. You need a format that's easy for both humans to read and write, and easy for machines to parse and generate. JSON (JavaScript Object Notation) fits this description perfectly.
JSON represents data as key-value pairs, similar to Python dictionaries, and ordered lists, similar to Python lists. Its simplicity and text-based nature make it the de facto standard for transferring data in web APIs.
Here's an example of how input features for a simple model might look in JSON format:
```json
{
  "sepal_length": 5.1,
  "sepal_width": 3.5,
  "petal_length": 1.4,
  "petal_width": 0.2
}
```
Or, if you expect multiple features as a list:
```json
{
  "features": [5.1, 3.5, 1.4, 0.2]
}
```
Using JSON provides a common language for the client sending the request and your Flask API receiving it.
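As a quick sketch of the client side (the URL below is a placeholder, not part of this service), the `features` payload above can be serialized with Python's standard `json` module and sent with a library such as `requests`:

```python
import json

# The same payload as the "features" example above.
payload = {"features": [5.1, 3.5, 1.4, 0.2]}

# json.dumps produces the JSON text that travels in the request body.
body = json.dumps(payload)
print(body)

# A client could then send it, for example with the requests library:
# import requests  # third-party
# response = requests.post("http://localhost:5000/predict", json=payload)
# print(response.json())
```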
Flask makes it straightforward to access data sent with an incoming web request. When a client sends data (like the JSON above) in the body of an HTTP request (typically using a POST method), Flask provides a global `request` object. You need to import it from the `flask` library:
```python
from flask import Flask, request, jsonify
import joblib  # Or pickle

# Assuming your Flask app is initialized
app = Flask(__name__)

# Load your model (adjust path as needed)
# model = joblib.load('your_model.pkl')
# preprocessor = joblib.load('your_preprocessor.pkl')  # If you have one

@app.route('/predict', methods=['POST'])
def predict():
    # Access incoming request data here
    pass  # We'll fill this in
```
Inside your route function (`predict` in this example), the `request` object holds all information about the incoming request, including headers, arguments, and, significantly, the data sent in the request body.

Since we expect the client to send data in JSON format, Flask provides a convenient method: `request.get_json()`. This method attempts to parse the request body as JSON and returns a Python dictionary (or list, depending on the JSON structure).
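To see the kind of Python object this hands back, note that the parsing step behaves like the standard library's `json.loads` applied to the request body text:

```python
import json

# A JSON object becomes a Python dict...
obj = json.loads('{"features": [5.1, 3.5, 1.4, 0.2]}')
print(type(obj).__name__)

# ...while a top-level JSON array becomes a Python list.
arr = json.loads('[1, 2, 3]')
print(type(arr).__name__)
```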
Let's update our `predict` function to use it:
```python
# (Flask app setup and model loading above)

@app.route('/predict', methods=['POST'])
def predict():
    # Check if the request content type is JSON
    if not request.is_json:
        return jsonify({"error": "Request must be JSON"}), 400

    # Parse the JSON data from the request body
    try:
        data = request.get_json()
        app.logger.info(f"Received data: {data}")  # Log received data
    except Exception as e:
        app.logger.error(f"Error parsing JSON: {e}")
        return jsonify({"error": "Could not parse JSON data"}), 400

    # --- Next steps: Validate and use the data ---
    # (Placeholder for validation and prediction)
    return jsonify({"message": "Data received, processing..."})  # Temporary response
```
Here's what's happening:

1. First, we check `request.is_json`. This looks at the `Content-Type` header of the request (e.g., `application/json`) to give us confidence that the client intended to send JSON. If not, we return an error response with HTTP status code 400 (Bad Request).
2. Next, we call `request.get_json()` to parse the data. This method can raise an exception if the data isn't valid JSON, so we wrap it in a `try...except` block for basic error handling.
3. We log the received data using `app.logger.info`. Logging is very useful for debugging.

Just receiving JSON isn't enough. Your model expects data in a specific structure and format. Before feeding the data to your model, you should perform some validation:

- If you expect a key named `features`, you should check if that key exists.
- If `features` should be a list of numbers, check if it's actually a list and potentially if its elements look like numbers. More complex type validation is possible but might be overkill for a simple service.

Let's add a simple structure check:
```python
# (Flask app setup and model loading above)

@app.route('/predict', methods=['POST'])
def predict():
    if not request.is_json:
        return jsonify({"error": "Request must be JSON"}), 400

    try:
        data = request.get_json()
        app.logger.info(f"Received data: {data}")
    except Exception as e:
        app.logger.error(f"Error parsing JSON: {e}")
        return jsonify({"error": "Could not parse JSON data"}), 400

    # --- Basic Validation ---
    # Example: Check if 'features' key exists and is a list
    if 'features' not in data or not isinstance(data['features'], list):
        app.logger.error("Invalid input: 'features' key missing or not a list")
        return jsonify({"error": "Missing or invalid 'features' key. Expected a list."}), 400

    input_features = data['features']
    # (Potentially add more checks, e.g., number of features)
    app.logger.info(f"Extracted features: {input_features}")

    # --- Next: Prepare data for the model and predict ---
    # (Placeholder for prediction logic)
    return jsonify({"message": f"Received features: {input_features}"})  # Temporary response
```
This validation step helps prevent errors later when you try to use the data with your model and provides clearer error messages to the client if they send improperly formatted data.
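One way to extend the checks, sketched here as a standalone helper (the name `validate_features` and the expected count of four features are assumptions for illustration, not part of the service above):

```python
EXPECTED_NUM_FEATURES = 4  # assumption: e.g. the four iris measurements

def validate_features(data):
    """Hypothetical helper: return (features, None) on success,
    or (None, error_message) on failure."""
    if not isinstance(data, dict) or 'features' not in data:
        return None, "Missing 'features' key."
    features = data['features']
    if not isinstance(features, list):
        return None, "'features' must be a list."
    if len(features) != EXPECTED_NUM_FEATURES:
        return None, f"Expected {EXPECTED_NUM_FEATURES} features, got {len(features)}."
    # bool is a subclass of int in Python, so exclude it explicitly
    if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in features):
        return None, "All features must be numbers."
    return features, None

print(validate_features({"features": [5.1, 3.5, 1.4, 0.2]}))
print(validate_features({"features": "oops"}))
```

In the route, you would call this right after `request.get_json()` and return the error message with a 400 status when validation fails.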
Often, the data format received via JSON isn't exactly what your machine learning model expects. For example, scikit-learn models typically require input as a NumPy array or a similar structure (like a list of lists, where each inner list is a sample).
You'll need to convert the data extracted from the JSON into the required format. If you also saved preprocessing steps (like scalers or encoders), you'll apply them here as well.
```python
# (Flask app setup, model loading, imports like numpy)
import numpy as np  # Make sure to import numpy

@app.route('/predict', methods=['POST'])
def predict():
    # (JSON parsing and validation as above...)

    # --- Prepare data for the model ---
    try:
        # Assuming 'features' is a list of numbers [f1, f2, f3, ...]
        input_features = data['features']

        # Example: Convert to a NumPy array suitable for scikit-learn
        # Model expects a 2D array: [[f1, f2, f3, ...]] for a single prediction
        model_input = np.array(input_features).reshape(1, -1)
        app.logger.info(f"Prepared model input shape: {model_input.shape}")

        # If you have a preprocessor:
        # model_input = preprocessor.transform(model_input)
        # app.logger.info("Applied preprocessing")
    except Exception as e:
        app.logger.error(f"Error preparing data for model: {e}")
        return jsonify({"error": "Invalid data format for model processing."}), 400

    # --- Next: Make the prediction ---
    # prediction = model.predict(model_input)
    # (Placeholder for prediction logic and returning results)
    return jsonify({"message": f"Model input prepared: {model_input.tolist()}"})  # Temporary response
```
In this step, we convert the feature list to a NumPy array and call `reshape(1, -1)` because most scikit-learn models expect a 2D array where each row is a sample, even if we're predicting for just one sample.

Now your Flask application is equipped to receive JSON data, perform basic validation, and transform it into a format ready for your machine learning model. The next step is to actually use this prepared data to make a prediction and return the result.
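As a standalone illustration of the reshaping step (the values are the sample measurements used earlier):

```python
import numpy as np

input_features = [5.1, 3.5, 1.4, 0.2]

# A flat list becomes a 1D array of shape (4,)...
flat = np.array(input_features)
print(flat.shape)

# ...but scikit-learn estimators expect shape (n_samples, n_features),
# so reshape(1, -1) wraps the single sample in a 2D array of shape (1, 4).
# The -1 tells NumPy to infer that dimension from the array's size.
model_input = flat.reshape(1, -1)
print(model_input.shape)
print(model_input.tolist())
```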
© 2025 ApX Machine Learning