While defining custom network architectures by subclassing torch.nn.Module offers maximum flexibility, many common models involve a straightforward sequence of layers where the output of one layer feeds directly into the input of the next. For these linear stacks, PyTorch provides a convenient container: torch.nn.Sequential.
nn.Sequential acts as a wrapper that takes an ordered sequence of modules (like layers and activation functions) and executes them in that specific order when input data is passed through it. Think of it as building a pipeline for your data transformations. This approach simplifies model definition when you don't need complex data flow logic, skip connections, or multiple input/output paths.
Creating a Model with nn.Sequential
You can create a Sequential model by passing the modules you want to include as arguments to its constructor. The order matters, as it dictates the flow of data.
Let's build a simple two-layer feed-forward network that takes a 784-dimensional input (like a flattened MNIST image), passes it through a hidden layer with 128 units and a ReLU activation, and finally produces a 10-dimensional output (for 10 classes).
import torch
import torch.nn as nn
from collections import OrderedDict
# Define input, hidden, and output dimensions
input_size = 784
hidden_size = 128
output_size = 10
# Method 1: Passing modules directly as arguments
model_v1 = nn.Sequential(
    nn.Linear(input_size, hidden_size),  # Layer 1: Linear transformation
    nn.ReLU(),                           # Activation 1: Non-linearity
    nn.Linear(hidden_size, output_size)  # Layer 2: Linear transformation
)
# Print the model structure
print("Model V1 (Unnamed Layers):")
print(model_v1)
# Example Usage: Create a dummy input tensor
# Assume a batch size of 64
dummy_input = torch.randn(64, input_size)
output = model_v1(dummy_input)
print("\nOutput shape:", output.shape) # Expected: torch.Size([64, 10])
This creates a model where input data first goes through nn.Linear(784, 128), then the nn.ReLU() activation is applied, and finally, the result passes through nn.Linear(128, 10). Notice how compact the definition is. The Sequential container automatically handles passing the output of one module as the input to the next.
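To see what the container is doing, here is a short sketch (reusing model_v1, dummy_input, and output from the example above) that applies each module by hand. nn.Sequential is iterable, so the loop reproduces its forward pass exactly:

# Manually chain the modules, as nn.Sequential does internally:
# each module receives the previous module's output
x = dummy_input
for module in model_v1:
    x = module(x)

print(torch.equal(x, output))  # True: identical to calling model_v1(dummy_input)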
Naming Layers in nn.Sequential
While the previous method works, the layers are only assigned default numerical indices (0, 1, 2, ...). This can make debugging or accessing specific layers harder later. A better practice for clarity and accessibility is to use an OrderedDict from Python's collections module to provide names for your layers.
# Method 2: Using an OrderedDict for named layers
model_v2 = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(input_size, hidden_size)),   # Fully connected layer 1
    ('relu1', nn.ReLU()),                           # ReLU activation
    ('fc2', nn.Linear(hidden_size, output_size))    # Fully connected layer 2
]))
# Print the model structure
print("\nModel V2 (Named Layers):")
print(model_v2)
# Accessing a specific layer by name is now possible
print("\nAccessing fc1 weights shape:", model_v2.fc1.weight.shape)
# You can also access using integer indices if needed
print("Accessing layer at index 0:", model_v2[0])
# Or by attribute access, using the name given in the OrderedDict
print("Accessing layer by name 'relu1':", model_v2.relu1)
Using an OrderedDict preserves the insertion order (which is essential for nn.Sequential) while allowing you to reference layers like model_v2.fc1 or model_v2.relu1. This significantly improves code readability and maintainability, especially for slightly longer sequences, making it easier to inspect specific parts of your model.
Figure: Data flow through the model_v2 model defined using nn.Sequential with named layers. Input passes linearly through fc1, relu1, and fc2.
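Named layers also help when inspecting a model programmatically. As a small illustration using model_v2 and dummy_input from above: named_children() iterates over the direct submodules with their names, and slicing an nn.Sequential with integer indices returns a new Sequential containing just the selected modules:

# List the direct submodules together with the names assigned via the OrderedDict
for name, module in model_v2.named_children():
    print(name, "->", module)

# Slicing returns a new nn.Sequential holding the selected modules,
# which is handy for reusing part of a model (e.g., dropping the final layer)
feature_extractor = model_v2[:2]  # fc1 followed by relu1
features = feature_extractor(dummy_input)
print("Feature shape:", features.shape)  # Expected: torch.Size([64, 128])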
When to Use nn.Sequential

nn.Sequential is particularly well-suited for:

- Simple feed-forward models where data flows straight through a single stack of layers, as in the examples above.
- Defining reusable blocks of layers (such as a Conv2d, BatchNorm2d, and ReLU sequence, sketched below) that can then be incorporated as single modules into larger, custom nn.Module structures.

The primary limitation of nn.Sequential is its strictly linear nature. It assumes a single input and a single output, with data flowing sequentially through all contained modules. You cannot use it directly to define models with more complex topologies, such as:
- Skip connections, where the input to a block is combined with its output (as in residual networks).
- Models with multiple inputs or multiple outputs.
- Architectures requiring conditional branching or other custom logic in the forward method.
- Any data flow that cannot be expressed as a single linear pass through nn.Sequential.

For any architecture exhibiting these characteristics, you must define a custom model by subclassing torch.nn.Module and implementing the forward method yourself, giving you full control over the data flow, as discussed previously in the "Defining Custom Network Architectures" section.
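To make both points concrete, here is a minimal sketch; the names conv_block and ResidualBlock and the channel and batch sizes are illustrative choices, not part of any fixed API. nn.Sequential packages a reusable Conv2d, BatchNorm2d, and ReLU block, while a custom nn.Module supplies the skip connection that Sequential alone cannot express:

import torch
import torch.nn as nn

def conv_block(channels):
    # A reusable Conv2d -> BatchNorm2d -> ReLU stack packaged as a single module
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(channels),
        nn.ReLU(),
    )

class ResidualBlock(nn.Module):
    # The skip connection (x + block(x)) needs a custom forward method;
    # nn.Sequential alone cannot express this branching data flow
    def __init__(self, channels):
        super().__init__()
        self.block = conv_block(channels)

    def forward(self, x):
        return x + self.block(x)

block = ResidualBlock(16)
feature_maps = torch.randn(8, 16, 32, 32)  # batch of 8, 16 channels, 32x32
print(block(feature_maps).shape)  # Expected: torch.Size([8, 16, 32, 32])

Here the Sequential block keeps the repeated layer pattern compact, while the enclosing module's forward method handles the branching data flow.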
In summary, nn.Sequential provides a clean and efficient way to define the common pattern of linearly stacked neural network layers. It serves as a valuable, convenient tool for simpler architectures and component blocks, complementing the more flexible approach of custom nn.Module classes. Now that you can define model structures using either nn.Module or nn.Sequential, the next step is to define the objective function the model will optimize for, which brings us to loss functions.