While defining custom network architectures by subclassing torch.nn.Module offers maximum flexibility, many common models involve a straightforward sequence of layers where the output of one layer feeds directly into the input of the next. For these linear stacks, PyTorch provides a convenient container: torch.nn.Sequential.
nn.Sequential acts as a wrapper that takes an ordered sequence of modules (like layers and activation functions) and executes them in that specific order when input data is passed through it. Think of it as building a pipeline for your data transformations. This approach simplifies model definition when you don't need complex data flow logic, skip connections, or multiple input/output paths.
Creating a Model with nn.Sequential
You can create a Sequential model by passing the modules you want to include as arguments to its constructor. The order matters, as it dictates the flow of data.
Let's build a simple two-layer feed-forward network that takes a 784-dimensional input (like a flattened MNIST image), passes it through a hidden layer with 128 units and a ReLU activation, and finally produces a 10-dimensional output (for 10 classes).
import torch
import torch.nn as nn
from collections import OrderedDict
# Define input, hidden, and output dimensions
input_size = 784
hidden_size = 128
output_size = 10
# Method 1: Passing modules directly as arguments
model_v1 = nn.Sequential(
    nn.Linear(input_size, hidden_size),  # Layer 1: Linear transformation
    nn.ReLU(),                           # Activation 1: Non-linearity
    nn.Linear(hidden_size, output_size)  # Layer 2: Linear transformation
)
# Print the model structure
print("Model V1 (Unnamed Layers):")
print(model_v1)
# Example Usage: Create a dummy input tensor
# Assume a batch size of 64
dummy_input = torch.randn(64, input_size)
output = model_v1(dummy_input)
print("\nOutput shape:", output.shape) # Expected: torch.Size([64, 10])
This creates a model where input data first goes through nn.Linear(784, 128), then the nn.ReLU() activation is applied, and finally, the result passes through nn.Linear(128, 10). Notice how compact the definition is. The Sequential container automatically handles passing the output of one module as the input to the next.
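To see what the container is doing, here is a short sketch (reusing model_v1, dummy_input, and output from the example above) that applies each module by hand. nn.Sequential is iterable, so the loop reproduces its forward pass exactly:

# Manually chain the modules, as nn.Sequential does internally:
# each module receives the previous module's output
x = dummy_input
for module in model_v1:
    x = module(x)

print(torch.equal(x, output))  # True: identical to calling model_v1(dummy_input)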
Naming Layers in nn.Sequential
While the previous method works, the layers are only assigned default numerical indices (0, 1, 2, ...). This can make debugging or accessing specific layers harder later. A better practice for clarity and accessibility is to use an OrderedDict from Python's collections module to provide names for your layers.
# Method 2: Using an OrderedDict for named layers
model_v2 = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(input_size, hidden_size)),   # Fully connected layer 1
    ('relu1', nn.ReLU()),                           # ReLU activation
    ('fc2', nn.Linear(hidden_size, output_size))    # Fully connected layer 2
]))
# Print the model structure
print("\nModel V2 (Named Layers):")
print(model_v2)
# Accessing a specific layer by name is now possible
print("\nAccessing fc1 weights shape:", model_v2.fc1.weight.shape)
# You can also access using integer indices if needed
print("Accessing layer at index 0:", model_v2[0])
# Or by attribute access, using the name given in the OrderedDict
print("Accessing layer by name 'relu1':", model_v2.relu1)
Using an OrderedDict preserves the insertion order (which is essential for nn.Sequential) while allowing you to reference layers like model_v2.fc1 or model_v2.relu1. This significantly improves code readability and maintainability, especially for slightly longer sequences, making it easier to inspect specific parts of your model.
Figure: Data flow through the model_v2 model defined using nn.Sequential with named layers. Input passes linearly through fc1, relu1, and fc2.
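Named layers also help when inspecting a model programmatically. As a small illustration using model_v2 and dummy_input from above: named_children() iterates over the direct submodules with their names, and slicing an nn.Sequential with integer indices returns a new Sequential containing just the selected modules:

# List the direct submodules together with the names assigned via the OrderedDict
for name, module in model_v2.named_children():
    print(name, "->", module)

# Slicing returns a new nn.Sequential holding the selected modules,
# which is handy for reusing part of a model (e.g., dropping the final layer)
feature_extractor = model_v2[:2]  # fc1 followed by relu1
features = feature_extractor(dummy_input)
print("Feature shape:", features.shape)  # Expected: torch.Size([64, 128])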
When to Use nn.Sequential

nn.Sequential is particularly well-suited for:

- Simple feed-forward models where data flows straight through a single stack of layers, as in the examples above.
- Defining reusable blocks of layers (such as a Conv2d, BatchNorm2d, and ReLU sequence, sketched below) that can then be incorporated as single modules into larger, custom nn.Module structures.

The primary limitation of nn.Sequential is its strictly linear nature. It assumes a single input and a single output, with data flowing sequentially through all contained modules. You cannot use it directly to define models with more complex topologies, such as:
- Skip connections, where the input to a block is combined with its output (as in residual networks).
- Models with multiple inputs or multiple outputs.
- Architectures requiring conditional branching or other custom logic in the forward method.
- Any data flow that cannot be expressed as a single linear pass through nn.Sequential.

For any architecture exhibiting these characteristics, you must define a custom model by subclassing torch.nn.Module and implementing the forward method yourself, giving you full control over the data flow, as discussed previously in the "Defining Custom Network Architectures" section.
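To make both points concrete, here is a minimal sketch; the names conv_block and ResidualBlock and the channel and batch sizes are illustrative choices, not part of any fixed API. nn.Sequential packages a reusable Conv2d, BatchNorm2d, and ReLU block, while a custom nn.Module supplies the skip connection that Sequential alone cannot express:

import torch
import torch.nn as nn

def conv_block(channels):
    # A reusable Conv2d -> BatchNorm2d -> ReLU stack packaged as a single module
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(channels),
        nn.ReLU(),
    )

class ResidualBlock(nn.Module):
    # The skip connection (x + block(x)) needs a custom forward method;
    # nn.Sequential alone cannot express this branching data flow
    def __init__(self, channels):
        super().__init__()
        self.block = conv_block(channels)

    def forward(self, x):
        return x + self.block(x)

block = ResidualBlock(16)
feature_maps = torch.randn(8, 16, 32, 32)  # batch of 8, 16 channels, 32x32
print(block(feature_maps).shape)  # Expected: torch.Size([8, 16, 32, 32])

Here the Sequential block keeps the repeated layer pattern compact, while the enclosing module's forward method handles the branching data flow.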
In summary, nn.Sequential provides a clean and efficient way to define the common pattern of linearly stacked neural network layers. It serves as a valuable, convenient tool for simpler architectures and component blocks, complementing the more flexible approach of custom nn.Module classes. Now that you can define model structures using either nn.Module or nn.Sequential, the next step is to define the objective function the model will optimize for, which brings us to loss functions.