Having explored the theoretical foundations of the Perceptron, including its structure based on the artificial neuron and its limitation to linearly separable problems, it's time to put theory into practice. This hands-on exercise guides you through implementing a simple Perceptron from scratch using Python and the NumPy library. Building this foundational model will solidify your understanding of how inputs, weights, bias, and the learning rule interact to achieve classification.

We'll tackle a classic linearly separable problem: the logical AND gate. An AND gate outputs 1 only if both of its inputs are 1; otherwise it outputs 0.

## The AND Gate Problem

The truth table for the AND gate is:

| Input 1 ($x_1$) | Input 2 ($x_2$) | Output ($y$) |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

Our goal is to train a Perceptron that takes $(x_1, x_2)$ as input and correctly predicts $y$. Because we can draw a straight line in a 2D plot that separates the points where $y=1$ from the points where $y=0$, this problem is linearly separable and solvable by a single Perceptron.

## Perceptron Learning Algorithm Recap

Recall the Perceptron learning steps (a short worked example of a single update follows the list):

1. **Initialization:** Set initial values for the weights ($w_1, w_2$) and the bias ($b$). Often, these are initialized to zero or to small random numbers.
2. **Activation:** For a given input sample $(x_1, x_2)$, calculate the weighted sum plus bias:
   $$ z = w_1 x_1 + w_2 x_2 + b = \mathbf{w} \cdot \mathbf{x} + b $$
3. **Prediction:** Apply a step function (specifically, the Heaviside step function) to the activation $z$ to get the predicted output $\hat{y}$:
   $$ \hat{y} = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{if } z < 0 \end{cases} $$
4. **Weight Update:** Compare the prediction $\hat{y}$ with the true label $y$. If they differ, update the weights and bias using the Perceptron learning rule:
   $$ w_i \leftarrow w_i + \eta (y - \hat{y}) x_i $$
   $$ b \leftarrow b + \eta (y - \hat{y}) $$
   Here, $\eta$ (eta) is the learning rate, a small positive value (e.g., 0.1) that controls the step size of the updates. If the prediction is correct ($y - \hat{y} = 0$), no update occurs.

We repeat steps 2-4 for all training samples multiple times (epochs) until the model converges (makes correct predictions for all samples) or a maximum number of epochs is reached.
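To make the update rule concrete, here is one hand-worked update with purely illustrative values (not the values the code below happens to draw): suppose $w_1 = w_2 = 0$, $b = 0.05$, $\eta = 0.1$, and the training sample is $(x_1, x_2) = (1, 0)$ with label $y = 0$. The activation is $z = 0 \cdot 1 + 0 \cdot 0 + 0.05 = 0.05 \ge 0$, so $\hat{y} = 1$ and $y - \hat{y} = -1$. The rule then gives $w_1 \leftarrow 0 + 0.1 \cdot (-1) \cdot 1 = -0.1$, leaves $w_2$ unchanged (because $x_2 = 0$), and gives $b \leftarrow 0.05 + 0.1 \cdot (-1) = -0.05$, nudging the decision boundary away from misclassifying this point.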
## Implementation with Python and NumPy

Let's implement this. We'll use NumPy for efficient numerical operations.

```python
import numpy as np

# Define the AND gate dataset.
# Inputs (X): one row per sample. The bias is handled as a separate scalar
# parameter below rather than being appended to X as a column of 1s.
X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# Outputs (y)
y = np.array([0, 0, 0, 1])

# Initialize weights and bias.
# Two inputs, so two weights. Initialize to small random values.
np.random.seed(42)                  # for reproducibility
weights = np.random.rand(2) * 0.1   # e.g., [0.037, 0.095]
bias = np.random.rand(1) * 0.1      # e.g., [0.073]

# Define the step activation function
def step_function(z):
    return np.where(z >= 0, 1, 0)

# Set learning parameters
learning_rate = 0.1
epochs = 50  # number of passes through the entire dataset

print(f"Initial weights: {weights}, Initial bias: {bias[0]:.3f}")
print("-" * 30)

# Training loop
for epoch in range(epochs):
    errors = 0
    for i in range(len(X)):
        # Get current input sample and target
        inputs = X[i]
        target = y[i]

        # 1. Calculate weighted sum (activation)
        z = np.dot(inputs, weights) + bias

        # 2. Make prediction
        prediction = step_function(z)

        # 3. Calculate error
        error = target - prediction

        # 4. Update weights and bias if the error is non-zero
        if error != 0:
            errors += 1
            weights += learning_rate * error * inputs
            bias += learning_rate * error

    # Print progress (optional)
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch+1}/{epochs}, Errors: {errors}, "
              f"Weights: [{weights[0]:.3f}, {weights[1]:.3f}], Bias: {bias[0]:.3f}")

    # Check for convergence (no errors in an epoch)
    if errors == 0 and epoch > 0:
        print(f"\nConvergence reached at epoch {epoch+1}.")
        break

print("-" * 30)
print(f"Final weights: [{weights[0]:.3f}, {weights[1]:.3f}]")
print(f"Final bias: {bias[0]:.3f}")

# Test the trained Perceptron
print("\nTesting the trained Perceptron:")
for i in range(len(X)):
    inputs = X[i]
    target = y[i]
    z = np.dot(inputs, weights) + bias
    prediction = step_function(z)
    print(f"Input: {inputs}, Target: {target}, Prediction: {prediction[0]}")
```

## Analyzing the Output

Running the code above produces output like the following. Because the random seed is fixed with `np.random.seed(42)`, the run is reproducible; without the seed, the exact weights would vary slightly from run to run.

```
Initial weights: [0.03745401 0.09507143], Initial bias: 0.073
------------------------------
Epoch 10/50, Errors: 0, Weights: [0.137, 0.095], Bias: -0.127

Convergence reached at epoch 10.
------------------------------
Final weights: [0.137, 0.095]
Final bias: -0.127

Testing the trained Perceptron:
Input: [0 0], Target: 0, Prediction: 0
Input: [0 1], Target: 0, Prediction: 0
Input: [1 0], Target: 0, Prediction: 0
Input: [1 1], Target: 1, Prediction: 1
```

The weights and bias adjust over the epochs, and the number of errors per epoch decreases until it reaches zero, indicating the Perceptron has learned to correctly classify all input patterns for the AND gate. The final test confirms that the predictions match the target outputs.

## Visualizing the Decision Boundary

For a 2D problem like this, we can visualize the decision boundary learned by the Perceptron. The boundary is the line where the weighted sum equals zero: $w_1 x_1 + w_2 x_2 + b = 0$. We can rewrite this to plot $x_2$ as a function of $x_1$: $x_2 = (-w_1 x_1 - b) / w_2$.

*Figure: Perceptron decision boundary for the AND gate, plotting Input 2 ($x_2$) against Input 1 ($x_1$), with the four data points and the learned boundary line.*

The plot shows the four input points for the AND gate. Points colored red represent an output of 0, and the blue point represents an output of 1. The green line is the decision boundary ($w_1 x_1 + w_2 x_2 + b = 0$) learned by the Perceptron. All points on one side of the line are classified as 0, and all points on the other side are classified as 1.

This practical implementation demonstrates the core mechanism of a Perceptron. While simple, it forms the basis for understanding how weights are adjusted during learning. As we saw earlier, this model has limitations: it cannot solve the XOR problem, because no single straight line separates XOR's two classes. This motivates the move towards Multi-Layer Perceptrons (MLPs), which add hidden layers to handle more complex, non-linearly separable patterns, as we will explore in subsequent chapters.
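To see that limitation concretely, here is a minimal, illustrative sketch (the `train_perceptron` helper, its return values, and the 100-epoch budget are introduced here for illustration and are not part of the walkthrough above). It wraps the same training loop in a function and runs it on both the AND targets and the XOR targets: the AND run reaches zero errors, while the XOR run never does, no matter how many epochs it is given.

```python
import numpy as np

def train_perceptron(X, y, learning_rate=0.1, epochs=100, seed=42):
    """Run the same Perceptron training loop as above; return (epochs_used, remaining_errors)."""
    rng = np.random.default_rng(seed)
    weights = rng.random(2) * 0.1   # small random initial weights
    bias = rng.random() * 0.1       # small random initial bias
    for epoch in range(epochs):
        errors = 0
        for inputs, target in zip(X, y):
            z = np.dot(inputs, weights) + bias
            prediction = 1 if z >= 0 else 0      # Heaviside step
            error = target - prediction
            if error != 0:
                errors += 1
                weights += learning_rate * error * inputs
                bias += learning_rate * error
        if errors == 0:
            return epoch + 1, errors             # every sample classified correctly
    return epochs, errors                        # epoch budget exhausted without converging

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(train_perceptron(X, np.array([0, 0, 0, 1])))  # AND: linearly separable, errors reach 0
print(train_perceptron(X, np.array([0, 1, 1, 0])))  # XOR: not separable, errors never reach 0
```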