Vectors represent data points, and matrices transform them. Combining vectors to create new ones is a process fundamental to understanding vector spaces. This process is called forming linear combinations.

A linear combination of a set of vectors $v_1, v_2, ..., v_k$ is any vector $w$ that can be expressed in the form:

$$ w = c_1 v_1 + c_2 v_2 + ... + c_k v_k $$

where $c_1, c_2, ..., c_k$ are scalar constants (real numbers). Essentially, you take your set of vectors, scale each one by some amount (including potentially zero or negative amounts), and then add the scaled vectors together. The result is a linear combination.

Think about it geometrically in 2D. If you have a single vector $v_1$, its linear combinations are just scalar multiples of $v_1$, forming a line passing through the origin in the direction of $v_1$.

If you have two vectors, say $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ in $\mathbb{R}^2$, what vectors can you create? A linear combination looks like $w = c_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + c_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$. By choosing different scalars $c_1$ and $c_2$, you can reach any point (vector) in the 2D plane.

## The Span of Vectors

This leads us to the concept of span. The span of a set of vectors $\{v_1, v_2, ..., v_k\}$ is the set of all possible linear combinations of these vectors. It represents the entire region or space that can be "reached" or generated by combining these vectors.

- Span of a single non-zero vector: A line through the origin.
- Span of two non-collinear vectors in $\mathbb{R}^2$: The entire 2D plane ($\mathbb{R}^2$).
- Span of two collinear vectors in $\mathbb{R}^2$: A line through the origin (the same line defined by either vector alone).
- Span of two non-collinear vectors in $\mathbb{R}^3$: A plane through the origin within the 3D space.
- Span of three non-coplanar vectors in $\mathbb{R}^3$: The entire 3D space ($\mathbb{R}^3$).
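These statements can be checked numerically. The following is a minimal sketch, separate from the running example: it stacks vectors as the columns of a matrix and uses NumPy's `np.linalg.matrix_rank` to report the dimension of the space spanned by those columns. The specific vectors here are chosen only for illustration.

```python
import numpy as np

# Two non-collinear vectors in R^2: their span is the whole plane (dimension 2)
v1 = np.array([2, 1])
v2 = np.array([-1, 3])
print(np.linalg.matrix_rank(np.column_stack([v1, v2])))   # 2

# Two collinear vectors in R^2: their span is only a line (dimension 1)
u1 = np.array([1, 2])
u2 = np.array([-3, -6])  # u2 = -3 * u1
print(np.linalg.matrix_rank(np.column_stack([u1, u2])))   # 1

# Three non-coplanar vectors in R^3: their span is all of R^3 (dimension 3)
a = np.array([1, 0, 0])
b = np.array([1, 1, 0])
c = np.array([1, 1, 1])
print(np.linalg.matrix_rank(np.column_stack([a, b, c])))  # 3
```

A result of 2 for two vectors in $\mathbb{R}^2$ means they span the whole plane; a result of 1 means they only span a line through the origin.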
Let's visualize the span of two non-collinear vectors in 2D. Consider $v_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$ and $v_2 = \begin{bmatrix} -1 \\ 3 \end{bmatrix}$. Their span is the entire $\mathbb{R}^2$. Any vector in $\mathbb{R}^2$, like $w = \begin{bmatrix} 4 \\ 5 \end{bmatrix}$, can be written as a linear combination $c_1 v_1 + c_2 v_2$.

[Figure: "Span of Two Vectors in R^2" — $v_1 = [2, 1]$, $v_2 = [-1, 3]$, and $w = [4, 5]$ drawn from the origin, with dashed lines through $v_1$ and $v_2$.]

Two non-collinear vectors $v_1$ and $v_2$. Any vector $w$ in the plane (like the dotted green vector) can be formed by scaling and adding $v_1$ and $v_2$. The dashed lines represent the infinite lines passing through the origin along $v_1$ and $v_2$. The span fills the entire 2D plane.

## Why Span Matters in Machine Learning

In machine learning, features often form vectors, and a dataset can be seen as a collection of these feature vectors. The concept of span helps us understand the "reach" of our features.

- Feature space: The span of your feature vectors defines the feature space your model effectively operates within. If a new data point lies outside the span of the training data features, the model might struggle to generalize.
- Redundancy: If one feature vector lies within the span of the others, it might be redundant: it doesn't add a new "direction" or new information to the space defined by the other vectors. This relates closely to the idea of linear independence, which we'll discuss next.
- Dimensionality: The number of vectors needed to span a particular space relates to its dimension. If your data lives in $\mathbb{R}^{100}$ but the feature vectors only span a 10-dimensional subspace, it suggests potential for dimensionality reduction.

## Calculating Linear Combinations with NumPy

NumPy makes calculating linear combinations straightforward.

```python
import numpy as np

# Define vectors
v1 = np.array([2, 1])
v2 = np.array([-1, 3])

# Define scalars
c1 = 17 / 7  # approximately 2.4286
c2 = 6 / 7   # approximately 0.8571

# Calculate the linear combination w = c1*v1 + c2*v2
w = c1 * v1 + c2 * v2

print(f"Vector v1: {v1}")
print(f"Vector v2: {v2}")
print(f"Scalars c1={c1:.4f}, c2={c2:.4f}")
print(f"Linear Combination w: {w}")

# Check if w is indeed [4, 5] (up to floating point representation)
target_w = np.array([4, 5])
print(f"Is w approximately equal to [4, 5]? {np.allclose(w, target_w)}")
```

This code snippet demonstrates creating the vector $w = \begin{bmatrix} 4 \\ 5 \end{bmatrix}$ as a linear combination of $v_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$ and $v_2 = \begin{bmatrix} -1 \\ 3 \end{bmatrix}$. The specific scalars that achieve this are $c_1 = 17/7 \approx 2.43$ and $c_2 = 6/7 \approx 0.86$. (Finding these scalars involves solving a system of linear equations, a topic covered in Chapter 3; a quick preview is sketched below.)
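As a brief preview of that idea, and reusing the vectors from the snippet above, the sketch below stacks $v_1$ and $v_2$ as the columns of a matrix $A$, so that $Ac = c_1 v_1 + c_2 v_2$, and then recovers the scalars with `np.linalg.solve`. Treat it as a teaser; the mechanics are explained in Chapter 3.

```python
import numpy as np

# Stack v1 and v2 as the columns of A, so that A @ [c1, c2] = c1*v1 + c2*v2
v1 = np.array([2, 1])
v2 = np.array([-1, 3])
A = np.column_stack([v1, v2])

# Solve A @ c = w for the coefficient vector c = [c1, c2]
w = np.array([4, 5])
c = np.linalg.solve(A, w)

print(f"c1 = {c[0]:.4f}, c2 = {c[1]:.4f}")          # c1 = 2.4286, c2 = 0.8571
print(f"Reconstructed w: {c[0] * v1 + c[1] * v2}")   # [4. 5.]
```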
Understanding linear combinations and span provides a geometric and algebraic framework for thinking about how vectors relate to each other and the spaces they generate. This foundation is important as we move towards analyzing the independence of vectors and the structure of feature spaces.