21-241 Matrices and Linear Transformations¶
Linear Independence¶
Definition (Linear Independence)
A set of vectors \(\{v_1, v_2, \cdots, v_p\} \subset \mathbb{R}^n\) is linearly independent if the vector equation \(x_1v_1 + x_2v_2 + \cdots + x_pv_p = 0\) has only the trivial solution.
Definition (Linear Dependence)
A set of vectors \(\{v_1, v_2, \cdots, v_p\} \subset \mathbb{R}^n\) is linearly dependent if the vector equation \(x_1v_1 + x_2v_2 + \cdots + x_pv_p = 0\) has a non-trivial solution, i.e. \(\exists c_1, c_2, \cdots, c_p \text{ (not all zero), } c_1v_1 + c_2v_2 + \cdots + c_pv_p = 0\).
Important Note (Linear Independence of Matrix Columns)
The columns of a matrix \(\mathbf{A}\) are linearly independent iff the equation \(\mathbf{A}x = 0\) has only the trivial solution.
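This criterion is straightforward to check numerically: \(\mathbf{A}x = 0\) has only the trivial solution exactly when the rank of \(\mathbf{A}\) equals its number of columns. A minimal NumPy sketch (the matrices are illustrative, not from the text):

```python
import numpy as np

def columns_independent(A: np.ndarray) -> bool:
    # Ax = 0 has only the trivial solution iff rank(A) == number of columns.
    return np.linalg.matrix_rank(A) == A.shape[1]

# Independent columns: upper-triangular with non-zero diagonal entries.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# Dependent columns: the third column is the sum of the first two.
B = np.column_stack([A[:, 0], A[:, 1], A[:, 0] + A[:, 1]])

print(columns_independent(A))  # True
print(columns_independent(B))  # False
```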
Theorem (1.7.4(a))
A set \(S = \{v_1, v_2, \cdots, v_p\}\) is linearly dependent iff at least one of the vectors in \(S\) is a linear combination of the others.
Theorem (1.7.4(b))
If a set contains more vectors than entries in each vector, then the set is linearly dependent, i.e. any set \(\{v_1, v_2, \cdots, v_p\} \subset \mathbb{R}^n\) with \(p > n\) is linearly dependent.
Proof
If \(p > n\), then the equation \([v_1 \; v_2 \; \cdots \; v_p]\,\mathbf{x} = 0\) has more unknowns than equations, so there must be a free variable. Therefore, a non-trivial solution must exist.
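As a concrete check of Theorem 1.7.4(b), a NumPy sketch: four random vectors in \(\mathbb{R}^3\) must be dependent, and a non-trivial solution of \([v_1 \; v_2 \; v_3 \; v_4]\,\mathbf{x} = 0\) can be read off the SVD (the random setup is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.standard_normal((3, 4))  # p = 4 vectors in R^3, so p > n

# rank(V) <= 3 < 4 columns, so Vx = 0 has a free variable.
print(np.linalg.matrix_rank(V))

# A non-trivial null-space vector: the last right-singular vector of V.
_, _, Vt = np.linalg.svd(V)
x = Vt[-1]
print(np.allclose(V @ x, 0), np.linalg.norm(x) > 0)  # True True
```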
Theorem (1.7.5)
If a set \(S = \{v_1, v_2, \cdots, v_p\} \subset \mathbb{R}^n\) contains the zero vector, then \(S\) is linearly dependent.
Proof
If \(v_j = 0\) for some \(j\), then \(0 \cdot v_1 + \cdots + 1 \cdot v_j + \cdots + 0 \cdot v_p = 0\) is a dependence relation with a non-zero coefficient.
Introduction to Linear Transformation¶
Definition (Transformation)
A transformation (or function) \(\mathcal{T}\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a rule that assigns each vector \(\mathbf{x}\in \mathbb{R}^n\) a vector \(\mathcal{T}(\mathbf{x})\in \mathbb{R}^m\), where \(\mathbb{R}^n\) is called the domain of \(\mathcal{T}\) and \(\mathbb{R}^m\) is called the codomain of \(\mathcal{T}\).
Important Note (Matrix Transform)
For each matrix \(A\) of shape \(m\times n\), we can define a transform from \(\mathbb{R}^n \rightarrow \mathbb{R}^m\) as follows: \(\mathcal{T}(\mathbf{x}) = A\mathbf{x}\) for all \(\mathbf{x} \in \mathbb{R}^n\).
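In code, a matrix transform is just matrix–vector multiplication; a minimal sketch with an illustrative \(2 \times 3\) matrix:

```python
import numpy as np

# An illustrative 2x3 matrix, so T maps R^3 to R^2.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])

def T(x: np.ndarray) -> np.ndarray:
    # The transform defined by A: x in R^3 maps to Ax in R^2.
    return A @ x

x = np.array([1.0, 1.0, 1.0])
print(T(x))  # [3. 4.]
```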
Definition (Linear Transformation)
A transformation \(\mathcal{T}\) is linear if:
(i) \(\mathcal{T}(u+v) = \mathcal{T}(u) + \mathcal{T}(v)\) for all \(u,v\) in the domain of \(\mathcal{T}\)
(ii) \(\mathcal{T}(cu) = c\mathcal{T}(u)\) for all \(u\) in the domain of \(\mathcal{T}\) and all scalars \(c\)
Important Note
All transformations defined by matrices are linear, i.e. for an \(m \times n\) matrix \(A\), \(A(u+v) = Au + Av\) and \(A(cu) = c(Au)\) for all \(u, v \in \mathbb{R}^n\) and all scalars \(c\).
Important Note (Properties of Linear Transformation)
If \(\mathcal{T}\) is a linear transformation, then \(\mathcal{T}(0) = 0\) and \(\mathcal{T}(cu + dv) = c\mathcal{T}(u) + d\mathcal{T}(v)\) for all \(u, v\) in the domain of \(\mathcal{T}\) and all scalars \(c, d\).
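These linearity properties can be spot-checked numerically for a matrix transform; a sketch with randomly generated data (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))  # defines T: R^3 -> R^4
T = lambda x: A @ x

u, v = rng.standard_normal(3), rng.standard_normal(3)
c, d = 2.0, -3.0

print(np.allclose(T(u + v), T(u) + T(v)))        # additivity
print(np.allclose(T(c * u), c * T(u)))           # homogeneity
print(np.allclose(T(np.zeros(3)), np.zeros(4)))  # T(0) = 0
print(np.allclose(T(c * u + d * v), c * T(u) + d * T(v)))
```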
Note (Range v.s. Codomain)
Note that the range of \(\mathcal{T}\) (the set of all images \(\mathcal{T}(\mathbf{x})\)) is a subset of the codomain of \(\mathcal{T}\); the two need not be equal.
Theorem (1.9.1)
Let \(\mathcal{T}: \mathbb{R}^n \rightarrow \mathbb{R}^m\) be a linear transformation. Then there exists a unique matrix \(A\) s.t. \(\mathcal{T}(x) = Ax\) for all \(x \in \mathbb{R}^n\). In fact, \(A\), called the standard matrix for the linear transformation, is an \(m \times n\) matrix whose \(j\)th column is the vector \(\mathcal{T}(e_j)\), i.e. \(A = [\mathcal{T}(e_1) \; \mathcal{T}(e_2) \; \cdots \; \mathcal{T}(e_n)]\), where \(e_j\) is the \(j\)th column of the identity matrix \(I_n\).
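Theorem 1.9.1 gives a recipe for recovering the standard matrix: apply \(\mathcal{T}\) to each standard basis vector. A sketch using a hypothetical linear map, rotation of \(\mathbb{R}^2\) by 90°:

```python
import numpy as np

def T(x: np.ndarray) -> np.ndarray:
    # A hypothetical linear map: rotation of R^2 by 90 degrees.
    return np.array([-x[1], x[0]])

# Standard matrix: the jth column is T(e_j).
A = np.column_stack([T(e) for e in np.eye(2)])
print(A)  # [[ 0. -1.]
          #  [ 1.  0.]]

# A reproduces T on any vector.
x = np.array([3.0, 4.0])
print(np.allclose(A @ x, T(x)))  # True
```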
Definition (One-to-one Mapping)
A mapping \(\mathcal{T}: \mathbb{R}^n \rightarrow \mathbb{R}^m\) is said to be one-to-one if each \(b \in \mathbb{R}^m\) is the image of at most one \(x \in \mathbb{R}^n\).
Definition (A Mapping ONTO \(\mathbb{R}^m\))
A mapping \(\mathcal{T}: \mathbb{R}^n \rightarrow \mathbb{R}^m\) is said to be onto \(\mathbb{R}^m\) if each \(b \in \mathbb{R}^m\) is the image of at least one \(x \in \mathbb{R}^n\).
Theorem (1.9.3)
Let \(\mathcal{T}: \mathbb{R}^n \rightarrow \mathbb{R}^m\) be a linear transformation. \(\mathcal{T}\) is one-to-one iff \(\mathcal{T}(x) = 0\) has only the trivial solution.
Theorem (1.9.4)
Let \(\mathcal{T}: \mathbb{R}^n \rightarrow \mathbb{R}^m\) be a linear transformation and let \(A\) be the standard matrix.
(a) \(\mathcal{T}\) maps \(\mathbb{R}^n\) onto \(\mathbb{R}^m\) iff the columns of \(A\) span \(\mathbb{R}^m\).
(b) \(\mathcal{T}\) is one-to-one iff the columns of \(A\) are linearly independent.
(c) If \(n>m\), then \(\mathcal{T}\) cannot be one-to-one because the columns of \(A\) are linearly dependent.
(d) If \(m > n\), \(\mathcal{T}\) cannot map \(\mathbb{R}^n\) onto \(\mathbb{R}^m\) because the columns of \(A\) cannot span \(\mathbb{R}^m\).
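Parts (a) and (b) reduce to rank conditions on the standard matrix, which makes all four statements easy to test numerically; a sketch (matrices illustrative):

```python
import numpy as np

def is_one_to_one(A: np.ndarray) -> bool:
    # (b): columns linearly independent <=> rank(A) == number of columns.
    return np.linalg.matrix_rank(A) == A.shape[1]

def is_onto(A: np.ndarray) -> bool:
    # (a): columns span R^m <=> rank(A) == number of rows.
    return np.linalg.matrix_rank(A) == A.shape[0]

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])  # 3x2, so T: R^2 -> R^3

print(is_one_to_one(A), is_onto(A))      # True False  (m > n: not onto, as in (d))
print(is_one_to_one(A.T), is_onto(A.T))  # False True  (n > m: not one-to-one, as in (c))
```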