Some mathematical English
$A$ : capital A
$A^T$: capital A superscript capital T (i.e., A transpose)
$A^{-1}$: capital A to the minus one
$1/2$: one over two
Some forgotten terms
Banded matrix: like a diagonal matrix, but it also has nonzero elements in a band just above and below the main diagonal.
Triangular matrix: only has nonzero elements on and above (upper) or on and below (lower) the main diagonal.
Skew-symmetric matrix: $A^T=-A$ (so the diagonal elements are zero)
Orthonormal vectors: two vectors $u,v$ (usually, when we say a vector, it is n by 1, a column vector) are orthonormal if they are orthogonal ($u^Tv = 0$) and normalized (||u|| = ||v|| = 1)
orthogonal matrix:
- $Q^{-1}= Q^T$
- $QQ^T = I$
- In the multiplication above, entry $(i,j)$ is the dot product of row $i$ and row $j$ of $Q$: it is 1 when the two rows are the same (diagonal entries) and 0 otherwise
- That is, the row/column vectors are orthonormal vectors
- $(QX)^T(QX) = ||QX||^2 = X^TQ^TQX = X^TX = ||X||^2$: an orthogonal matrix preserves norms (lengths); see the numerical check below.
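A minimal numerical check of these properties (a sketch assuming numpy is available; the rotation matrix is just an illustrative choice of orthogonal $Q$):

```python
import numpy as np

# A 2x2 rotation matrix is orthogonal: its columns are orthonormal.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, 4.0])

print(np.allclose(Q.T @ Q, np.eye(2)))           # True: Q^T Q = I, so Q^{-1} = Q^T
print(np.linalg.norm(x), np.linalg.norm(Q @ x))  # both 5.0: ||Qx|| = ||x||
```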
LU decomposition
Elementary matrix: the identity matrix with one of the off-diagonal zeros replaced by a number
Consider the matrix multiplication $R = M\times A$:
$R_1$ (the first row of R) is calculated by the combination sum $m_{11}\times A_1 + m_{12}\times A_2 +\dots$, where $A_i$ is the $i$-th row of $A$.
Inverse of an elementary matrix: flip the sign of the off-diagonal non-zero element (think $MM^{-1}A = A$; see the numerical check after the example)
E.g.:
$$M = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}$$
$$M^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$$
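A quick numerical check of the example above (a sketch assuming numpy; the matrix $A$ is an arbitrary illustration): $M$ subtracts row 1 from row 3, and flipping the sign of that entry gives the inverse, which adds it back.

```python
import numpy as np

M    = np.array([[1, 0, 0], [0, 1, 0], [-1, 0, 1]])
Minv = np.array([[1, 0, 0], [0, 1, 0], [ 1, 0, 1]])
A    = np.array([[2, 3], [4, 5], [6, 7]])

print(M @ A)                               # row 3 becomes A_3 - A_1 = [4, 4]
print(np.allclose(M @ Minv, np.eye(3)))    # True: flipping the sign gives the inverse
print(np.allclose(Minv @ (M @ A), A))      # True: M^{-1} M A = A
```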
LU decomposition: $A = LU$, where L is a lower triangular matrix and U is an upper triangular matrix.
Then, when we try to solve $Ax = b$, we can actually solve $(LU)x = b$: set $y = Ux$, solve $Ly = b$ for $y$ (forward substitution), then solve $Ux = y$ for $x$ (back substitution). These triangular solves are easier to compute than solving $Ax = b$ directly.
Note: LU decomposition is NOT computationally cheaper if we only solve $Ax=b$ once. However, if there are multiple right-hand sides $b$, we can decompose A one time and benefit from the efficient computation with the triangular matrices, as the sketch below shows.
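A sketch of this reuse pattern, assuming scipy is available (lu_factor/lu_solve keep the triangular factors, so each extra right-hand side only costs a forward and a back substitution):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 3.0], [6.0, 3.0]])
lu, piv = lu_factor(A)          # factor A once (LU with partial pivoting, O(n^3) work)

for b in (np.array([10.0, 12.0]), np.array([1.0, 0.0])):
    x = lu_solve((lu, piv), b)  # each solve is only forward/back substitution (O(n^2))
    print(x, np.allclose(A @ x, b))
```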
Vector Space
vector space = set of vectors (in this course, column matrices) + set of scalars (real numbers)
should be closed under vector addition and scalar multiplication
Linear Independence
The set of vectors $\{u_1,\dots,u_n\}$ is linearly independent if $c_1u_1 + \dots + c_nu_n = 0$ has only the solution $c_1=c_2=\dots=c_n=0$.
meaning: no vector in the set can be written as a linear combination of other vectors.
span: we say a set of vectors $\{u_1,\dots,u_n\}$ spans a vector space if the space consists of all linear combinations of $u_1,\dots,u_n$.
basis of a vector space is a set with the minimum # of vectors that spans the space.
The minimum # above is the dimension.
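A small numerical check of independence and dimension (a sketch assuming numpy; the three vectors are an illustrative example): stack the vectors as columns, and they are independent exactly when the rank equals the number of vectors.

```python
import numpy as np

u1, u2, u3 = np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([1.0, 1.0, 0])

V = np.column_stack([u1, u2, u3])
print(np.linalg.matrix_rank(V))   # 2: u3 = u1 + u2, so the set is dependent
# The span of {u1, u2, u3} is the xy-plane, so any 2 independent vectors
# from it (e.g. u1, u2) form a basis, and the dimension is 2.
```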
Null Space
Null(A) is the vector space of all column vectors x s.t. Ax = 0.
That is, the dot product between each row vector of A and a null vector (a vector in the null space) is zero.
Therefore, the row space is orthogonal to the null space.
Therefore, dim(Row(A)) + dim(Null(A)) = n (why? the row space and the null space are orthogonal complements inside $\R^n$, so their dimensions must add up to n; this is the rank–nullity theorem)
Left Null Space: the set of all x s.t. $x^TA = 0$ (equivalently, $A^Tx = 0$)
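To see the row space/null space orthogonality numerically, here is a sketch assuming scipy is available (the matrix is an illustrative rank-1 example): every row of A has zero dot product with every null-space basis vector.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1, so Null(A) has dimension 3 - 1 = 2

N = null_space(A)                 # columns form an orthonormal basis of Null(A)
print(N.shape)                    # (3, 2)
print(np.allclose(A @ N, 0))      # True: each row of A is orthogonal to Null(A)
```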
Column Space
$$Mx = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1\begin{bmatrix} a \\ c \end{bmatrix}+x_2\begin{bmatrix} b \\ d \end{bmatrix}$$
Above: a linear combination of the columns of the matrix (in the column space); $Mx$ gives you a vector that is in the column space of M.
Consider the row space: $$x^TM = \begin{bmatrix} x_1 & x_2 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = x_1\begin{bmatrix} a & b \end{bmatrix}+x_2\begin{bmatrix} c & d \end{bmatrix}$$
$x^TM$ gives you a vector that is in the row space of M.
The dimension of the column space equals the dimension of the row space, i.e., dim(Col(A)) = dim(Col(A$^T$)); this common dimension is called the rank of A and equals the number of linearly independent columns.
Relationships of Null Space, Column Space, and Row space.
- The column space of M can be represented by the space spanned by $Mx$ for all x.
- Similarly, the row space of M can be represented by the space spanned by $x^TM$ for all x.
- Dim(Col(A)) = dim(Row(A)) = rank
- Dim(Null(A)) = n - r
- Dim(Null($A^T$)) = m - r (where $A\in \R^{m\times n}$ and $r$ is the rank); see the numerical check below
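These dimension counts can be checked numerically; a sketch assuming numpy/scipy, with an illustrative $2\times 4$ matrix:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [2.0, 4.0, 1.0, 3.0]])          # m = 2, n = 4
m, n = A.shape
r = np.linalg.matrix_rank(A)

print(r)                                      # rank = dim Col(A) = dim Row(A) = 2
print(null_space(A).shape[1],   n - r)        # dim Null(A)   = n - r = 2
print(null_space(A.T).shape[1], m - r)        # dim Null(A^T) = m - r = 0
```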
Orthogonal Projections
Project a vector down onto a subspace. The projected vector is the vector in the subspace that is closest to the original vector.
One example: Solution of the Least-Squares Problem
$Ax = b$ (x is the target), $A\in \R^{m\times n}$, $m>n$
b holds the y-axis values of the data points and A is built from the x-axis values. The equation Ax=b assumes that the line (whose coefficients are in x) can go through all the points, which is generally impossible.
Therefore, the system is considered over-determined.
Remember that Ax gives you a vector that is in the column space of A. If Ax=b is over-determined with no solution, that means b is not in the column space of A.
But to find the best solution, what we can do is to project b onto the column space of A.
So we cannot solve $Ax=b$; instead, we solve $Ax = b_{projCol(A)}$.
Write $b = b_{projCol(A)} + (b - b_{projCol(A)})$.
Consider the term $b - b_{projCol(A)}$: it is orthogonal to the column space of A (because $b_{projCol(A)}$ is the projection of b onto Col(A)), so the term is in the subspace Null(A$^T$).
Therefore, we multiply both sides of Ax=b by $A^T$ to get rid of that term.
$A^TAx = A^Tb$: the normal equation, which has a solution (while the previous equation is un-solvable) as long as $A^TA$ is invertible,
$x = (A^TA)^{-1}A^Tb$
$Ax = A(A^TA)^{-1}A^Tb = b_{projCol(A)}$
here, $A(A^TA)^{-1}A^T$ is called the projection matrix that projects b onto the column space of A.
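A minimal least-squares sketch (assuming numpy; the data points are made up for illustration): fit a line $c_0 + c_1 t$ to a few points by solving the normal equation, and compare with np.linalg.lstsq.

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0])     # x-axis values of the data points
b = np.array([1.0, 2.1, 2.9, 4.2])     # y-axis values (do not lie exactly on one line)

A = np.column_stack([np.ones_like(t), t])        # A @ [c0, c1] models c0 + c1*t, m > n

x = np.linalg.solve(A.T @ A, A.T @ b)            # normal equation A^T A x = A^T b
print(x)
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # same answer as lstsq

b_proj = A @ x                                   # projection of b onto Col(A)
print(np.allclose(A.T @ (b - b_proj), 0))        # residual is orthogonal to Col(A)
```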
Determinants
Leibniz Formula for calculating determinants
$\det(A)=\sum_{\sigma \in S_n}\text{sgn}(\sigma)\prod_{i=1}^{n} a_{\sigma(i),i}$
where $\text{sgn}(\sigma)$ is $+1$/$-1$ when the permutation $\sigma$ needs an even/odd number of swaps (transpositions) from the natural order, and $S_n$ is the group of all permutations of $\{1,\dots,n\}$.
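For example, for a $2\times 2$ matrix, $S_2$ contains only the identity permutation (even, sign $+1$) and the single swap (odd, sign $-1$), so the formula gives:
$$\det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = (+1)\,a_{11}a_{22} + (-1)\,a_{21}a_{12} = a_{11}a_{22} - a_{12}a_{21}$$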
Properties
- $\det I = 1$
- det changes sign when two rows are interchanged
- det is a linear function of each row separately (multiplying one row by a constant multiplies det by that constant; a sum in one row splits det into a sum in a similar way)
- by properties 2 and 3: det = 0 if there are two equal rows (swapping the equal rows flips the sign but leaves the matrix unchanged, so det = -det = 0)
- det of a diagonal/lower-triangular/upper-triangular matrix = product of the diagonal elements
- $\det(AB) = \det A \cdot \det B$
- Gaussian elimination (adding a multiple of one row to another) does not change the det
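A quick numerical check of a few of these properties (a sketch assuming numpy; the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # upper triangular: det = 2 * 3 = 6
B = np.array([[1.0, 4.0], [2.0, 1.0]])

print(np.linalg.det(A))                                   # 6.0: product of the diagonal
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))    # True: det(AB) = det A * det B

A_swapped = A[[1, 0], :]                                  # interchange the two rows
print(np.linalg.det(A_swapped))                           # -6.0: det changes sign
```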
Eigenvalue
Given a matrix $A\in \R^{n\times n}$:
$Ax = \lambda x$, where $x \neq 0$ is the eigenvector and $\lambda$ is the eigenvalue
That is, $(A-\lambda I )x = 0$.
If $A-\lambda I$ could be inverted (its determinant is not zero), then $x=0$ would be the only, trivial solution.
Therefore, for a nonzero eigenvector to exist, $A-\lambda I$ cannot be an invertible matrix, i.e., $\det(A-\lambda I) = 0$.
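A minimal numerical check (a sketch assuming numpy; the symmetric matrix is an arbitrary example): np.linalg.eig returns the eigenvalues and eigenvectors, each eigenvector satisfies $Ax = \lambda x$, and $\det(A-\lambda I) = 0$ for each eigenvalue.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eig(A)          # eigenvalues 3 and 1 (order may vary); columns of V are eigenvectors
print(lam)

for i in range(len(lam)):
    x = V[:, i]
    print(np.allclose(A @ x, lam[i] * x))                         # A x = lambda x
    print(np.isclose(np.linalg.det(A - lam[i] * np.eye(2)), 0))   # det(A - lambda I) = 0
```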