
Mathematical principle and operation

This chapter covers the proof and computation of the SVD, together with its geometric significance and the low-rank approximation of matrices, both of which are closely connected to the applications of the SVD.

Definition

Let $A$ be any $m \times n$ matrix with real entries. Then $A$ can be expressed as the product of three matrices:

$$A = U Σ V^{T}$$

where:

$U$ is an $m \times m$ orthogonal matrix. Its columns $u_{i}$ are called the left singular vectors of $A$.

$Σ$ is an $m \times n$ rectangular diagonal matrix with non-negative real entries $σ_{i}$ on the diagonal, which are the singular values of $A$. By convention they are sorted in descending order, $σ_{1} \geq σ_{2} \geq \dots \geq σ_{p} \geq 0$, where $p = \min(m, n)$.

$V$ is an $n \times n$ orthogonal matrix. Its columns $v_{i}$ are called the right singular vectors of $A$.
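The shapes and properties in this definition can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix size, the random seed, and the variable names are illustrative choices, not part of the original text.

```python
import numpy as np

# Illustrative random matrix (sizes and seed are assumptions for this sketch).
rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, n))

# full_matrices=True returns U as m x m and V^T as n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the p = min(m, n) singular values in an m x n rectangular diagonal matrix.
p = min(m, n)
Sigma = np.zeros((m, n))
Sigma[:p, :p] = np.diag(s)

shapes = (U.shape, Sigma.shape, Vt.shape)
reconstruction_ok = np.allclose(U @ Sigma @ Vt, A)      # A = U Sigma V^T
U_orthogonal = np.allclose(U.T @ U, np.eye(m))          # U^T U = I
V_orthogonal = np.allclose(Vt @ Vt.T, np.eye(n))        # V^T V = I
sorted_nonneg = bool(np.all(s[:-1] >= s[1:]) and np.all(s >= 0))

print(shapes, reconstruction_ok, U_orthogonal, V_orthogonal, sorted_nonneg)
```

Note that `np.linalg.svd` returns the singular values as a 1-D array and returns $V^{T}$ rather than $V$, so $Σ$ has to be assembled by hand when the full $m \times n$ form is wanted.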

Proof

$A^{T} A$ is positive semidefinite (all of its eigenvalues are non-negative)

Proof: Let $A \in R^{m \times n}$ and let $x$ be any $n$-dimensional column vector. Then

$$\begin{aligned} x^{T} (A^{T} A) x &= x^{T} A^{T} A x \\ &= (A x)^{T} (A x) \\ &= \| A x \|^{2} \geq 0 \end{aligned}$$

So $A^{T} A$ is positive semidefinite. In particular, if $A^{T} A v = λ v$ with $v \neq 0$, then $λ \| v \|^{2} = v^{T} A^{T} A v \geq 0$, so every eigenvalue $λ$ is non-negative.
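As a quick numerical sanity check of this step, the sketch below compares the quadratic form $x^{T} (A^{T} A) x$ with $\| A x \|^{2}$ for random vectors and inspects the eigenvalues of $A^{T} A$. It assumes NumPy; the matrix, the seed, and the small negative tolerance that absorbs rounding error are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))   # illustrative 5 x 3 matrix
G = A.T @ A                       # the symmetric n x n matrix A^T A

# Compare x^T (A^T A) x with ||A x||^2 for 100 random vectors x.
xs = rng.standard_normal((100, 3))
quad = np.array([x @ G @ x for x in xs])
norms_sq = np.array([np.linalg.norm(A @ x) ** 2 for x in xs])
forms_match = np.allclose(quad, norms_sq)

# eigvalsh is intended for symmetric matrices and returns real eigenvalues.
eigenvalues = np.linalg.eigvalsh(G)
eigs_nonneg = bool(np.all(eigenvalues >= -1e-12))  # tolerance for rounding error

print(forms_match, eigs_nonneg)
```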

Creating an orthogonal basis of the column space of A

Suppose $A$ is an $m \times n$ real matrix. Since $A^{T} A$ is a real symmetric matrix (and is therefore diagonalizable with an orthonormal set of eigenvectors), let $λ_{1} \geq λ_{2} \geq \dots \geq λ_{n} \geq 0$ be the $n$ eigenvalues of $A^{T} A$ (they are non-negative because $A^{T} A$ is positive semidefinite, as proven in step 1). From the eigendecomposition of $A^{T} A$, there exists an $n \times n$ orthogonal matrix $V$ such that

$$A^{T} A = V \begin{pmatrix} λ_{1} & & & \\ & λ_{2} & & \\ & & \ddots & \\ & & & λ_{n} \end{pmatrix} V^{T}$$

Let $V = [v_{1} \; v_{2} \; \dots \; v_{n}]$, where the columns $v_{i}$ are the eigenvectors of $A^{T} A$.

Since $V$ is an orthogonal matrix, its columns form an orthonormal basis of $R^{n}$, so for any $n$-dimensional column vector $x$ there is a unique set of values $(c_{1}, c_{2}, \dots, c_{n})$ such that

$$x = c_{1} v_{1} + c_{2} v_{2} + \dots + c_{n} v_{n}$$

So we have

$$A x = c_{1} A v_{1} + c_{2} A v_{2} + \dots + c_{n} A v_{n}$$

Suppose $\operatorname{rank}(A) = r$, with $r \leq \min(m, n)$. Because $\operatorname{rank}(A^{T} A) = \operatorname{rank}(A)$, exactly $r$ of the eigenvalues are positive: $λ_{1} \geq λ_{2} \geq \dots \geq λ_{r} > λ_{r + 1} = λ_{r + 2} = \dots = λ_{n} = 0$. For $i > r$ we have $\| A v_{i} \|^{2} = v_{i}^{T} A^{T} A v_{i} = λ_{i} = 0$, so $A v_{i} = 0$ and

$$A x = c_{1} A v_{1} + c_{2} A v_{2} + \dots + c_{r} A v_{r}$$

Moreover, for any $i, j$ with $i \neq j$, we have

$$(A v_{i})^{T} (A v_{j}) = v_{i}^{T} (A^{T} A v_{j}) = λ_{j} v_{i}^{T} v_{j} = 0$$

So $A v_{i} \perp A v_{j}$. Since every vector $A x$ in the column space of $A$ is a linear combination of $A v_{1}, \dots, A v_{r}$, this means that $\{A v_{1}, A v_{2}, \dots, A v_{r}\}$ is an orthogonal basis of the column space of $A$.

Computing the length of $A v_{i}$ for $i \leq r$:

$$\| A v_{i} \| = \sqrt{\| A v_{i} \|^{2}} = \sqrt{v_{i}^{T} A^{T} A v_{i}} = \sqrt{λ_{i} v_{i}^{T} v_{i}} = \sqrt{λ_{i}}$$

So $\left\{ \frac{A v_{1}}{\sqrt{λ_{1}}}, \frac{A v_{2}}{\sqrt{λ_{2}}}, \dots, \frac{A v_{r}}{\sqrt{λ_{r}}} \right\}$ is an orthonormal basis (an orthogonal basis whose vectors all have length 1) of the column space of $A$.
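The orthogonality and the lengths derived in this step can be observed on a small rank-deficient example. This is a sketch under stated assumptions: it uses NumPy, a $5 \times 4$ matrix built as a product of two random factors so that its rank is 2, and a relative tolerance (`1e-10`) chosen here only to decide which eigenvalues count as zero.

```python
import numpy as np

rng = np.random.default_rng(2)
# Product of a 5 x 2 and a 2 x 4 factor: rank 2 (for generic random factors).
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

# Eigen-decomposition of A^T A; eigh returns ascending eigenvalues, so reorder.
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

# Count eigenvalues that are positive up to rounding error (assumed tolerance).
r = int(np.sum(lam > 1e-10 * lam[0]))

AV = A @ V
gram = AV.T @ AV                          # entry (i, j) is (A v_i)^T (A v_j)
lengths = np.linalg.norm(AV[:, :r], axis=0)

# Normalising the first r columns gives an orthonormal set in the column space.
U_r = AV[:, :r] / np.sqrt(lam[:r])

print(r)
print(np.round(gram, 8))
print(lengths, np.sqrt(lam[:r]))
```

The off-diagonal entries of `gram` are zero up to rounding, its diagonal reproduces the eigenvalues, and the columns $A v_{i}$ with $i > r$ vanish.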

The final step

For all $i \in \{1, 2, \dots, r\}$, let $u_{i} = \frac{A v_{i}}{\sqrt{λ_{i}}}$, and extend this set to $\{u_{1}, u_{2}, \dots, u_{r}, u_{r + 1}, \dots, u_{m}\}$, where $u_{r + 1}, \dots, u_{m}$ form an orthonormal basis of the null space of $A^{T}$ (the orthogonal complement of the column space of $A$ in $R^{m}$). The extended set is an orthonormal basis of $R^{m}$. Computing $AV$:

$$\begin{aligned} AV &= A [v_{1} \; v_{2} \; \dots \; v_{n}] \\ &= [A v_{1} \; A v_{2} \; \dots \; A v_{n}] \\ &= [A v_{1} \; A v_{2} \; \dots \; A v_{r} \; 0 \; 0 \; \dots \; 0] \\ &= [\sqrt{λ_{1}} u_{1} \; \sqrt{λ_{2}} u_{2} \; \dots \; \sqrt{λ_{r}} u_{r} \; 0 \; 0 \; \dots \; 0] \\ &= [u_{1} \; u_{2} \; \dots \; u_{m}] \begin{bmatrix} \sqrt{λ_{1}} & & & & \\ & \sqrt{λ_{2}} & & & \\ & & \ddots & & \\ & & & \sqrt{λ_{r}} & \\ & & & & 0 \end{bmatrix} \end{aligned}$$

Here the last factor is an $m \times n$ matrix, and the trailing $0$ stands for a zero block.

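Let $U = [u_{1} \; u_{2} \; \dots \; u_{m}]$ and $σ_{i} = \sqrt{λ_{i}}$, and let $Σ$ be the $m \times n$ diagonal matrix with the $σ_{i}$ on its diagonal. Then $AV = U Σ$, and since $V$ is orthogonal ($V V^{T} = I$), multiplying both sides on the right by $V^{T}$ gives

$$A = U Σ V^{T}$$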
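The construction in the proof can be followed step by step in code. The sketch below is only an illustration of the proof, not a recommended algorithm: forming $A^{T} A$ explicitly squares the condition number, so library routines such as `np.linalg.svd` are preferable in practice. The function name `svd_via_eigh`, the tolerance `rel_tol`, and the use of a complete QR factorization to extend $u_{1}, \dots, u_{r}$ to a basis of $R^{m}$ are assumptions made for this example, and the sketch assumes $A$ is not the zero matrix.

```python
import numpy as np

def svd_via_eigh(A, rel_tol=1e-10):
    """Build U, Sigma, V following the proof (hypothetical helper, A nonzero)."""
    m, n = A.shape

    # Eigen-decomposition of A^T A, reordered so eigenvalues are descending.
    lam, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(lam)[::-1]
    lam, V = np.clip(lam[order], 0.0, None), V[:, order]

    # Numerical rank: eigenvalues that are positive up to rounding error.
    r = int(np.sum(lam > rel_tol * lam[0]))
    sigma = np.sqrt(lam[:r])

    # u_i = A v_i / sqrt(lambda_i) for i = 1..r.
    U_r = (A @ V[:, :r]) / sigma

    # Extend to an orthonormal basis of R^m: in a complete QR factorization
    # of U_r, the last m - r columns of Q are orthogonal to the columns of U_r.
    Q, _ = np.linalg.qr(U_r, mode="complete")
    U = np.hstack([U_r, Q[:, r:]])

    # m x n rectangular diagonal matrix of singular values.
    Sigma = np.zeros((m, n))
    Sigma[:r, :r] = np.diag(sigma)
    return U, Sigma, V

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))  # rank 2 example
U, Sigma, V = svd_via_eigh(A)
reconstruction_error = np.max(np.abs(U @ Sigma @ V.T - A))
print(U.shape, Sigma.shape, V.shape, reconstruction_error)
```

The singular vectors found this way may differ in sign from those returned by `np.linalg.svd`, since each pair $(u_{i}, v_{i})$ is only determined up to a common sign.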

From the above proof, the singular values of $A$ are the square roots of the $\min(m, n)$ largest eigenvalues of $A^{T} A$. Starting from $A A^{T}$ instead and using the same method, it is easy to prove that the singular values of $A$ are also the square roots of the $\min(m, n)$ largest eigenvalues of $A A^{T}$, and that $\{u_{1}, u_{2}, \dots, u_{m}\}$ are eigenvectors of $A A^{T}$. The singular values and singular vectors therefore satisfy the following relations:

Right singular vectors: $A^{T} A v_{i} = σ_{i}^{2} v_{i}$.
Left singular vectors: $A A^{T} u_{i} = σ_{i}^{2} u_{i}$.
Connections between $u_{i}$ and $v_{i}$: $A v_{i} = σ_{i} u_{i}$ and $A^{T} u_{i} = σ_{i} v_{i}$.
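These relations are easy to confirm numerically for $i = 1, \dots, \min(m, n)$. The following sketch assumes NumPy and an arbitrary random $4 \times 6$ matrix; the sizes, seed, and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 4, 6
A = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T
p = min(m, n)

# A^T A v_i = sigma_i^2 v_i
right_ok = all(np.allclose(A.T @ A @ V[:, i], s[i] ** 2 * V[:, i]) for i in range(p))

# A A^T u_i = sigma_i^2 u_i
left_ok = all(np.allclose(A @ A.T @ U[:, i], s[i] ** 2 * U[:, i]) for i in range(p))

# A v_i = sigma_i u_i  and  A^T u_i = sigma_i v_i
link_ok = all(
    np.allclose(A @ V[:, i], s[i] * U[:, i])
    and np.allclose(A.T @ U[:, i], s[i] * V[:, i])
    for i in range(p)
)

# When n > m, the remaining right singular vectors lie in the null space of A.
null_ok = np.allclose(A @ V[:, p:], 0)

print(right_ok, left_ok, link_ok, null_ok)
```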

Geometric significance

The singular value decomposition (SVD) is an extension of the eigenvalue decomposition (EVD). EVD finds the vectors (directions) that a matrix only scales under multiplication (possibly reversing them or mapping them to zero) without rotating them. SVD finds a set of orthogonal vectors (directions) that a matrix may scale and rotate, but whose images remain orthogonal to one another. Similar to EVD, SVD can also be described as a rotation-stretching-rotation process.

The orthogonal matrix $V^{T}$ (or $V^{-1}$) rotates the orthogonal vectors $v_{i}$ onto the standard axes. (The core of Chapter 4)

The diagonal matrix $Σ$ scales each vector by the corresponding singular value and changes the number of dimensions ($R^{n} \to R^{m}$).

The orthogonal matrix $U$ rotates the vectors in $R^{m}$.
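The three stages described above can be traced on a concrete vector. This is a minimal sketch assuming NumPy; the $3 \times 2$ matrix and the test vector are arbitrary illustrative choices, picked so that $A^{T} A$ has eigenvalues 45 and 5.

```python
import numpy as np

# Illustrative 3 x 2 matrix, so the map goes from R^2 to R^3.
A = np.array([[3.0, 0.0],
              [4.0, 5.0],
              [0.0, 0.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros(A.shape)
Sigma[:2, :2] = np.diag(s)

x = np.array([1.0, 2.0])

step1 = Vt @ x         # coordinates of x along v_1, v_2 (length preserved, R^2)
step2 = Sigma @ step1  # scale by the singular values and embed in R^3
step3 = U @ step2      # orthogonal map inside R^3 (length preserved)

print(s)               # singular values: sqrt(45) and sqrt(5)
print(step3, A @ x)    # both equal A x
```

An orthogonal matrix returned by a library routine may include a reflection as well as a rotation, so "rotates" here should be read as "applies an orthogonal, length-preserving map".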
Here is a simple picture to show the process:
