7.1 Singular Values and Singular Vectors

The SVD separates any matrix into simple pieces.

A is any m by n matrix, square or rectangular. Its rank is r.

The key equations of the SVD:

\[AA^Tu_i = \sigma_i^{2}u_i \\
A^TAv_i = \sigma_i^{2}v_i \\
Av_i = \sigma_i u_i
\]

\(u_i\): the left singular vectors (unit eigenvectors of \(AA^T\))

\(v_i\): the right singular vectors (unit eigenvectors of \(A^TA\))

\(\sigma_i\): the singular values (square roots of the equal eigenvalues of \(AA^T\) and \(A^TA\))

The rank r of A equals the number of nonzero singular values \(\sigma_i\).
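
The equal-eigenvalue claim is easy to confirm numerically. Here is a minimal sketch (my own illustration using numpy, which the notes themselves do not use): it checks that \(AA^T\) and \(A^TA\) share the same eigenvalues and that the \(\sigma_i\) are their square roots.

```python
import numpy as np

# Minimal sketch (numpy assumed): AA^T and A^TA share eigenvalues,
# and the singular values are their square roots.
A = np.array([[1.0, 0.0],
              [1.0, 1.0]])

eig_AAT = np.sort(np.linalg.eigvalsh(A @ A.T))   # eigenvalues of AA^T
eig_ATA = np.sort(np.linalg.eigvalsh(A.T @ A))   # eigenvalues of A^TA
sigma = np.sort(np.linalg.svd(A, compute_uv=False))

print(eig_AAT)       # [0.381..., 2.618...]
print(eig_ATA)       # the same two eigenvalues
print(sigma**2)      # sigma_i^2 reproduces them
```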

Example:

\[A = \left [ \begin{matrix} 1&0 \\ 1&1 \end{matrix}\right] \\
\Downarrow \\
AA^T =
\left [ \begin{matrix} 1&0 \\ 1&1 \end{matrix}\right]
\left [ \begin{matrix} 1&1 \\ 0&1 \end{matrix}\right]
=\left [ \begin{matrix} 1&1 \\ 1&2 \end{matrix}\right]
\\
A^TA =
\left [ \begin{matrix} 1&1 \\ 0&1 \end{matrix}\right]
\left [ \begin{matrix} 1&0 \\ 1&1 \end{matrix}\right]
=\left [ \begin{matrix} 2&1 \\ 1&1 \end{matrix}\right]
\\
\Downarrow \\
det(AA^T - \lambda I) = 0 \ \quad \ det(A^TA - \lambda I) = 0 \\
\lambda_1 = \frac{3+\sqrt{5}}{2} , \sigma_1=\frac{1+\sqrt{5}}{2},
u_1= \frac{1}{\sqrt{1+\sigma_1^2}}\left [ \begin{matrix} 1 \\ \sigma_1 \end{matrix}\right],
v_1= \frac{1}{\sqrt{1+\sigma_1^2}}\left [ \begin{matrix} \sigma_1 \\ 1 \end{matrix}\right]
\\
\lambda_2 = \frac{3-\sqrt{5}}{2} , \sigma_2=\frac{\sqrt{5}-1}{2},
u_2= \frac{1}{\sqrt{1+\sigma_2^2}}\left [ \begin{matrix} 1 \\ -\sigma_2 \end{matrix}\right],
v_2= \frac{1}{\sqrt{1+\sigma_2^2}}\left [ \begin{matrix} \sigma_2 \\ -1 \end{matrix}\right]\\
\Downarrow \\
A =
\left [ \begin{matrix} u_1&u_2 \end{matrix}\right]
\left [ \begin{matrix} \sigma_1&\\&\sigma_2 \end{matrix}\right]
\left [ \begin{matrix} v_1^T\\v_2^T \end{matrix}\right]
\\
A\left [ \begin{matrix} v_1&v_2 \end{matrix}\right] =
\left [ \begin{matrix} u_1&u_2 \end{matrix}\right]
\left [ \begin{matrix} \sigma_1&\\&\sigma_2 \end{matrix}\right]
\]
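
A quick numerical check of this example (a numpy sketch of my own, not from the notes): `np.linalg.svd` should return \(\sigma_1 = (1+\sqrt{5})/2 \approx 1.618\) and \(\sigma_2 = (\sqrt{5}-1)/2 \approx 0.618\), and the three factors should reassemble A.

```python
import numpy as np

# Check of the worked example: the singular values of [[1,0],[1,1]]
# are the golden ratio and its reciprocal (signs of the singular
# vectors may differ from the hand computation).
A = np.array([[1.0, 0.0],
              [1.0, 1.0]])
U, s, Vt = np.linalg.svd(A)

print(s)                                        # [1.618..., 0.618...]
print(np.allclose(U @ np.diag(s) @ Vt, A))      # True: A = U Sigma V^T
print(np.allclose(A @ Vt.T, U @ np.diag(s)))    # True: A V = U Sigma
```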

7.2 Bases and Matrices in the SVD

Key points:

  1. The SVD produces orthonormal bases of \(u\)'s and \(v\)'s for the four fundamental subspaces.

    • \(u_1,u_2,...,u_r\) is an orthonormal basis of the column space (in \(R^m\)).
    • \(u_{r+1},...,u_{m}\) is an orthonormal basis of the left nullspace (in \(R^m\)).
    • \(v_1,v_2,...,v_r\) is an orthonormal basis of the row space (in \(R^n\)).
    • \(v_{r+1},...,v_{n}\) is an orthonormal basis of the nullspace (in \(R^n\)).
  2. Using those bases, A can be diagonalized:

    Reduced SVD: uses only the bases for the row space and column space.

    \[A = U_r \Sigma_r V_r^T \\
    U_r = \left [ \begin{matrix} u_1&\cdots&u_r \end{matrix}\right] ,
    \Sigma_r = \left [ \begin{matrix} \sigma_1&&\\&\ddots&\\&&\sigma_r \end{matrix}\right],
    V_r^T=\left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \end{matrix}\right] \\
    \Downarrow \\
    A = \left [ \begin{matrix} u_1&\cdots&u_r \end{matrix}\right]
    \left [ \begin{matrix} \sigma_1&&\\&\ddots&\\&&\sigma_r \end{matrix}\right]
    \left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \end{matrix}\right] \\
    = u_1\sigma_1v_{1}^T + u_2\sigma_2v_{2}^T + \cdots + u_r\sigma_rv_r^T
    \]

    Full SVD: includes all four subspaces. U is m by m, \(\Sigma\) is m by n, V is n by n; the diagonal of \(\Sigma\) holds \(\sigma_1,\dots,\sigma_r\) followed by zeros.

    \[A = U \Sigma V^T \\
    U = \left [ \begin{matrix} u_1&\cdots&u_r&\cdots&u_m \end{matrix}\right] ,
    \Sigma = \left [ \begin{matrix} \sigma_1&&&\\&\ddots&&\\&&\sigma_r&\\&&&0 \end{matrix}\right],
    V^T=\left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \\ \vdots \\ v_n^T \end{matrix}\right] \\
    \Downarrow \\
    A = \left [ \begin{matrix} u_1&\cdots&u_r&\cdots&u_m \end{matrix}\right]
    \left [ \begin{matrix} \sigma_1&&&\\&\ddots&&\\&&\sigma_r&\\&&&0 \end{matrix}\right]
    \left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \\ \vdots \\ v_n^T \end{matrix}\right] \\
    = u_1\sigma_1v_{1}^T + u_2\sigma_2v_{2}^T + \cdots + u_r\sigma_rv_r^T \quad (\sigma_{r+1} = \cdots = 0)
    \]

    Example: \(A=\left [ \begin{matrix} 3&0 \\ 4&5 \end{matrix}\right]\), r = 2

    \[A^TA =\left [ \begin{matrix} 25&20 \\ 20&25 \end{matrix}\right],
    AA^T =\left [ \begin{matrix} 9&12 \\ 12&41 \end{matrix}\right] \\
    \lambda_1 = 45, \sigma_1 = \sqrt{45},
    v_1 = \frac{1}{\sqrt{2}}
    \left [ \begin{matrix} 1 \\ 1 \end{matrix}\right],
    u_1 = \frac{1}{\sqrt{10}}
    \left [ \begin{matrix} 1 \\ 3 \end{matrix}\right]\\
    \lambda_2 = 5, \sigma_2 = \sqrt{5} ,
    v_2 = \frac{1}{\sqrt{2}}
    \left [ \begin{matrix} -1 \\ 1 \end{matrix}\right],
    u_2 = \frac{1}{\sqrt{10}}
    \left [ \begin{matrix} -3 \\ 1 \end{matrix}\right]\\
    \Downarrow \\
    U = \frac{1}{\sqrt{10}}
    \left [ \begin{matrix} 1&-3 \\ 3&1 \end{matrix}\right],
    \Sigma = \left [ \begin{matrix} \sqrt{45}& \\ &\sqrt{5} \end{matrix}\right],
    V = \frac{1}{\sqrt{2}}
    \left [ \begin{matrix} 1&-1 \\ 1&1 \end{matrix}\right]
    \]
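
    A numerical sketch (my own numpy illustration, not part of the notes): the `full_matrices` flag of `np.linalg.svd` switches between the full and reduced forms, and the rank-one pieces rebuild A. The 3 by 2 matrix B is a hypothetical example chosen so the two forms differ.

    ```python
    import numpy as np

    # Full vs. reduced SVD on a rectangular matrix, then a check
    # of the 2x2 example above.
    B = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])                            # 3 x 2, rank 2

    U, s, Vt = np.linalg.svd(B, full_matrices=True)       # full: U is 3x3
    Ur, sr, Vrt = np.linalg.svd(B, full_matrices=False)   # reduced: U_r is 3x2
    print(U.shape, Ur.shape)                              # (3, 3) (3, 2)
    print(np.allclose(B.T @ U[:, 2], 0))                  # True: u_3 spans the left nullspace

    # The example above: sigma_1^2 = 45, sigma_2^2 = 5
    A = np.array([[3.0, 0.0],
                  [4.0, 5.0]])
    UA, sA, VAt = np.linalg.svd(A)
    print(np.allclose(sA**2, [45.0, 5.0]))                # True
    A_sum = sum(sA[i] * np.outer(UA[:, i], VAt[i, :]) for i in range(2))
    print(np.allclose(A_sum, A))                          # True: rank-one pieces add to A
    ```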

7.3 The Geometry of the SVD

  1. \(A = U\Sigma V^T\) factors into (rotation)(stretching)(rotation); geometrically, A takes vectors x on the unit circle to vectors Ax on an ellipse.

  2. Polar decomposition factors A into QS: a rotation \(Q=UV^T\) times a stretching \(S=V \Sigma V^T\).

    \[V^TV = I \\
    A = U\Sigma V^T = (UV^T)(V\Sigma V^T) = (Q)(S)
    \]

    Q is orthogonal and includes both rotations U and \(V^T\); S is symmetric positive semidefinite and gives the stretching directions.

    If A is invertible, S is positive definite.

  3. The Pseudoinverse \(A^{+}\): \(A^{+}u_i = \frac{v_i}{\sigma_i}\) for \(i \le r\), and \(A^{+}u_i = 0\) for \(i > r\).

    • \(Av_i=\sigma_iu_i\) : A multiplies \(v_i\) in the row space of A to give \(\sigma_i u_i\) in the column space of A.

    • If \(A^{-1}\) exists, \(A^{-1}u_i=\frac{v_i}{\sigma_i}\) : \(A^{-1}\) multiplies \(u_i\) in the row space of \(A^{-1}\) to give \(\frac{v_i}{\sigma_i}\) in the column space of \(A^{-1}\); the singular values of \(A^{-1}\) are \(1/\sigma_i\).

    • Pseudoinverse of A: if \(A^{-1}\) exists, then \(A^{+}\) is the same as \(A^{-1}\).

      \[A^{+} = V \Sigma^{+}U^{T} = \left [ \begin{matrix} v_1&\cdots&v_r&\cdots&v_n \end{matrix}\right]
      \left [ \begin{matrix} \sigma_1^{-1}&&&\\&\ddots&&\\&&\sigma_r^{-1}&\\&&&0 \end{matrix}\right]
      \left [ \begin{matrix} u_1^T\\ \vdots \\ u_r^T \\ \vdots \\ u_m^T \end{matrix}\right]
      \]

      \(\Sigma^{+}\) is n by m: only the nonzero singular values \(\sigma_1,\dots,\sigma_r\) are inverted; the rest of \(\Sigma^{+}\) is zero.
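
    Both the polar factors and the pseudoinverse drop out of one SVD call. A minimal numpy sketch (my own illustration, assuming an invertible A so that \(A^{+}=A^{-1}\)):

    ```python
    import numpy as np

    # Polar decomposition A = QS and pseudoinverse A^+ = V Sigma^+ U^T,
    # both built from one SVD (A is invertible here, so A^+ = A^{-1}).
    A = np.array([[3.0, 0.0],
                  [4.0, 5.0]])
    U, s, Vt = np.linalg.svd(A)

    Q = U @ Vt                                     # orthogonal: both rotations combined
    S = Vt.T @ np.diag(s) @ Vt                     # symmetric positive definite stretch
    print(np.allclose(Q @ S, A))                   # True: A = QS
    print(np.allclose(Q.T @ Q, np.eye(2)))         # True: Q is orthogonal

    A_plus = Vt.T @ np.diag(1.0 / s) @ U.T         # invert only the nonzero sigmas
    print(np.allclose(A_plus, np.linalg.pinv(A)))  # True: matches numpy's pinv
    print(np.allclose(A_plus @ A, np.eye(2)))      # True here because A is invertible
    ```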

7.4 Principal Component Analysis (PCA by the SVD)

PCA gives a way to understand a data plot in dimension m. Typical applications include human genetics, face recognition, finance, and model order reduction (computation).

The sample covariance matrix is \(S=AA^T/(n-1)\), where A is the m by n matrix of centered data (each row of A sums to zero).

The crucial connection to linear algebra is in the singular values and singular vectors of A. These come from the eigenvalues \(\lambda=\sigma^2\) and the eigenvectors u of the sample covariance matrix \(S=AA^T/(n-1)\).

  1. The total variance in the data is the sum of all eigenvalues and of the sample variances \(s_i^2\):

    \[T = \sigma_1^2 + \cdots + \sigma_m^2 = s_1^2 + \cdots + s_m^2 = \mathrm{trace}(S), \ \text{the diagonal sum of } S
    \]
  2. The first eigenvector \(u_1\) of S points in the most significant direction of the data. That direction accounts for a fraction \(\sigma_1^2/T\) of the total variance.

  3. The next eigenvector \(u_2\) (orthogonal to \(u_1\)) accounts for the next largest fraction \(\sigma_2^2/T\).

  4. Stop when those fractions are small. You then have the R directions that explain most of the data. The n data points are very near an R-dimensional subspace with basis \(u_1, \cdots, u_R\); these are the principal components.

  5. R is the "effective rank" of A. The true rank r is probably m or n: the matrix is full rank.

Example: \(A = \left[ \begin{matrix} 3&-4&7&1&-4&-3 \\ 7&-6&8&-1&-1&-7 \end{matrix} \right]\) has sample covariance \(S=AA^T/5 = \left [ \begin{matrix} 20&25 \\ 25&40 \end{matrix}\right]\). Each row of A adds to zero: the data are centered.

The eigenvalues of S are close to 57 and 3, so the first rank-one piece \(\sqrt{57}\,u_1v_1^T\) is much larger than the second piece \(\sqrt{3}\,u_2v_2^T\).

The leading eigenvector \(u_1 \approx (0.6,0.8)\) shows the direction that you see in the scatter plot.

The SVD of A (centered data) shows the dominant direction in the scatter plot.

The second eigenvector \(u_2\) is perpendicular to \(u_1\). The second singular value \(\sigma_2 \approx \sqrt{3}\) measures the spread across the dominant line.
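
The whole example can be reproduced in a few lines. A numpy sketch (my own illustration of the computation, not from the notes):

```python
import numpy as np

# PCA example: centered 2x6 data, sample covariance S = AA^T/(n-1),
# and the leading eigenvector (the first principal component).
A = np.array([[3.0, -4.0, 7.0,  1.0, -4.0, -3.0],
              [7.0, -6.0, 8.0, -1.0, -1.0, -7.0]])  # each row sums to zero
n = A.shape[1]

S = A @ A.T / (n - 1)
print(S)                            # [[20. 25.] [25. 40.]]

lam, u = np.linalg.eigh(S)          # eigenvalues in ascending order
print(lam)                          # approx [3.08, 56.92] -- the "3 and 57"
print(u[:, -1])                     # approx (0.56, 0.83), near (0.6, 0.8), up to sign
print(lam[-1] / lam.sum())          # ~0.95 of the total variance lies along u_1
```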
