Definition of matrix norms

In my previous post, I introduced several matrix norms on \(\mathbb{R}^{n \times n}\) that are induced by the corresponding vector norms on \(\mathbb{R}^n\). The equivalence of different vector norms, together with the metrics and topologies they induce on \(\mathbb{R}^n\), carries over to \(\mathbb{R}^{n \times n}\). In this article, we show why the matrix norms defined there are valid.

In general, a matrix norm on \(\mathbb{R}^{n \times n}\) must satisfy the following four conditions:

Positive definiteness: for all \(A \in \mathbb{R}^{n \times n}\), \(\norm{A} \geq 0\), and \(\norm{A} = 0\) if and only if \(A = 0\).
Absolute homogeneity: for all \(\alpha \in \mathbb{R}\) and \(A \in \mathbb{R}^{n \times n}\), \(\norm{\alpha A} = \abs{\alpha} \norm{A}\).
Triangle inequality: for all \(A, B \in \mathbb{R}^{n \times n}\), \(\norm{A + B} \leq \norm{A} + \norm{B}\).
Sub-multiplicativity: for all \(A, B \in \mathbb{R}^{n \times n}\), \(\norm{AB} \leq \norm{A} \norm{B}\).

Therefore, to show that a norm induced by a vector norm meets these requirements, we need to prove the following theorem.

Theorem Let \(\norm{\cdot}\) be a norm on \(\mathbb{R}^n\). Then the function \(\zeta: \mathbb{R}^{n \times n} \rightarrow \mathbb{R}\) given by the two equivalent expressions below is a matrix norm on \(\mathbb{R}^{n \times n}\): for all \(A \in \mathbb{R}^{n \times n}\),
\[
\zeta(A) = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A \vect{x}}}{\norm{\vect{x}}} = \sup_{\forall \vect{x} \in \mathbb{R}^n, \norm{\vect{x}}=1} \norm{A \vect{x}}
\]
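
As a numerical sanity check of this definition (not part of the proof), the supremum can be approximated from below by sampling random nonzero vectors. The following is a minimal sketch in pure Python; the helper names `matvec`, `vec_norm_1` and `estimate_induced_norm`, the sample matrix and the sample count are all illustrative assumptions.

```python
import random


def vec_norm_1(x):
    """Vector 1-norm: sum of absolute values."""
    return sum(abs(t) for t in x)


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def estimate_induced_norm(A, vec_norm, samples=5000, seed=0):
    """Lower estimate of sup ||Ax|| / ||x|| over randomly sampled nonzero x."""
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        nx = vec_norm(x)
        if nx == 0.0:
            continue  # the supremum is taken over nonzero vectors only
        best = max(best, vec_norm(matvec(A, x)) / nx)
    return best


# Illustrative matrix; its maximum absolute column sum is 6.
A = [[1.0, -2.0], [3.0, 4.0]]
estimate = estimate_induced_norm(A, vec_norm_1)
print(estimate)
```

Random sampling only ever gives a lower bound on the supremum, so for this matrix the printed value should approach 6 from below without exceeding it (up to rounding).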

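Proof a) Positive definiteness and absolute homogeneity are inherited directly from the vector norm. In particular, \(\zeta(A) = 0\) forces \(\norm{A\vect{x}} = 0\), i.e. \(A\vect{x} = 0\), for all \(\vect{x} \in \mathbb{R}^n\), hence \(A = 0\).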

b) The triangle inequality can be proved as follows.
\[
\begin{aligned}
\zeta(A + B) &= \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{(A + B) \vect{x}}}{\norm{\vect{x}}} = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x} + B\vect{x}}}{\norm{\vect{x}}} \\
& \leq \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}} + \norm{B\vect{x}}}{\norm{\vect{x}}} \leq \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}}{\norm{\vect{x}}} + \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} \\
&= \zeta(A) + \zeta(B).
\end{aligned}
\]

c) For sub-multiplicativity, if \(B = 0\) then both sides are zero. Otherwise, every \(\vect{x}\) with \(B\vect{x} = 0\) gives \(\norm{AB\vect{x}} = 0\) and can be dropped from the supremum, so that dividing by \(\norm{B\vect{x}}\) is allowed and we have
\[
\begin{aligned}
\zeta(AB) &= \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{AB\vect{x}}}{\norm{\vect{x}}} = \sup_{\forall \vect{x} \in \mathbb{R}^n, B\vect{x} \neq 0} \frac{\norm{AB\vect{x}} \norm{B\vect{x}}}{\norm{B\vect{x}}\norm{\vect{x}}} \\
&\leq \sup_{\forall \vect{y} \in \mathbb{R}^n, \vect{y} \neq 0} \frac{\norm{A\vect{y}}}{\norm{\vect{y}}} \cdot \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} = \zeta(A) \cdot \zeta(B).
\end{aligned}
\]
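
The triangle inequality and sub-multiplicativity can also be spot-checked numerically. The sketch below does this on random matrices, using the maximum-column-sum and maximum-row-sum formulas that are proved later in this article. It is only an illustration under assumed names (`norm_1`, `norm_inf`, `check_inequalities`) and an assumed tolerance, not a substitute for the proof.

```python
import random


def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def random_matrix(n, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


def check_inequalities(norm, trials=200, n=4, seed=1, tol=1e-12):
    """Return True if both inequalities hold on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        A, B = random_matrix(n, rng), random_matrix(n, rng)
        if norm(matadd(A, B)) > norm(A) + norm(B) + tol:
            return False  # triangle inequality violated
        if norm(matmul(A, B)) > norm(A) * norm(B) + tol:
            return False  # sub-multiplicativity violated
    return True


print(check_inequalities(norm_1), check_inequalities(norm_inf))
```

The small tolerance `tol` only absorbs floating-point rounding; the inequalities themselves are exact.
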
d) It remains to prove \(\zeta(A) = \sup_{\forall \vect{x} \in \mathbb{R}^n, \norm{\vect{x}} = 1} \norm{A\vect{x}}\).

Note that \(\frac{1}{\norm{\vect{x}}}\) is a positive scalar in \(\mathbb{R}\), so by the linearity of \(A\) and the absolute homogeneity of the vector norm, we have
\[
\zeta(A) = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}}{\norm{\vect{x}}} = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \left\Vert A \cdot \frac{\vect{x}}{\norm{\vect{x}}} \right\Vert.
\]
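Letting \(\vect{x}' = \frac{\vect{x}}{\norm{\vect{x}}}\), which satisfies \(\norm{\vect{x}'} = 1\) and ranges over all unit vectors as \(\vect{x}\) ranges over all nonzero vectors, proves this part.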

Summarizing a) to d), \(\zeta\) is indeed a matrix norm, induced by the corresponding vector norm.

Next, we prove the explicit formulas for the matrix norms induced by the vector 1-, 2- and \(\infty\)-norms, i.e.

1-norm: \(\norm{A}_1 = \max_{1 \leq j \leq n} \sum_{i=1}^n \abs{a_{ij}}\), which is the maximum column sum;
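2-norm: \(\norm{A}_2 = \sqrt{\rho(A^T A)}\), where \(\rho\) represents the spectral radius, i.e. the largest absolute value of the eigenvalues, which for the symmetric positive semidefinite matrix \(A^TA\) is simply its maximum eigenvalue;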
\(\infty\)-norm: \(\norm{A}_{\infty} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\), which is the maximum row sum.
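
As a small worked example of these three formulas (the matrix is chosen here purely for illustration), take
\[
A = \begin{pmatrix} 1 & -2 \\ 3 & 4 \end{pmatrix}, \quad A^T A = \begin{pmatrix} 10 & 10 \\ 10 & 20 \end{pmatrix}.
\]
The eigenvalues of \(A^TA\) are the roots of \(\lambda^2 - 30\lambda + 100 = 0\), i.e. \(\lambda = 15 \pm 5\sqrt{5}\), so
\[
\norm{A}_1 = \max(1 + 3, 2 + 4) = 6, \quad \norm{A}_{\infty} = \max(1 + 2, 3 + 4) = 7, \quad \norm{A}_2 = \sqrt{15 + 5\sqrt{5}} \approx 5.117.
\]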

a) 1-norm: Because
\[
\begin{aligned}
\norm{A\vect{x}}_1 &= \sum_{i=1}^n \left\vert \sum_{j=1}^n a_{ij} x_j \right\vert \leq \sum_{i=1}^n \sum_{j=1}^n \abs{a_{ij} x_j} = \sum_{j=1}^n \left( \abs{x_j} \sum_{i=1}^n \abs{a_{ij}} \right) \\
&\leq \left( \sum_{j=1}^n \abs{x_j} \right) \cdot \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right),
\end{aligned}
\]
we have
\[
\norm{A}_1 = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} \leq \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\left( \sum_{j=1}^n \abs{x_j} \right) \cdot \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right)}{\sum_{j=1}^n \abs{x_j}} = \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right).
\]
It remains to show that the upper bound on the right-hand side is attained.

Assume that \(\sum_{i=1}^n \abs{a_{ij}}\) attains its maximum at \(j = j_0\). If this value is zero, \(A\) is a zero matrix and the formula for the matrix 1-norm holds trivially. If this value is not zero, by letting \(\vect{x} = (\delta_{ij_0})_{i=1}^n\) with \(\delta_{ij_0}\) being the Kronecker delta, i.e. the \(j_0\)-th standard basis vector, we have
\[
\frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} = \frac{\sum_{i=1}^n \abs{a_{ij_0}}}{1} = \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right).
\]
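
The construction above can be sketched in code: compute the column sums, pick the maximizing column \(j_0\), and confirm that the corresponding standard basis vector attains the bound. The function name `norm_1_with_witness` and the sample matrix are assumptions made for this illustration.

```python
def vec_norm_1(x):
    """Vector 1-norm: sum of absolute values."""
    return sum(abs(t) for t in x)


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def norm_1_with_witness(A):
    """Return the maximum absolute column sum and the basis vector e_{j0}."""
    n = len(A[0])
    col_sums = [sum(abs(row[j]) for row in A) for j in range(n)]
    j0 = max(range(n), key=lambda j: col_sums[j])
    # Kronecker delta vector: 1 in position j0, 0 elsewhere.
    x = [1.0 if j == j0 else 0.0 for j in range(n)]
    return col_sums[j0], x


# Illustrative matrix with column sums 6, 7 and 7.5.
A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0],
     [-2.0, 1.0, 6.0]]
value, x = norm_1_with_witness(A)
ratio = vec_norm_1(matvec(A, x)) / vec_norm_1(x)
print(value, x, ratio)
```

Here `ratio` equals `value`, which is the statement that the supremum is attained at the chosen basis vector.
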
b) 2-norm: The proof for this part uses the inner product \(\langle \cdot, \cdot \rangle\) of vectors in \(\mathbb{R}^n\), which induces the vector 2-norm via \(\norm{\vect{x}}_2 = \sqrt{\langle \vect{x}, \vect{x} \rangle}\). Then we have
\[
\norm{A}_2 = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \sqrt{\frac{\langle A\vect{x}, A\vect{x} \rangle}{\langle \vect{x}, \vect{x} \rangle}} = \sup_{\forall \vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \sqrt{\frac{\langle A^*A\vect{x}, \vect{x} \rangle}{\langle \vect{x}, \vect{x} \rangle}},
\]
where \(A^*\) is the adjoint operator, i.e. the transpose of \(A\). The matrix \(A^*A\) is real and symmetric, and it is positive semidefinite because \(\langle A^*A\vect{x}, \vect{x} \rangle = \norm{A\vect{x}}_2^2 \geq 0\). Therefore, it has \(n\) real eigenvalues \(\{\lambda_i\}_{i=1}^n\) with \(0 \leq \lambda_1 \leq \cdots \leq \lambda_n\) and \(n\) corresponding orthonormal eigenvectors \(\{\vect{v}_i\}_{i=1}^n\) (N.B. the eigenvalues may be repeated). Every \(\vect{x} \in \mathbb{R}^n\) can be expanded as \(\vect{x} = \sum_{i=1}^n a_i \vect{v}_i\), where the coefficients \(a_i\) are not to be confused with the entries \(a_{ij}\) of \(A\), and \(A^*A\vect{x} = \sum_{i=1}^n a_i A^*A \vect{v}_i = \sum_{i=1}^n a_i \lambda_i \vect{v}_i\). Then we have
\[
\begin{aligned}
\langle A^*A\vect{x}, \vect{x} \rangle &= \left\langle \sum_{i=1}^n a_i \lambda_i \vect{v}_i, \sum_{j=1}^n a_j \vect{v}_j \right\rangle = \sum_{i=1}^n \sum_{j=1}^n \lambda_i a_i a_j \langle \vect{v}_i, \vect{v}_j \rangle \\
&= \sum_{i=1}^n \sum_{j=1}^n \lambda_i a_i a_j \delta_{ij} = \sum_{i=1}^n \lambda_i a_i^2.
\end{aligned}
\]
Meanwhile,
\[
\langle \vect{x}, \vect{x} \rangle = \left\langle \sum_{i=1}^n a_i \vect{v}_i, \sum_{j=1}^n a_j \vect{v}_j \right\rangle = \sum_{i=1}^n \sum_{j=1}^n a_i a_j \langle \vect{v}_i, \vect{v}_j \rangle = \sum_{i=1}^n a_i^2.
\]
Therefore,
\[
\frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sqrt{\frac{\sum_{i=1}^n \lambda_i a_i^2}{\sum_{i=1}^n a_i^2}} \leq \sqrt{\frac{\lambda_n \sum_{i=1}^n a_i^2}{\sum_{i=1}^n a_i^2}} = \sqrt{\lambda_n}.
\]
By letting \(a_1 = a_2 = \cdots = a_{n-1} = 0\) and \(a_n = 1\), i.e. \(\vect{x} = \vect{v}_n\), we have \(\frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sqrt{\lambda_n}\), so the upper bound is attained. Hence,
\[
\norm{A}_2 = \sqrt{\lambda_n} = \sqrt{\rho(A^*A)}
\]
and the formula for the matrix 2-norm is valid.
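
The identity \(\norm{A}_2 = \sqrt{\lambda_n}\) suggests a simple way to estimate the 2-norm numerically: find the largest eigenvalue of \(A^TA\). The sketch below uses power iteration, which is not part of the proof above but is one standard way to approximate a dominant eigenvalue. The function name `spectral_norm`, the start vector and the iteration count are assumptions of this sketch, and the method can fail if the start vector happens to be orthogonal to the top eigenvector.

```python
import math


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def spectral_norm(A, iterations=200):
    """Estimate sqrt(lambda_max(A^T A)) by power iteration on G = A^T A."""
    G = matmul(transpose(A), A)
    n = len(G)
    x = [1.0 / (i + 1) for i in range(n)]  # arbitrary nonzero start vector
    lam = 0.0
    for _ in range(iterations):
        y = matvec(G, x)
        ny = math.sqrt(sum(t * t for t in y))
        if ny == 0.0:
            # G x = 0: the start vector lies in the null space of A, which is
            # always the case when A = 0. This sketch returns 0 here; a
            # nonzero A would need a different start vector.
            return 0.0
        x = [t / ny for t in y]
        # Rayleigh quotient <G x, x> for the unit vector x.
        lam = sum(a * b for a, b in zip(matvec(G, x), x))
    return math.sqrt(max(lam, 0.0))


# Illustrative matrix: A^T A = [[25, 20], [20, 25]] has eigenvalues 5 and 45.
A = [[3.0, 0.0], [4.0, 5.0]]
value = spectral_norm(A)
print(value, math.sqrt(45.0))
```

For this matrix the two printed numbers should agree closely, since the eigenvalue gap (45 versus 5) makes the iteration converge quickly.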

c) \(\infty\)-norm:
\[
\begin{aligned}
\norm{A\vect{x}}_{\infty} &= \max_{1 \leq i \leq n} \left( \left\vert \sum_{j=1}^n a_{ij} x_j \right\vert \right) \leq \max_{1 \leq i \leq n} \left( \sum_{j=1}^n \abs{a_{ij}} \cdot \abs{x_j} \right) \\
&\leq \max_{1 \leq i \leq n} \left( \left( \sum_{j=1}^n \abs{a_{ij}} \right) \cdot \left( \max_{1 \leq j \leq n} \abs{x_j} \right) \right) = \left( \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}} \right) \cdot \left( \max_{1 \leq j \leq n} \abs{x_j} \right), \\
\norm{\vect{x}}_{\infty} &= \max_{1 \leq i \leq n} \abs{x_i}.
\end{aligned}
\]
Therefore, for all \(\vect{x} \neq 0\), \(\frac{\norm{A\vect{x}}_{\infty}}{\norm{\vect{x}}_{\infty}} \leq \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\). It remains to prove that this upper bound is attained.

Assume that \(\sum_{j=1}^n \abs{a_{ij}}\) attains its maximum at \(i = i_0\). If this value is zero, \(A\) is a zero matrix and the formula for the matrix \(\infty\)-norm holds trivially. If this value is not zero, let \(\vect{x} = (\sgn(a_{i_0 1}), \cdots, \sgn(a_{i_0 n}))^{\rm T}\). Since row \(i_0\) has at least one nonzero entry, \(\norm{\vect{x}}_{\infty} = 1\), and the \(i_0\)-th component of \(A\vect{x}\) is \(\sum_{j=1}^n a_{i_0 j} \sgn(a_{i_0 j}) = \sum_{j=1}^n \abs{a_{i_0 j}}\). Together with the upper bound above, this gives \(\norm{A\vect{x}}_{\infty} = \sum_{j=1}^n \abs{a_{i_0 j}} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\). Hence, \(\frac{\norm{A\vect{x}}_{\infty}}{\norm{\vect{x}}_{\infty}} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\) and the formula for the \(\infty\)-norm is valid.
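
The sign-vector construction can be sketched in the same way as for the 1-norm: pick the row \(i_0\) with the largest absolute sum, build \(\vect{x}\) from the signs of its entries, and check that the ratio attains the bound. The names `sgn` and `norm_inf_with_witness` and the sample matrix are assumptions made for this illustration.

```python
def sgn(t):
    """Sign function: -1, 0 or 1."""
    return (t > 0) - (t < 0)


def vec_norm_inf(x):
    """Vector infinity-norm: largest absolute component."""
    return max(abs(t) for t in x)


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def norm_inf_with_witness(A):
    """Return the maximum absolute row sum and the sign vector of that row."""
    row_sums = [sum(abs(a) for a in row) for row in A]
    i0 = max(range(len(A)), key=lambda i: row_sums[i])
    x = [float(sgn(a)) for a in A[i0]]
    return row_sums[i0], x


# Illustrative matrix with row sums 3.5, 8 and 9.
A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0],
     [-2.0, 1.0, 6.0]]
value, x = norm_inf_with_witness(A)
ratio = vec_norm_inf(matvec(A, x)) / vec_norm_inf(x)
print(value, x, ratio)
```

As in the proof, the sketch assumes the maximizing row is not identically zero; for a zero matrix the sign vector would be zero and the ratio undefined.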
