Tikhonov regularization

This topic is important, but I do not understand it yet.

First question: why do we need regularization?

In mathematics, statistics, and computer science, particularly in the fields of machine learning and inverse problems, regularization is a process of introducing additional information in order to solve an ill-posed problem or to prevent overfitting.

And what is an ill-posed problem? ... ...

And what is overfitting? In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably", as the next figure shows.

Figure 1. The green curve represents an overfitted model and the black line represents a regularized model. While the green line best follows the training data, it is too dependent on that data and it is likely to have a higher error rate on new unseen data, compared to the black line.
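
The situation in Figure 1 can be imitated numerically. The following is a minimal sketch, not the source of the figure: the data (a noisy straight line), the polynomial degree, and the penalty weight `alpha` are all assumptions chosen for illustration. It fits the same polynomial model with and without an L₂ penalty and prints training and test errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data (not from the original post): a straight line plus noise.
x_train = np.linspace(-1.0, 1.0, 10)
y_train = 2.0 * x_train + 0.3 * rng.standard_normal(x_train.size)
x_test = np.linspace(-0.95, 0.95, 50)
y_test = 2.0 * x_test + 0.3 * rng.standard_normal(x_test.size)

def design(x, degree=9):
    """Polynomial features 1, x, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)

def mse(w, x, y):
    """Mean squared error of the polynomial with coefficients w."""
    return float(np.mean((design(x) @ w - y) ** 2))

X = design(x_train)

# Unregularized fit: degree 9 through 10 points, so it can follow the noise.
w_ols, *_ = np.linalg.lstsq(X, y_train, rcond=None)

# Regularized fit: add alpha * ||w||^2 to the squared error (alpha is arbitrary here).
alpha = 0.1
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y_train)

print("train MSE:", mse(w_ols, x_train, y_train), mse(w_ridge, x_train, y_train))
print("test  MSE:", mse(w_ols, x_test, y_test), mse(w_ridge, x_test, y_test))
```

The unregularized fit always has the lower training error, since it minimizes exactly that quantity. Whether it also does worse on the test points depends on the data and on `alpha`, but with a high-degree polynomial and few points it typically does.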

Second question: what are the commonly used regularization methods?

Third question: the advantages of Tikhonov regularization

Fourth question: Tikhonov regularization

Tikhonov regularization, named for Andrey Tikhonov, is the most commonly used method of regularization of ill-posed problems. In statistics, the method is known as ridge regression, in machine learning it is known as weight decay, and with multiple independent discoveries, it is also variously known as the Tikhonov–Miller method, the Phillips–Twomey method, the constrained linear inversion method, and the method of linear regularization. It is related to the Levenberg–Marquardt algorithm for non-linear least-squares problems.

Suppose that for a known matrix A and vector b, we wish to find a vector x such that:

$$Ax = b$$

The standard approach is ordinary least squares linear regression. However, if no x satisfies the equation, or more than one x does (that is, the solution is not unique), the problem is said to be ill-posed. In such cases, ordinary least squares estimation leads to an overdetermined, or more often an underdetermined, system of equations. Most real-world phenomena act as low-pass filters in the forward direction, where A maps x to b. Therefore, in solving the inverse problem, the inverse mapping operates as a high-pass filter that has the undesirable tendency of amplifying noise (eigenvalues / singular values are largest in the reverse mapping where they were smallest in the forward mapping). In addition, ordinary least squares implicitly nullifies every element of the reconstructed version of x that is in the null space of A, rather than allowing for a model to be used as a prior for x. Ordinary least squares seeks to minimize the sum of squared residuals, which can be compactly written as:

$$\|Ax - b\|_2^2$$

where $\|\cdot\|_2$ is the Euclidean norm.
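
The noise amplification described above can be seen on a small synthetic example. This is a sketch under stated assumptions: the matrix `A` is a made-up operator built with singular values decaying from 1 to 1e-8 (standing in for a "low-pass" forward map), and the noise level is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical forward operator with rapidly decaying singular values.
n = 20
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -8, n)          # singular values from 1 down to 1e-8
A = U @ np.diag(s) @ V.T

x_true = rng.standard_normal(n)
b_clean = A @ x_true
noise = 1e-6 * rng.standard_normal(n)
b_noisy = b_clean + noise

# Plain inversion: noise components along small singular values are divided
# by those small values, so they are strongly amplified.
x_ls = np.linalg.solve(A, b_noisy)

err_in = np.linalg.norm(noise) / np.linalg.norm(b_clean)
err_out = np.linalg.norm(x_ls - x_true) / np.linalg.norm(x_true)
print("relative error in b:", err_in)
print("relative error in x:", err_out)
```

A tiny relative perturbation of b produces a relative error in x that is larger by several orders of magnitude, which is the behaviour regularization is meant to control.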

In order to give preference to a particular solution with desirable properties, a regularization term can be included in this minimization:

$$\|Ax - b\|_2^2 + \|\Gamma x\|_2^2$$

for some suitably chosen Tikhonov matrix, $\Gamma$. In many cases, this matrix is chosen as a multiple of the identity matrix ($\Gamma = \alpha I$), giving preference to solutions with smaller norms; this is known as L₂ regularization.^[1] In other cases, high-pass operators (e.g., a difference operator or a weighted Fourier operator) may be used to enforce smoothness if the underlying vector is believed to be mostly continuous. This regularization improves the conditioning of the problem, thus enabling a direct numerical solution. An explicit solution, denoted by $\hat{x}$, is given by:

$$\hat{x} = (A^T A + \Gamma^T \Gamma)^{-1} A^T b$$

The derivation can be found at https://blog.csdn.net/nomadlx53/article/details/50849941.
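
For reference, the derivation of the explicit solution is short enough to sketch here. It is the standard argument (expand the objective, set the gradient to zero) and is not copied from the linked post; it assumes $A^T A + \Gamma^T \Gamma$ is invertible.

$$
\begin{aligned}
J(x) &= \|Ax - b\|_2^2 + \|\Gamma x\|_2^2
      = (Ax - b)^T (Ax - b) + x^T \Gamma^T \Gamma x, \\
\nabla J(x) &= 2 A^T (Ax - b) + 2\, \Gamma^T \Gamma x = 0, \\
(A^T A + \Gamma^T \Gamma)\, x &= A^T b, \\
\hat{x} &= (A^T A + \Gamma^T \Gamma)^{-1} A^T b .
\end{aligned}
$$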

The effect of regularization may be varied via the scale of matrix $\Gamma$. For $\Gamma = 0$ this reduces to the unregularized least squares solution, provided that $(A^T A)^{-1}$ exists.
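
The closed-form solution and its behaviour as the regularization is switched off can be checked numerically. The sketch below makes several assumptions: `tikhonov_solve` is a hypothetical helper name, the data are random, and the Tikhonov matrix is taken as a multiple of the identity with an arbitrary `alpha`. It also compares against an equivalent formulation, ordinary least squares on a stacked system.

```python
import numpy as np

def tikhonov_solve(A, b, Gamma):
    """Minimizer of ||A x - b||^2 + ||Gamma x||^2 via the normal equations."""
    return np.linalg.solve(A.T @ A + Gamma.T @ Gamma, A.T @ b)

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 5))   # hypothetical, well-conditioned example
b = rng.standard_normal(30)

alpha = 0.5                        # arbitrary regularization strength
Gamma = alpha * np.eye(5)
x_reg = tikhonov_solve(A, b, Gamma)

# Equivalent view: ordinary least squares on the stacked system
# [A; Gamma] x ~ [b; 0].
A_aug = np.vstack([A, Gamma])
b_aug = np.concatenate([b, np.zeros(5)])
x_aug, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)

# With Gamma = 0 the formula reduces to ordinary least squares
# (A.T @ A is invertible here because A has full column rank).
x_ols, *_ = np.linalg.lstsq(A, b, rcond=None)
x_zero = tikhonov_solve(A, b, np.zeros((5, 5)))

print("norm with regularization:   ", np.linalg.norm(x_reg))
print("norm without regularization:", np.linalg.norm(x_ols))
```

Forming `A.T @ A` squares the condition number, so for badly conditioned problems the stacked least-squares form (or an SVD-based solver) is usually the safer choice; the normal equations are used here only because they mirror the formula in the text.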

L₂ regularization is used in many contexts aside from linear regression, such as classification with logistic regression or support vector machines,^[2] and matrix factorization.^[3]

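For y = Xw, if no w satisfies the equation, or more than one w does, the problem is called ill-posed. For an ill-posed problem, solving by least squares leads to overfitting or underfitting, and regularization is used to address this.

Let X be an m × n matrix:

Overfitted model: m << n, an underdetermined system, which is likely to have multiple solutions;
Underfitted model: m >> n, an overdetermined system, which may have no solution, or a solution with very low accuracy.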
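
The two cases can be illustrated with random matrices. This is a sketch with made-up shapes and data: in the wide case (m << n) two different w reproduce y exactly and a small L₂ penalty selects one of them; in the tall case (m >> n) no w reproduces y exactly and least squares leaves a nonzero residual.

```python
import numpy as np

rng = np.random.default_rng(3)

# Underdetermined case, m << n: many exact solutions.
X_wide = rng.standard_normal((3, 8))
y_wide = rng.standard_normal(3)
w_min, *_ = np.linalg.lstsq(X_wide, y_wide, rcond=None)  # minimum-norm solution

# Adding any null-space vector of X_wide gives another exact solution.
_, _, Vt = np.linalg.svd(X_wide)    # Vt is 8 x 8; rows 3..7 span the null space
w_other = w_min + 5.0 * Vt[-1]

# A small L2 penalty makes the solution unique (close to the minimum-norm one).
alpha = 1e-3
w_ridge = np.linalg.solve(X_wide.T @ X_wide + alpha**2 * np.eye(8),
                          X_wide.T @ y_wide)

# Overdetermined case, m >> n: generally no exact solution.
X_tall = rng.standard_normal((50, 2))
y_tall = rng.standard_normal(50)
w_ls, *_ = np.linalg.lstsq(X_tall, y_tall, rcond=None)
residual = np.linalg.norm(X_tall @ w_ls - y_tall)

print("wide: both solutions fit exactly:",
      np.allclose(X_wide @ w_min, y_wide), np.allclose(X_wide @ w_other, y_wide))
print("tall: least-squares residual norm:", residual)
```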

REF:

https://blog.csdn.net/darknightt/article/details/70179848
