Aggregation Models

chym 2024-10-28 05:56:42 (original post)

These are course notes for Machine Learning Techniques (《机器学习技法》) on Coursera.

　　Aggregation models mix or combine hypotheses for better performance, and they form a rich family. With many (possibly weaker) hypotheses, aggregation can do better than any single one of them.

　　Suppose we have $T$ hypotheses, denoted by $g_1$, $g_2$, ..., $g_T$. There are four different approaches to building an aggregation model:

1. Select the best one, $g_{t_*}$, by validation error $$G(x)=g_{t_*}(x) \quad with \quad t_*=\mathop{\mathrm{argmin}}\limits_{t \in \{1,2,...,T\}}E_{val}(g^-_t)$$

2. Mix all hypotheses uniformly $$G(x)=sign(\sum_{t=1}^T1*g_t(x))$$

3. Mix all hypotheses non-uniformly $$G(x)=sign(\sum_{t=1}^T\alpha_t*g_t(x)) \quad with \quad \alpha_t \geq 0$$

　　NOTE: this includes selection and uniform mixing as special cases.

4. Combine all hypotheses conditionally $$G(x)=sign(\sum_{t=1}^Tq_t(x)*g_t(x)) \quad with \quad q_t(x)\geq 0$$

　　NOTE: this includes non-uniform mixing as a special case.
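
The four approaches above can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names, the toy 1-D threshold hypotheses, and the tie-breaking rule in `sign` are all hypothetical and not from the course.

```python
def sign(v):
    # Ties (v == 0) are broken toward +1 here; that is an arbitrary choice.
    return 1 if v >= 0 else -1

def select_best(hypotheses, val_errors):
    """Approach 1: return the hypothesis with the smallest validation error."""
    best = min(range(len(hypotheses)), key=lambda t: val_errors[t])
    return hypotheses[best]

def mix_uniform(hypotheses):
    """Approach 2: every g_t gets exactly one vote."""
    return lambda x: sign(sum(g(x) for g in hypotheses))

def mix_linear(hypotheses, alphas):
    """Approach 3: g_t gets alpha_t >= 0 votes."""
    return lambda x: sign(sum(a * g(x) for a, g in zip(alphas, hypotheses)))

def combine_conditional(hypotheses, weight_fns):
    """Approach 4: g_t gets q_t(x) >= 0 votes, depending on the input x."""
    return lambda x: sign(sum(q(x) * g(x) for q, g in zip(weight_fns, hypotheses)))

# Three toy hypotheses on a 1-D input (hypothetical thresholds).
gs = [lambda x: 1 if x > 0 else -1,
      lambda x: 1 if x > 2 else -1,
      lambda x: 1 if x > 4 else -1]

G_uniform = mix_uniform(gs)
# All weight on g_3: linear mixing reduces to selecting g_3.
G_linear = mix_linear(gs, [0.0, 0.0, 1.0])
# Constant q_t(x) = 1: conditional combination reduces to uniform mixing.
G_cond = combine_conditional(gs, [lambda x: 1.0] * 3)
```

The last two lines mirror the two NOTEs: a one-hot $\alpha$ recovers selection, and constant $q_t(x)$ recovers uniform mixing.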

Why does aggregation work?

In the left graph, mixing different weak hypotheses uniformly yields a strong $G(x)$. In this sense, aggregation acts like a feature transform.

In the right graph, mixing different weak hypotheses uniformly yields a moderate $G(x)$. In this sense, aggregation acts like regularization.

aggregation type	blending	learning
uniform	voting/averaging	Bagging
non-uniform	linear	AdaBoost
conditional	stacking	Decision Tree

Uniform Blending

Classification: $G(x)=sign(\sum_{t=1}^T1*g_t(x))$

Regression: $G(x)=\frac{1}{T}\sum_{t=1}^Tg_t(x)$

Uniform blending reduces variance, which gives more stable performance (the mathematical derivation can be found in the course handout 207_handout.pdf).
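
As a rough numerical check of the variance-reduction claim for regression, the sketch below verifies, at a single point $x$, the standard identity $\frac{1}{T}\sum_t (g_t(x)-f(x))^2 = \frac{1}{T}\sum_t (g_t(x)-G(x))^2 + (G(x)-f(x))^2$, which implies the blend's squared error is never larger than the average squared error of the $g_t$. The target `f` and the three regressors are hypothetical, chosen only for illustration.

```python
def uniform_blend_regression(hypotheses):
    """G(x) = (1/T) * sum_t g_t(x)."""
    T = len(hypotheses)
    return lambda x: sum(g(x) for g in hypotheses) / T

# Hypothetical target and three hypothetical regressors.
f = lambda x: 2.0 * x
gs = [lambda x: 2.0 * x + 1.0,
      lambda x: 1.5 * x,
      lambda x: 2.5 * x - 0.5]
G = uniform_blend_regression(gs)

x = 3.0
T = len(gs)
avg_err_g = sum((g(x) - f(x)) ** 2 for g in gs) / T  # average error of the g_t
spread = sum((g(x) - G(x)) ** 2 for g in gs) / T     # disagreement of the g_t with G
err_G = (G(x) - f(x)) ** 2                           # error of the blend itself

print(avg_err_g, spread + err_G, err_G)
```

The first two printed numbers should agree up to floating-point rounding, and the third should not exceed them.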

Linear Blending

Classification: $G(x)=sign(\sum_{t=1}^T\alpha_t*g_t(x)) \quad with \quad \alpha_t \geq 0$

Regression: $G(x)=\sum_{t=1}^T\alpha_t*g_t(x) \quad with \quad \alpha_t \geq 0$

How do we choose $\alpha$? We look for the $\alpha$ that minimizes $E_{in}$: $$\mathop {\min }\limits_{\alpha_t\geq0}\frac{1}{N}\sum_{n=1}^Nerr\Big(y_n,\sum_{t=1}^T\alpha_tg_t(x_n)\Big)$$

So linear blending = linear model + hypotheses as transform + constraints.

　　Given $g_1^-$, $g_2^-$, ..., $g_T^-$ from $D_{train}$, transform $(x_n, y_n)$ in $D_{val}$ to $(z_n=\Phi^-(x_n),y_n)$, where $\Phi^-(x)=(g_1^-(x),...,g_T^-(x))$. Then:

compute $\alpha$ = LinearModel$\Big(\{(z_n,y_n)\}\Big)$;
return $G_{LINB}(x)=LinearHypothesis_\alpha(\Phi(x))$.
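
The procedure above might look roughly like the following for regression with squared error. Several things here are assumptions rather than the course's method: the "LinearModel" is a plain projected gradient descent (clipping at zero to keep $\alpha_t \geq 0$), the step size and iteration count are arbitrary, the three $g_t^-$ are hypothetical fixed functions, and the final blend simply reuses the $g_t^-$ in place of $\Phi$.

```python
def linear_blend(g_minus, val_data, lr=0.01, steps=5000):
    """Fit alpha >= 0 on D_val by projected gradient descent on squared error."""
    T = len(g_minus)
    # Transform each (x_n, y_n) in D_val into (z_n, y_n) with z_n = Phi^-(x_n).
    Z = [[g(x) for g in g_minus] for x, _ in val_data]
    Y = [y for _, y in val_data]
    N = len(Y)
    alpha = [1.0 / T] * T  # start from the uniform blend
    for _ in range(steps):
        grad = [0.0] * T
        for z, y in zip(Z, Y):
            residual = sum(a * zt for a, zt in zip(alpha, z)) - y
            for t in range(T):
                grad[t] += 2.0 * residual * z[t] / N
        # Gradient step, then project back onto the constraint alpha_t >= 0.
        alpha = [max(0.0, a - lr * gt) for a, gt in zip(alpha, grad)]
    return alpha

# Hypothetical g_t^- (as if already trained on D_train) and a toy D_val.
g_minus = [lambda x: x, lambda x: 1.0, lambda x: -x]
val_data = [(x, 2.0 * x + 1.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]

alpha = linear_blend(g_minus, val_data)
# Simplification: the blend reuses g_minus rather than retrained hypotheses.
G_linb = lambda x: sum(a * g(x) for a, g in zip(alpha, g_minus))
```

In this toy setup the third hypothesis points the wrong way, so the constraint pushes its weight to zero while the other two weights absorb the fit.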

Bootstrap Aggregation (Bagging)

Bootstrap sample $\widetilde{D}_t$: resample $N$ examples from $D$ uniformly with replacement. An arbitrary $N'$ can also be used instead of $N$.

bootstrap aggregation:

　　consider a physical iterative process that, for $t=1,2,...,T$:

requests size-$N'$ data $\widetilde{D}_t$ from bootstrapping;
obtains $g_t$ by $\mathcal{A}(\widetilde{D}_t)$.

Then $G=Uniform(\{g_t\})$.
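
A minimal sketch of this process for binary classification follows. The base algorithm `train_stump` (a 1-D threshold classifier), the toy data set, the seed, and the tie-breaking toward $+1$ are all hypothetical choices made for illustration; any base algorithm $\mathcal{A}$ could be plugged in instead.

```python
import random

def bootstrap_sample(data, size, rng):
    """Draw `size` examples from `data` uniformly with replacement."""
    return rng.choices(data, k=size)

def train_stump(data):
    """Hypothetical base algorithm A: the 1-D threshold rule with fewest errors."""
    best = None
    for theta in sorted({x for x, _ in data}):
        for s in (1, -1):
            errors = sum(1 for x, y in data if (s if x >= theta else -s) != y)
            if best is None or errors < best[0]:
                best = (errors, theta, s)
    _, theta, s = best
    return lambda x: s if x >= theta else -s

def bagging(data, base_algorithm, T, size=None, seed=0):
    """Train T hypotheses on bootstrap samples and blend them uniformly."""
    rng = random.Random(seed)
    size = len(data) if size is None else size  # N' defaults to N
    gs = [base_algorithm(bootstrap_sample(data, size, rng)) for _ in range(T)]
    # G = Uniform({g_t}): one vote per g_t; ties go to +1 (arbitrary choice).
    return lambda x: 1 if sum(g(x) for g in gs) >= 0 else -1

# Toy data: label +1 when x >= 5, otherwise -1.
data = [(x, 1 if x >= 5 else -1) for x in range(10)]
G = bagging(data, train_stump, T=25)
```

Because each $\widetilde{D}_t$ misses some examples, the individual stumps place their thresholds differently, and the uniform vote averages over that diversity.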

Adaptive Boosting (AdaBoost) Algorithm

Decision Tree

Random Forest

$$\text{RF} = \text{bagging} + \text{random-subspace } \mathrm{C\&RT}$$
