Aggregation Models
这是Coursera上《机器学习技法》的课程笔记。
Aggregation models: mix or combine hypotheses for better performance, and it's a rich family. Aggregation can do better with many (possibly weaker) hypotheses.
Suppose we have $T$ hypotheses ,denoted by $g_1$, $g_2$, ... ,$g_T$. There are four different approachs to get a appregation model:
1.Select the best one $g_{t_*}$ from validation error $$G(x)=g_{t_*}(x) with t_*=argmin_{t \in \{1,2,...,T\}}E_{val}(g^-_t)$$
2.Mix all hypotheses uniformly $$G(x)=sign(\sum_{t=1}^T1*g_t(x))$$
3.mix all hypotheses non-uniformly $$G(x)=sign(\sum_{t=1}^T\alpha_t*g_t(x)) \quad with \quad \alpha_t \geq 0$$
NOTE: conclude select and mix uniformly.
4.Combine all hypotheses conditionally $$G(x)=sign(\sum_{t=1}^Tq_t(x)*g_t(x)) \quad with \quad q_t(x)\geq 0$$
NOTE: conclude non-uniformly
Why aggregation work?

In the left graph, we get a strong $G(x)$ by mixing different weak hypotheses uniformly. In some sense, aggregation can be seen as feature transform.
In the right graph, we get a moderate $G(x)$ by mixing different weak hypotheses uniformly. In some sense, aggregation can be seen as regularization.
| appgegation type | blending | learning |
| uniform | voting/averging | Bagging |
| non-uniform | linear | Adaboost |
| conditional | stacking | Decision Tree |
Uniform Blending
Classification: $G(x)=sign(\sum_{t=1}^T1*g_t(x))$
Regression:$G(x)=\frac{1}{T}\sum_{t=1}^Tg_t(x)$
And uniformly blending can reduce variance for more stable performance(数学推导可见课件207_handout.pdf).
Linear Blending
Classification:$G(x)=sign(\sum_{t=1}^T\alpha_t*g_t(x)) \quad with \quad \alpha_t \geq 0$
Regression:$G(x)=\frac{1}{T}\sum_{t=1}^T\alpha_t*g_t(x) \quad with \quad \alpha_t \geq 0$
How to choose $\alpha$? We need get some $\alpha$ to minimize $E_{in}$. $$\mathop {\min }\limits_{\alpha_t\geq0}\frac{1}{N}\sum_{n=1}^Nerr\Big(y_n,\sum_{t=1}^T\alpha_tg_t(x_n)\Big)$$
so $ linear blending = LinModel + hypotheses as transform + constraints$.
Given $g_1^-$, $g_2^-$, ..., $g_T^-$ from $D_{train}$, transform $(x_n, y_n)$ in $D_{val}$ to $(z_n=\Phi^-(x_n),y_n)$,where $\Phi^-(x)=(g_1^-(x),...,g_T^-(x))$.And
- compute $\alpha$ = LinearModel$\Big(\{(z_n,y_n)\}\Big)$
- return $G_{LINB}(x)=LinearHypothesis_\alpha(\Phi(x))$
Bootstrap Aggregation(bagging)
Bootstrap sample $\widetilde{D}_t$: resample N examples from $D$ uniformly with replacement - can also use arbitracy N' instead of N.
bootstrap aggregation:
consider a physical iterative process that for t=1,2,...,T:
- request size-N' data $\widetilde{D}_t$ from bootstrap;
- obtain $g_t$ by $\mathcal{A}(\widetilde{D}_t)$, $G=Uniform(\{g_t\})$.
Adaptive Boosting (AdaBoost) Algorithm

Decision Tree


Random Forest

$$RF = bagging +random-subspace C&RT$$
Aggregation Models的更多相关文章
- 机器学习技法课之Aggregation模型
Courses上台湾大学林轩田老师的机器学习技法课之Aggregation 模型学习笔记. 混合(blending) 本笔记是Course上台湾大学林轩田老师的<机器学习技法课>的学习笔记 ...
- 机器学习技法-GBDT算法
课程地址:https://class.coursera.org/ntumltwo-002/lecture 之前看过别人的竞赛视频,知道GBDT这个算法应用十分广泛.林在第八讲,简单的介绍了AdaBoo ...
- 机器学习技法:11 Gradient Boosted Decision Tree
Roadmap Adaptive Boosted Decision Tree Optimization View of AdaBoost Gradient Boosting Summary of Ag ...
- 机器学习技法笔记:11 Gradient Boosted Decision Tree
Roadmap Adaptive Boosted Decision Tree Optimization View of AdaBoost Gradient Boosting Summary of Ag ...
- Django Aggregation聚合 django orm 求平均、去重、总和等常用方法
Django Aggregation聚合 在当今根据需求而不断调整而成的应用程序中,通常不仅需要能依常规的字段,如字母顺序或创建日期,来对项目进行排序,还需要按其他某种动态数据对项目进行排序.Djng ...
- 2:django models Making queries
这是后面要用到的类 class Blog(models.Model): name = models.CharField(max_length=100) tagline = models.TextFie ...
- How to Choose the Best Way to Pass Multiple Models in ASP.NET MVC
Snesh Prajapati, 8 Dec 2014 http://www.codeproject.com/Articles/717941/How-to-Choose-the-Best-Way-to ...
- The Three Models of ASP.NET MVC Apps
12 June 2012 by Dino Esposito by Dino Esposito We've inherited from the original MVC pattern a ra ...
- Django models对象的select_related方法(减少查询次数)
表结构 先创建一个新的app python manage.py startapp test01 在settings.py注册一下app INSTALLED_APPS = ( 'django.contr ...
随机推荐
- 学习《Spring 3.x 企业应用开发实战》Day-1
Day-1 记录自己学习spring的笔记 提要:根据<Spring 3.x 企业应用开发实战>开头一个用户登录的例子,按照上面敲的. 1.项目分层
- vim中选择匹配文本删除技巧
试举几例如下: 如何只保留匹配内容行而删除其他行? :v/pattern/d :help :v 如何对每行只保留匹配内容而删除这一行中的其它内容 :%s/^.pattern.$/\1/g 删除包含特定 ...
- Day9 - Python 多线程、进程
Python之路,Day9, 进程.线程.协程篇 本节内容 操作系统发展史介绍 进程.与线程区别 python GIL全局解释器锁 线程 语法 join 线程锁之Lock\Rlock\信号量 将线 ...
- js中的同步与异步
同步:提交后等待服务器的响应,接收服务器返回的数据后再执行下面的代码 异步:与上面相反,提交后继续执行下面的代码,而在后台继续监听,服务器响应后有程序做相应处理,异步的操作好处是不必等待服务器而 ...
- 有意思的字符串反转(JavaScript)
有意思的字符串反转 如果问你,实现对一串字符串进行反转操作,你的第一反应的方法是? 第一个我想到的是,利用Array.Reverse来实现: var test = 'Skylor.min'; test ...
- asp.net Request.ServerVariables[] 读解
获取客户端的IP地址,代码如下: /// <summary> /// 获取客户端IP地址 /// </summary> /// <returns></retu ...
- SQL中 patindex函数的用法
语法格式:PATINDEX ( '%pattern%' , expression ) 返回pattern字符串在表达式expression里第一次出现的位置,起始值从1开始算. pattern字符串在 ...
- Android VideoView
这两天公司要让做一个播放视频的小Demo,于是网上学习了下VideoView的使用方法. 先看布局文件,很简单 就是一个VideoView和两个ImageView <RelativeLayout ...
- Qt零基础教程(四)QWidget详解(3):QWidget的几何结构
Qt零基础教程(四) QWidget详解(3):QWidget的几何结构 这篇文章里面分析了QWidget中常用的几种几何结构 下图是Qt提供的分析QWidget几何结构的一幅图,在帮助的 Wind ...
- 【USACO 1.5.1】数字金字塔
[题目描述] 观察下面的数字金字塔. 写一个程序来查找从最高点到底部任意处结束的路径,使路径经过数字的和最大.每一步可以走到左下方的点也可以到达右下方的点. 7 3 8 8 1 0 2 7 4 4 4 ...