Self-Taught Learning to Deep Networks
In this section, we describe how you can fine-tune and further improve the learned features using labeled data. When you have a large amount of labeled training data, this can significantly improve your classifier's performance.
In self-taught learning, we first trained a sparse autoencoder on the unlabeled data. Then, given a new example $x$, we used the hidden layer to extract features $a$. This is illustrated in the following diagram:

[Figure: a sparse autoencoder; the input $x$ feeds a hidden layer whose activations $a$ serve as the learned features.]
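As a concrete sketch of this feature-extraction step (Python/NumPy; the names `W1` and `b1` for the autoencoder's first-layer parameters are illustrative, not from the original text):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def extract_features(x, W1, b1):
    """Map an input x to the hidden activations a = sigmoid(W1 x + b1)
    learned by the sparse autoencoder."""
    return sigmoid(W1 @ x + b1)
```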
We are interested in solving a classification task, where our goal is to predict labels $y$. We have a labeled training set $\{ (x_l^{(1)}, y^{(1)}), (x_l^{(2)}, y^{(2)}), \ldots, (x_l^{(m_l)}, y^{(m_l)}) \}$ of $m_l$ labeled examples. We showed previously that we can replace the original features $x^{(i)}$ with features $a^{(i)}$ computed by the sparse autoencoder (the "replacement" representation). This gives us a training set $\{ (a^{(1)}, y^{(1)}), \ldots, (a^{(m_l)}, y^{(m_l)}) \}$. Finally, we train a logistic classifier to map from the features $a^{(i)}$ to the classification label $y^{(i)}$.
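A minimal sketch of this replacement pipeline, building on the `extract_features` sketch above (the arrays `X_labeled` and `y`, holding the $m_l$ labeled inputs and their labels, are assumed for illustration):

```python
from sklearn.linear_model import LogisticRegression

# Replace each labeled input x^(i) with its feature vector a^(i).
A = np.array([extract_features(x, W1, b1) for x in X_labeled])

# Train a logistic classifier mapping features a^(i) to labels y^(i).
clf = LogisticRegression().fit(A, y)

# The fitted weights form the second layer of the eventual network.
W2, b2 = clf.coef_, clf.intercept_
```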
To illustrate this step, we can draw our logistic regression unit (shown in orange) as follows:

[Figure: a single logistic regression unit taking the features $a$ as input.]
Now, consider the overall classifier (i.e., the input-output mapping) that we have learned using this method. In particular, let us examine the function that our classifier uses to map from a new test example $x$ to a new prediction $P(y = 1 \mid x)$. We can draw a representation of this function by putting together the two pictures from above. In particular, the final classifier looks like this:

[Figure: the combined network, with the autoencoder's hidden layer feeding into the logistic regression unit.]
The parameters of this model were trained in two stages: the first layer of weights $W^{(1)}$, mapping from the input $x$ to the hidden unit activations $a$, was trained as part of the sparse autoencoder training process. The second layer of weights $W^{(2)}$, mapping from the activations $a$ to the output $y$, was trained using logistic regression (or softmax regression).
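Composing the two stages gives the full forward pass. A sketch, reusing the illustrative names from the earlier snippets (`W1`, `b1` from the autoencoder; `W2`, `b2` from the logistic classifier):

```python
def predict_proba(x, W1, b1, W2, b2):
    """Overall classifier: returns P(y = 1 | x) for the stacked network."""
    a = sigmoid(W1 @ x + b1)     # first layer: autoencoder features
    return sigmoid(W2 @ a + b2)  # second layer: logistic regression unit
```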
But the form of our overall/final classifier is clearly just a whole big neural network. So, having trained an initial set of parameters for our model (training the first layer using an autoencoder, and the second layer via logistic/softmax regression), we can further modify all the parameters in our model to try to further reduce the training error. In particular, we can fine-tune the parameters, meaning perform gradient descent (or use L-BFGS) from the current setting of the parameters to try to reduce the training error on our labeled training set $\{ (x_l^{(1)}, y^{(1)}), \ldots, (x_l^{(m_l)}, y^{(m_l)}) \}$.
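A minimal sketch of one fine-tuning step, implemented as stochastic gradient descent on the cross-entropy error of the network above (illustrative, single-example version; in practice you would batch the updates or hand the full gradient to L-BFGS):

```python
def finetune_step(x, y, W1, b1, W2, b2, lr=0.01):
    """One gradient-descent step on the cross-entropy loss for (x, y),
    backpropagating through both layers so that W1 is updated as well."""
    a = sigmoid(W1 @ x + b1)   # hidden activations
    p = sigmoid(W2 @ a + b2)   # P(y = 1 | x)
    delta2 = p - y                          # output-layer error
    delta1 = (W2.T @ delta2) * a * (1 - a)  # backprop through hidden sigmoid
    W2 -= lr * np.outer(delta2, a)
    b2 -= lr * delta2
    W1 -= lr * np.outer(delta1, x)
    b1 -= lr * delta1
    return W1, b1, W2, b2
```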
When fine-tuning is used, the original training steps (i.e., training the autoencoder, and then the logistic classifier) are sometimes called pre-training. The effect of fine-tuning is that the labeled data can be used to modify the weights $W^{(1)}$ as well, so that adjustments can be made to the features $a$ extracted by the layer of hidden units.
If we are using fine-tuning, we will usually do so with a network built using the replacement representation. (If you are not using fine-tuning, however, the concatenation representation can sometimes give much better performance.)
When should we use fine-tuning? It is typically used only if you have a large labeled training set; in this setting, fine-tuning can significantly improve the performance of your classifier. However, if you have a large unlabeled dataset (for unsupervised feature learning/pre-training) and only a relatively small labeled training set, then fine-tuning is significantly less likely to help.