In this section, we describe how you can fine-tune and further improve the learned features using labeled data. When you have a large amount of labeled training data, this can significantly improve your classifier's performance.

In self-taught learning, we first trained a sparse autoencoder on the unlabeled data. Then, given a new example $x$, we used the hidden layer to extract features $a$. This is illustrated in the following diagram:

We are interested in solving a classification task, where our goal is to predict labels $y$. We have a labeled training set $\{(x_l^{(1)}, y^{(1)}), (x_l^{(2)}, y^{(2)}), \ldots, (x_l^{(m_l)}, y^{(m_l)})\}$ of $m_l$ labeled examples. We showed previously that we can replace the original features $x^{(i)}$ with features $a^{(i)}$ computed by the sparse autoencoder (the "replacement" representation). This gives us a training set $\{(a^{(1)}, y^{(1)}), \ldots, (a^{(m_l)}, y^{(m_l)})\}$. Finally, we train a logistic classifier to map from the features $a^{(i)}$ to the classification label $y^{(i)}$.
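To make the replacement step concrete, here is a minimal sketch in NumPy. The names `W1` and `b1` for the autoencoder's first-layer parameters are our own choice (the text does not fix names), and a sigmoid hidden layer is assumed, as in the sparse autoencoder notes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def extract_features(W1, b1, X):
    """Replacement representation: map each input x to its hidden
    activations a = sigmoid(W1 x + b1) from the trained autoencoder.

    X  : (m, input_size) matrix of examples, one per row
    W1 : (hidden_size, input_size) autoencoder weights
    b1 : (hidden_size,) autoencoder biases
    """
    return sigmoid(X @ W1.T + b1)

# A = extract_features(W1, b1, X_labeled) gives the new training set
# (A, y) on which the logistic/softmax classifier is trained.
```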

To illustrate this step, we can draw our logistic regression unit (shown in orange) as follows:

Now, consider the overall classifier (i.e., the input-output mapping) that we have learned using this method. In particular, let us examine the function that our classifier uses to map from a new test example $x$ to a new prediction $P(y = 1 \mid x)$. We can draw a representation of this function by putting together the two pictures from above. In particular, the final classifier looks like this:

The parameters of this model were trained in two stages: the first layer of weights $W^{(1)}$, mapping from the input $x$ to the hidden unit activations $a$, was trained as part of the sparse autoencoder training process. The second layer of weights $W^{(2)}$, mapping from the activations $a$ to the output $y$, was trained using logistic regression (or softmax regression).
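Putting the two stages together, the input-output mapping of the final classifier is just a single forward pass through both layers. A sketch, reusing `sigmoid` and the hypothetical `W1`, `b1` from above, and calling the logistic-regression parameters `W2`, `b2`:

```python
def predict_proba(W1, b1, W2, b2, X):
    """Forward pass of the combined classifier.

    Layer 1 (from autoencoder pre-training): features a.
    Layer 2 (from logistic regression): P(y = 1 | x).
    """
    a = sigmoid(X @ W1.T + b1)      # hidden activations (features)
    return sigmoid(a @ W2.T + b2)   # logistic output
```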

But the form of our overall/final classifier is clearly just one big neural network. So, having trained up an initial set of parameters for our model (training the first layer using an autoencoder, and the second layer via logistic/softmax regression), we can further modify all the parameters in our model to try to further reduce the training error. In particular, we can fine-tune the parameters, meaning perform gradient descent (or use L-BFGS) from the current setting of the parameters to try to reduce the training error on our labeled training set.
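As one possible illustration, here is a single batch gradient-descent step for fine-tuning with a single logistic output, backpropagating the classification error through both layers. The learning rate `lr` and the use of the mean cross-entropy loss are our assumptions, not prescribed by the text:

```python
def finetune_step(W1, b1, W2, b2, X, y, lr=0.1):
    """One fine-tuning step: backpropagate the classification error
    through BOTH layers and update all parameters, not just W2/b2."""
    m = X.shape[0]
    a = sigmoid(X @ W1.T + b1)              # (m, hidden) features
    p = sigmoid(a @ W2.T + b2)              # (m, 1) predictions P(y=1|x)

    # Gradient of the mean cross-entropy w.r.t. the output pre-activation.
    delta2 = p - y.reshape(-1, 1)           # (m, 1)
    gW2 = delta2.T @ a / m                  # (1, hidden)
    gb2 = delta2.mean(axis=0)               # (1,)

    # Backpropagate through the sigmoid hidden layer.
    delta1 = (delta2 @ W2) * a * (1.0 - a)  # (m, hidden)
    gW1 = delta1.T @ X / m                  # (hidden, input)
    gb1 = delta1.mean(axis=0)               # (hidden,)

    return W1 - lr * gW1, b1 - lr * gb1, W2 - lr * gW2, b2 - lr * gb2
```

Note that the update touches $W^{(1)}$ as well as $W^{(2)}$: this is exactly what distinguishes fine-tuning from simply retraining the top-layer classifier.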

When fine-tuning is used, the initial training steps (i.e., the unsupervised autoencoder training, followed by training the logistic classifier) are sometimes called pre-training. The effect of fine-tuning is that the labeled data can be used to modify the weights $W^{(1)}$ as well, so that adjustments can be made to the features $a$ extracted by the layer of hidden units.

Note that if we are using fine-tuning, we will usually do so with a network built using the replacement representation. (If you are not using fine-tuning, however, then the concatenation representation can sometimes give much better performance.)

When should we use fine-tuning? It is typically used only if you have a large labeled training set; in this setting, fine-tuning can significantly improve the performance of your classifier. However, if you have a large unlabeled dataset (for unsupervised feature learning/pre-training) and only a relatively small labeled training set, then fine-tuning is significantly less likely to help.
