Linear Decoders
Sparse Autoencoder Recap
In the sparse autoencoder, we had 3 layers of neurons: an input layer, a hidden layer and an output layer. In our previous description of autoencoders (and of neural networks), every neuron in the neural network used the same activation function. In these notes, we describe a modified version of the autoencoder in which some of the neurons use a different activation function. This will result in a model that is sometimes simpler to apply, and can also be more robust to variations in the parameters.
Recall that each neuron (in the output layer) computed the following:

a(3) = f(z(3)) = f(W(2)a(2) + b(2))

where a(3) is the output. In the autoencoder, a(3) is our approximate reconstruction of the input x = a(1).
Because we used a sigmoid activation function for f(z(3)), we needed to constrain or scale the inputs to be in the range [0,1], since the sigmoid function outputs numbers in the range [0,1].
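To make the constraint concrete, here is a minimal sketch in Python/NumPy (not part of the original notes; the values are hypothetical) showing that a sigmoid output unit can only produce values in (0,1), which is why every input had to be scaled into that range:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # A sigmoid output unit squashes any pre-activation z(3) into (0, 1).
    z3 = np.linspace(-10, 10, 5)       # sample pre-activations of an output unit
    a3 = sigmoid(z3)                   # roughly [0.00005, 0.0067, 0.5, 0.993, 0.99995]

    # A raw, unscaled input value outside [0, 1] can never be reconstructed exactly.
    x = 3.7
    print(np.min(np.abs(a3 - x)))      # reconstruction error stays above 2.7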
Motivation: when every layer uses the same nonlinear (sigmoid) activation function, the nonlinear mapping at the output means the reconstruction cannot match an arbitrary, unscaled input.
Linear Decoder
One easy fix for this problem is to set a(3) = z(3). Formally, this is achieved by having the output nodes use an activation function that's the identity function f(z) = z, so that a(3) = f(z(3)) = z(3). In other words, the output nodes no longer use the sigmoid function.
This particular activation function f(z) = z is called the linear activation function. Note however that in the hidden layer of the network, we still use a sigmoid (or tanh) activation function, so that the hidden unit activations are given by (say)

a(2) = σ(W(1)x + b(1)),

where σ(z) = 1/(1 + exp(-z)) is the sigmoid function, x is the input, and W(1) and b(1) are the weight and bias terms for the hidden units. It is only in the output layer that we use the linear activation function.
An autoencoder in this configuration--with a sigmoid (or tanh) hidden layer and a linear output layer--is called a linear decoder.
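As a concrete illustration (a minimal sketch in Python/NumPy with hypothetical layer sizes and randomly initialized parameters, not code from the original notes), the forward pass of such a linear decoder looks like this:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    n_input, n_hidden = 192, 400        # hypothetical sizes (e.g. 8x8x3 patches, 400 hidden units)

    # Hypothetical parameters W(1), b(1), W(2), b(2)
    W1 = rng.normal(scale=0.01, size=(n_hidden, n_input))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.01, size=(n_input, n_hidden))
    b2 = np.zeros(n_input)

    x = rng.normal(size=n_input)        # real-valued input, no scaling to [0,1] required

    a2 = sigmoid(W1 @ x + b1)           # hidden layer: sigmoid activation
    z3 = W2 @ a2 + b2
    a3 = z3                             # output layer: linear (identity) activation

Because a3 = z3 is an affine function of the hidden activations, its components are not confined to [0,1].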
In this model, we have

x̂ = a(3) = z(3) = W(2)a(2) + b(2).

Because the output x̂ is now a linear function of the hidden unit activations, by varying W(2), each output unit a(3) can be made to produce values greater than 1 or less than 0 as well. This allows us to train the sparse autoencoder on real-valued inputs without needing to pre-scale every example to a specific range.
Since we have changed the activation function of the output units, the gradients of the output units also change. Recall that for each output unit, we had set the error terms as follows:

δi(3) = -(yi - x̂i) · f'(zi(3))

where y = x is the desired output, x̂ is the output of our autoencoder, and f is our activation function. Because in the output layer we now have f(z) = z, that implies f'(z) = 1 and thus the above now simplifies to:

δi(3) = -(yi - x̂i)

(This is the error term for the output nodes.)
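As a quick numeric check (hypothetical values, not from the original notes), the f'(z) factor really does drop out when the output activation is linear:

    # Hypothetical values for one output unit, to illustrate the simplification.
    y_i, x_hat_i = 0.8, 0.5            # target value and reconstructed value

    f_prime = 1.0                                # f(z) = z implies f'(z) = 1
    delta_general = -(y_i - x_hat_i) * f_prime   # general form of the error term
    delta_linear = -(y_i - x_hat_i)              # simplified form for a linear output unit
    assert delta_general == delta_linear         # both equal -0.3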
Of course, when using backpropagation to compute the error terms for the hidden layer:

δ(2) = ((W(2))T δ(3)) • f'(z(2))

(This is the error term for the hidden nodes; • denotes the element-wise product.)
Because the hidden layer is using a sigmoid (or tanh) activation f, in the equation above f'(·) should still be the derivative of the sigmoid (or tanh) function.
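Putting both error terms together, here is a minimal, self-contained NumPy sketch of the backward pass for a linear decoder (hypothetical sizes and parameters, not code from the original notes); note the sigmoid derivative for the hidden layer and the absence of any derivative factor for the linear output layer:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(1)
    x = rng.normal(size=6)                         # real-valued input (hypothetical size)
    W1, b1 = 0.01 * rng.normal(size=(4, 6)), np.zeros(4)
    W2, b2 = 0.01 * rng.normal(size=(6, 4)), np.zeros(6)

    # Forward pass
    z2 = W1 @ x + b1
    a2 = sigmoid(z2)                               # sigmoid hidden layer
    x_hat = W2 @ a2 + b2                           # linear output layer: a(3) = z(3)

    # Backward pass
    delta3 = -(x - x_hat)                          # output layer error term, f'(z) = 1
    delta2 = (W2.T @ delta3) * a2 * (1.0 - a2)     # hidden layer error term, f'(z2) = a2*(1-a2)

    # Gradients of the reconstruction cost with respect to the parameters
    grad_W2, grad_b2 = np.outer(delta3, a2), delta3
    grad_W1, grad_b1 = np.outer(delta2, x), delta2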