Preface

Environment: Windows 7, MATLAB R2015b, 16 GB RAM, 2 TB hard disk

Exercise content and steps: Exercise: Self-Taught Learning. The steps are as follows:

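Step 1: train a sparse autoencoder on the 29404 unlabeled examples unlabeledData (the digits 5-9 from the MNIST handwritten digit dataset) to obtain its weight parameters opttheta. The purpose of this step is to learn features of the data. We do not know in advance which features it learns (they can be inspected by visualizing the weights; call them Features), but we do know that the feature representation of an input is simply the activation of the hidden layer (layer 2) of the trained sparse autoencoder. Note: every sparse autoencoder in this exercise is trained with the L-BFGS algorithm.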

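Step 2: pass the 15298 labeled examples trainData (the first half of the MNIST digits 0-4) as the training set through the trained sparse autoencoder (the one with weights opttheta) to extract the same features as in step 1. Call the resulting representation trainFeatures; it is again the hidden-layer activation. An analogy, if this is still unclear: suppose step 1 extracted the first-order cumulant of a communication signal A (corresponding to unlabeledData); this step then extracts the first-order cumulant of a signal B (corresponding to trainData). The feature is the same, only the object differs. Likewise, unlabeledData and trainData are mapped to the same Features; only the data differ.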

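Note: if unlabeledData was preprocessed in step 1, every preprocessing parameter (for example the PCA basis U) must be saved, because the training set trainData in this step and the test set testData in the next step must go through exactly the same preprocessing. In this exercise the MNIST images are already preprocessed, so no further preprocessing is needed.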

For details see: http://ufldl.stanford.edu/wiki/index.php/%E8%87%AA%E6%88%91%E5%AD%A6%E4%B9%A0
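The note above can be sketched as follows. This is a minimal illustration, assuming mean removal followed by a PCA projection as the preprocessing; the names mu, U and k and the random stand-in data are hypothetical and not part of the exercise, which uses MNIST without further preprocessing.

```matlab
% Illustrative only: random matrices stand in for real image data
% (one example per column, as in the exercise).
unlabeledData = rand(4, 50);
trainData     = rand(4, 20);
testData      = rand(4, 20);

% Fit the preprocessing parameters on the unlabeled data only.
mu       = mean(unlabeledData, 2);                          % per-feature mean
centered = unlabeledData - repmat(mu, 1, size(unlabeledData, 2));
sigma    = centered * centered' / size(centered, 2);        % covariance matrix
[U, S, V] = svd(sigma);                                     % columns of U: principal directions
k = 2;                                                      % number of components kept (assumed)

% Reuse the SAME mu and U for every data set.
unlabeledRot = U(:, 1:k)' * centered;
trainRot     = U(:, 1:k)' * (trainData - repmat(mu, 1, size(trainData, 2)));
testRot      = U(:, 1:k)' * (testData  - repmat(mu, 1, size(testData, 2)));
```

Recomputing mu or U on trainData or testData would put the three sets in different coordinate systems, so features learned on one set would not carry over to the others.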

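Step 3: pass the 15298 labeled examples testData (the second half of the MNIST digits 0-4) as the test set through the same trained sparse autoencoder (the one with weights opttheta) to extract the same features as in the previous step. Call the resulting representation testFeatures; it is again the hidden-layer activation.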

Step 4: train a softmax classifier on the features trainFeatures from step 2 together with the labels trainLabels of trainData, obtaining the regression model softmaxModel.

Step 5: feed the features testFeatures from step 3 into the trained softmax model softmaxModel to predict the classes pred of testData, then compare pred with the true labels testLabels to obtain the accuracy.

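In summary, self-taught learning uses unlabeled data to learn a feature representation without supervision, and then trains a classifier with supervision on the features extracted from the labeled data.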
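The way the labeled and unlabeled sets are simulated from one label vector can be sketched on toy data. The label values below are made up for illustration and merely stand in for mnistLabels; the variable names follow stlExercise.m.

```matlab
% Toy labels standing in for mnistLabels (values 0-9); illustrative only.
labels = [0 5 3 9 4 1 7 2 6 0]';

labeledSet   = find(labels >= 0 & labels <= 4);  % indices of digits 0-4
unlabeledSet = find(labels >= 5);                % indices of digits 5-9

numTrain = round(numel(labeledSet) / 2);         % first half goes to training
trainSet = labeledSet(1:numTrain);
testSet  = labeledSet(numTrain+1:end);

trainLabels = labels(trainSet)' + 1;             % shift labels 0-4 to 1-5
testLabels  = labels(testSet)' + 1;
```

Here six of the ten toy examples are labeled, so three go to training and three to testing, while the four examples with digits 5-9 form the unlabeled set.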

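When this method applies:

It is likely to be most effective when a large amount of unlabeled data and only a small amount of labeled data are available. Even when only labeled data are available (in which case the class labels are usually ignored during feature learning), the same idea can still give good results.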

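Some MATLAB functions

numel: returns the total number of elements.

n = numel(A) returns the total number of elements in array A.

s = size(A): with a single output, returns a row vector; for a matrix, its first element is the number of rows and its second element is the number of columns.

[r,c] = size(A): with two outputs, size returns the number of rows in the first output and the number of columns in the second.

round(n) rounds to the nearest integer, just like ordinary rounding in arithmetic; ties such as 2.5 are rounded away from zero.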
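A short sketch of these three functions on a small matrix; the matrix A and the variable names are chosen for illustration only.

```matlab
A = [1 2 3; 4 5 6];          % 2-by-3 matrix

n = numel(A);                % total number of elements: 6
s = size(A);                 % row vector [2 3]
[r, c] = size(A);            % r = 2 rows, c = 3 columns

up  = round(numel(A) / 4);   % 6/4 = 1.5 rounds to 2
neg = round(-2.5);           % ties round away from zero: -3
```

The exercise combines them in round(numel(labeledSet)/2) to find the midpoint of the labeled index list.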

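find

Finds the indices and values of nonzero elements.

Syntax:

1. ind = find(X)

2. ind = find(X, k)

3. ind = find(X, k, 'first')

4. ind = find(X, k, 'last')

5. [row,col] = find(X, ...)

6. [row,col,v] = find(X, ...)

Description:

1. ind = find(X)

Locates all nonzero elements of X and returns their linear indices (column-major order) in the vector ind.

If X is a row vector, ind is a row vector; otherwise ind is a column vector.

If X contains no nonzero elements or is empty, ind is an empty array.

2. ind = find(X, k) or 3. ind = find(X, k, 'first')

Returns at most the first k indices corresponding to nonzero elements of X.

k must be a positive integer, but it can be of any numeric type.

4. ind = find(X, k, 'last')

Returns at most the last k indices corresponding to nonzero elements of X.

5. [row,col] = find(X, ...)

Returns the row and column indices of the nonzero elements of X.

This syntax is especially useful when working with sparse matrices.

If X is an N-dimensional array with N > 2, col contains linear indices over the trailing dimensions.

For example, for a 5-by-7-by-3 array X with a single nonzero element X(4,2,3), find returns row = 4 and col = 16. That is, (7 columns in page 1) + (7 columns in page 2) + (2 columns in page 3) = 16.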
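The N-dimensional case can be checked directly. The sketch below builds the 5-by-7-by-3 array from the example; the call to ind2sub is an addition for illustration, showing one way to recover the page index that [row,col] = find(...) folds into col.

```matlab
X = zeros(5, 7, 3);
X(4, 2, 3) = 1;                      % the single nonzero element

[row, col] = find(X);                % row = 4, col = 16
% col counts columns across pages: 7 (page 1) + 7 (page 2) + 2 (page 3)

% Recover all three subscripts from the linear index instead.
[r, c, p] = ind2sub(size(X), find(X));   % r = 4, c = 2, p = 3
```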

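6. [row,col,v] = find(X, ...)

Returns a column or row vector v of the nonzero elements of X, together with their row and column indices.

If X is a logical expression, v is a logical array.

The output v then contains the nonzero elements of the logical array obtained by evaluating the expression X.

For example,

A= magic(4)
A =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1

[r,c,v]= find(A>10);

r', c', v'
ans =
1 2 4 4 1 3 (column-major order)
ans =
1 2 2 3 4 4 (column-major order)
ans =
1 1 1 1 1 1

Here the returned vector v is a logical array containing the N nonzero elements of A>10, where N = nnz(A>10) = 6.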

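Examples:

Example 1

X = [1 0 4 -3 0 0 0 8 6];
indices = find(X)

returns the linear indices of the nonzero elements of X.

indices =
1 3 4 8 9

Example 2

X can also be given as a logical expression. For example,

find(X > 2)

returns the linear indices of the elements of X that are greater than 2.

ans =
3 8 9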
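The k and 'last' forms from the syntax list have no example above, so here is a brief sketch reusing the vector X from Example 1; the output variable names are illustrative.

```matlab
X = [1 0 4 -3 0 0 0 8 6];

firstTwo = find(X, 2);            % indices of the first two nonzeros: 1 and 3
lastTwo  = find(X, 2, 'last');    % indices of the last two nonzeros: 8 and 9
firstBig = find(X > 2, 1);        % first position where X exceeds 2: 3
```

Passing k = 1 is a common way to stop the search at the first match rather than collecting every index.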

unique:

unique returns the distinct elements of a vector, sorted in ascending order.
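A small sketch of how unique is used to count classes; the label vector is made up for illustration and stands in for the shifted trainLabels.

```matlab
trainLabels = [3 1 5 3 2 1 4 5];     % toy labels in the range 1-5

classes    = unique(trainLabels);    % distinct values, sorted: 1 2 3 4 5
numClasses = numel(classes);         % number of classes: 5
```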

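Results

The visualization of W1 from the weight parameters opttheta, i.e. of the learned features:

[Figure: visualization of W1, the learned features]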

Test Accuracy: 98.333115%

Elapsed time is 594.435594 seconds.

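Summary of results:

1. Why does training take Andrew Ng's group about 25 minutes, while my entire run took about 10 minutes (594 seconds)? Most likely because computers a few years ago were much slower than today's.

2. For comparison, Andrew Ng's group reports that training the softmax classifier on the raw pixel values, rather than on the features learned by the sparse autoencoder, gives an accuracy of at most about 96%. In fact, the only difference between this exercise and the previous one, Deep Learning 6: Softmax Regression_Exercise (Stanford UFLDL deep learning tutorial), is that here the softmax classifier is trained on features extracted by the sparse autoencoder, whereas there it was trained on the raw data. In the previous exercise the accuracy I obtained was only 92.640%, although Andrew Ng's group may have reached up to 96%.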

Code

stlExercise.m

%% CS294A/CS294W Self-taught Learning Exercise
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  self-taught learning. You will need to complete code in feedForwardAutoencoder.m
%  You will also need to have implemented sparseAutoencoderCost.m and
%  softmaxCost.m from previous exercises.
%
%% ======================================================================
%  STEP 0: Here we provide the relevant parameters values that will
%  allow your sparse autoencoder to get good filters; you do not need to
%  change the parameters below.
%  (The numeric values are those of the UFLDL starter code.)
tic
inputSize  = 28 * 28;
numLabels  = 5;
hiddenSize = 200;
sparsityParam = 0.1; % desired average activation of the hidden units.
                     % (This was denoted by the Greek alphabet rho, which looks like a lower-case "p",
                     %  in the lecture notes).
lambda = 3e-3;       % weight decay parameter
beta = 3;            % weight of sparsity penalty term
maxIter = 400;

%% ======================================================================
%  STEP 1: Load data from the MNIST database
%
%  This loads our training and test data from the MNIST database files.
%  We have sorted the data for you in this so that you will not have to
%  change it.

% Load MNIST database files
mnistData   = loadMNISTImages('train-images.idx3-ubyte');
mnistLabels = loadMNISTLabels('train-labels.idx1-ubyte');

% Set Unlabeled Set (All Images)

% Simulate a Labeled and Unlabeled set
labeledSet   = find(mnistLabels >= 0 & mnistLabels <= 4); % indices of the labels with values 0 to 4
unlabeledSet = find(mnistLabels >= 5);                    % indices of the labels with values 5 to 9
numTrain = round(numel(labeledSet)/2);
trainSet = labeledSet(1:numTrain);
testSet  = labeledSet(numTrain+1:end);

unlabeledData = mnistData(:, unlabeledSet); % unlabeled data set
trainData   = mnistData(:, trainSet);       % first half of the digits 0-4: labeled training data
trainLabels = mnistLabels(trainSet)' + 1;   % Shift Labels to the Range 1-5
testData   = mnistData(:, testSet);         % second half of the digits 0-4: labeled test data
testLabels = mnistLabels(testSet)' + 1;     % Shift Labels to the Range 1-5

% Output Some Statistics
fprintf('# examples in unlabeled set: %d\n', size(unlabeledData, 2));
fprintf('# examples in supervised training set: %d\n\n', size(trainData, 2));
fprintf('# examples in supervised testing set: %d\n\n', size(testData, 2));

%% ======================================================================
%  STEP 2: Train the sparse autoencoder
%  This trains the sparse autoencoder on the unlabeled training
%  images.

%  Randomly initialize the parameters (uniform distribution)
theta = initializeParameters(hiddenSize, inputSize);

%% ----------------- YOUR CODE HERE ----------------------
%  Find opttheta by running the sparse autoencoder on
%  unlabeledTrainingImages
%  Train the sparse autoencoder on the unlabeled data set with L-BFGS
opttheta = theta;
addpath minFunc/
options.Method = 'lbfgs';
options.maxIter = maxIter;   % maxIter is set in STEP 0 (400 in the starter code)
options.display = 'on';
[opttheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
      inputSize, hiddenSize, ...
      lambda, sparsityParam, ...
      beta, unlabeledData), ...
      theta, options);
%% -----------------------------------------------------

% Visualize weights
W1 = reshape(opttheta(1:hiddenSize * inputSize), hiddenSize, inputSize);
display_network(W1');

%%======================================================================
%% STEP 3: Extract Features from the Supervised Dataset
%
%  You need to complete the code in feedForwardAutoencoder.m so that the
%  following command will extract features from the data.
trainFeatures = feedForwardAutoencoder(opttheta, hiddenSize, inputSize, ...
                                       trainData);
testFeatures = feedForwardAutoencoder(opttheta, hiddenSize, inputSize, ...
                                       testData);

%%======================================================================
%% STEP 4: Train the softmax classifier
softmaxModel = struct;
%% ----------------- YOUR CODE HERE ----------------------
%  Use softmaxTrain.m from the previous exercise to train a multi-class
%  classifier.
%  Train the softmax regression model with L-BFGS on the features extracted
%  from the labeled training set, together with their labels.
%  Use lambda = 1e-4 for the weight regularization for softmax
lambda = 1e-4;
inputSize = hiddenSize;
numClasses = numel(unique(trainLabels)); % unique returns the distinct elements, sorted

% You need to compute softmaxModel using softmaxTrain on trainFeatures and
% trainLabels
options.maxIter = 100; % maximum number of iterations (assumed: 100, as in the softmax exercise)
softmaxModel = softmaxTrain(inputSize, numClasses, lambda, ...
                            trainFeatures, trainLabels, options);
%% -----------------------------------------------------

%%======================================================================
%% STEP 5: Testing
%% ----------------- YOUR CODE HERE ----------------------
% Compute Predictions on the test set (testFeatures) using softmaxPredict
% and softmaxModel
[pred] = softmaxPredict(softmaxModel, testFeatures);
%% -----------------------------------------------------

% Classification Score
fprintf('Test Accuracy: %f%%\n', 100*mean(pred(:) == testLabels(:)));
toc
% (note that we shift the labels by 1, so that digit 0 now corresponds to
%  label 1)
%
% Accuracy is the proportion of correctly classified images
% The results for our implementation was:
%
% Accuracy: 98.3%
%
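The accuracy expression in the testing step can be tried on toy data. The prediction and label vectors below are made up for illustration; one is a row and the other a column on purpose, to show why both are flattened with (:) before comparing.

```matlab
pred       = [1 2 3 4 5 1 2 3];      % toy predictions (row vector)
testLabels = [1 2 3 4 5 2 2 1]';     % toy true labels (column vector)

% (:) turns both into columns, so == compares element by element.
correct = pred(:) == testLabels(:);
acc = 100 * mean(correct);           % 6 of 8 correct, i.e. 75 percent
```

Without the (:), comparing a row against a column of different orientation either raises an error or, in versions with implicit expansion, produces a full matrix of comparisons, which would silently give a wrong accuracy.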

feedForwardAutoencoder.m

function [activation] = feedForwardAutoencoder(theta, hiddenSize, visibleSize, data)
% theta: trained weights from the autoencoder
% visibleSize: the number of input units (probably 64)
% hiddenSize: the number of hidden units (probably 25)
% data: Our matrix containing the training data as columns.  So, data(:,i) is the i-th training example.

% We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this
% follows the notation convention of the lecture notes.
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the activation of the hidden layer for the Sparse Autoencoder.
activation = sigmoid(W1*data + repmat(b1, [1, size(data, 2)]));
%-------------------------------------------------------------------
end

%-------------------------------------------------------------------
% Here's an implementation of the sigmoid function, which you may find useful
% in your computation of the costs and the gradients.  This inputs a (row or
% column) vector (say (z1, z2, z3)) and returns (f(z1), f(z2), f(z3)).
function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end
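A quick sanity check of the feed-forward computation on a tiny network, written inline so it runs without the exercise files. The sizes, the synthetic theta and the bsxfun variant are assumptions made for illustration; the parameter layout [W1(:); W2(:); b1(:); b2(:)] is the one the indexing in feedForwardAutoencoder.m implies.

```matlab
visibleSize = 4; hiddenSize = 3; m = 5;          % tiny illustrative sizes

% Synthetic parameter vector laid out as [W1(:); W2(:); b1(:); b2(:)].
numParams = 2*hiddenSize*visibleSize + hiddenSize + visibleSize;
theta = 0.01 * (1:numParams)';
data  = rand(visibleSize, m);                    % one example per column

W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
b1 = theta(2*hiddenSize*visibleSize+1 : 2*hiddenSize*visibleSize+hiddenSize);

% Same computation as in feedForwardAutoencoder.m (repmat version).
reference  = 1 ./ (1 + exp(-(W1*data + repmat(b1, [1, m]))));

% Alternative: bsxfun adds b1 to every column without copying it m times.
activation = 1 ./ (1 + exp(-bsxfun(@plus, W1*data, b1)));
```

The result has one row per hidden unit and one column per example, with every entry strictly between 0 and 1, which is the shape softmaxTrain expects for trainFeatures.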

References:

http://www.cnblogs.com/tornadomeet/archive/2013/03/24/2979408.html

UFLDL Tutorial

……

