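Convolutional Neural Network (CNN): Principles and Implementation
This post looks at one application of deep learning, the Convolutional Neural Network, and works through some basic uses of it. It follows Lecun's Document 0.1, extends parts of it, and shows the results (in Python).
It is divided into the following parts:
1. Convolution
2. Pooling (downsampling)
3. CNN structure
4. Running the experiment
Each part is covered below.
PS: This blog post is reference material for the ese machine learning short course (class of 2014-05-16). It only sketches the most naive, simplest ideas and focuses on practice; the theory is covered in detail in class.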
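1. Convolution
Much like Gaussian filtering, the convolution is applied to every image in an image batch. For a single image, one filter combines all of its input feature maps into a single output feature map. The code below operates on an image batch of two images: each image starts with 3 feature maps (R, G, B) and is convolved with two 9*9 filters, so each image ends up with two feature maps.
The convolution is done with theano's conv.conv2d, using random parameters W and b. The result looks a bit like an edge detector, doesn't it?
Code: (see the comments for details)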
- # -*- coding: utf-8 -*-
- """
- Created on Sat May 10 18:55:26 2014
- @author: rachel
- Function: convolution operation on two pictures of the same size (width,height)
- input: 3 feature maps (3 channels <RGB> of a picture)
- convolution: two 9*9 convolutional filters
- """
- from theano.tensor.nnet import conv
- import theano.tensor as T
- import numpy, theano
- rng = numpy.random.RandomState(23455)
- # symbolic variable
- input = T.tensor4(name = 'input')
- # initial weights
- w_shape = (2,3,9,9) #2 convolutional filters, 3 channels, filter shape: 9*9
- w_bound = numpy.sqrt(3*9*9)
- W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound,size = w_shape),
- dtype = input.dtype),name = 'W')
- b_shape = (2,)
- b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
- dtype = input.dtype),name = 'b')
- conv_out = conv.conv2d(input,W)
- #T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
- #dimshuffle(self,*pattern)
- # >>>b1 = b.dimshuffle('x',0,'x','x')
- # >>>b1.shape.eval()
- # array([1,2,1,1])
- output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
- f = theano.function([input],output)
- # demo
- import pylab
- from PIL import Image
- #minibatch_img = T.tensor4(name = 'minibatch_img')
- #-------------img1---------------
- # Image.open accepts the path directly (a text-mode file object fails on Python 3)
- img1 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg')
- width1,height1 = img1.size
- img1 = numpy.asarray(img1, dtype = 'float32')/256. # (height, width, 3)
- # put image in 4D tensor of shape (1,3,height,width)
- img1_rgb = img1.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height1,width1) # (height,width,3) -> (1,3,height,width)
- #-------------img2---------------
- img2 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel1.jpg')
- width2,height2 = img2.size
- img2 = numpy.asarray(img2,dtype = 'float32')/256.
- img2_rgb = img2.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height2,width2) # (height,width,3) -> (1,3,height,width)
- #minibatch_img = T.join(0,img1_rgb,img2_rgb)
- minibatch_img = numpy.concatenate((img1_rgb,img2_rgb),axis = 0)
- filtered_img = f(minibatch_img)
- # plot the original images and the two convolved results for each
- pylab.subplot(2,3,1);pylab.axis('off');
- pylab.imshow(img1)
- pylab.subplot(2,3,4);pylab.axis('off');
- pylab.imshow(img2)
- pylab.gray()
- pylab.subplot(2,3,2); pylab.axis("off")
- pylab.imshow(filtered_img[0,0,:,:]) # image 0 of the minibatch, 1st filter
- pylab.subplot(2,3,3); pylab.axis("off")
- pylab.imshow(filtered_img[0,1,:,:]) # image 0 of the minibatch, 2nd filter
- pylab.subplot(2,3,5); pylab.axis("off")
- pylab.imshow(filtered_img[1,0,:,:]) # image 1 of the minibatch, 1st filter
- pylab.subplot(2,3,6); pylab.axis("off")
- pylab.imshow(filtered_img[1,1,:,:]) # image 1 of the minibatch, 2nd filter
- pylab.show()
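To make the shapes concrete, here is a minimal NumPy-only sketch of the same computation: a "valid" convolution of a batch of 3-channel images with two 9*9 filters, plus a broadcast bias and a sigmoid. The function names `conv2d_valid` and `sigmoid` and the toy 32*40 image size are assumptions made for illustration, and the kernel flip reflects my understanding that conv.conv2d computes a true convolution; this is not how theano implements it internally.

```python
import numpy as np

def conv2d_valid(images, filters):
    """Naive 'valid' 2D convolution (hypothetical helper, for illustration only).

    images:  (batch, in_channels, height, width)
    filters: (n_filters, in_channels, fh, fw)
    returns: (batch, n_filters, height - fh + 1, width - fw + 1)
    """
    batch, in_ch, h, w = images.shape
    n_filters, f_ch, fh, fw = filters.shape
    assert in_ch == f_ch, "filter depth must match the number of input channels"
    # A true convolution flips each kernel along both spatial axes
    flipped = filters[:, :, ::-1, ::-1]
    out_h, out_w = h - fh + 1, w - fw + 1
    out = np.zeros((batch, n_filters, out_h, out_w), dtype=images.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = images[:, :, i:i + fh, j:j + fw]  # (batch, in_ch, fh, fw)
            # Sum over channels and the spatial window -> (batch, n_filters)
            out[:, :, i, j] = np.tensordot(patch, flipped,
                                           axes=([1, 2, 3], [1, 2, 3]))
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.RandomState(23455)
# Toy batch standing in for the two RGB photos (assumed size 32*40)
images = rng.uniform(size=(2, 3, 32, 40)).astype("float32")
bound = 1.0 / np.sqrt(3 * 9 * 9)
W = rng.uniform(-bound, bound, size=(2, 3, 9, 9)).astype("float32")
b = rng.uniform(-0.5, 0.5, size=(2,)).astype("float32")

# b.reshape(1, 2, 1, 1) plays the role of b.dimshuffle('x', 0, 'x', 'x')
feature_maps = sigmoid(conv2d_valid(images, W) + b.reshape(1, 2, 1, 1))
print(feature_maps.shape)  # (2, 2, 24, 32): 2 images, 2 feature maps each
```

Each output map is 8 pixels smaller than the input in both directions (9 - 1), which is why the feature maps in the plot are slightly cropped versions of the originals.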
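2. Pooling (downsampling)
The most commonly used variant is max pooling. It addresses two issues:
1. It reduces the amount of computation
2. Rotation invariance, at least for small rotations (work out why for yourself)
PS: On rotation invariance, recall SIFT and LBP, which use a dominant orientation, and HOG, which uses templates at different orientations.
Max pooling with a 2*2 window halves both the width and the height of each feature map, so the pixel count drops to a quarter. (The result figure below does not show this, because Python automatically stretches the plots to the same display size, but the pooled maps really do contain fewer pixels.)
Code: (see the comments for details)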
- # -*- coding: utf-8 -*-
- """
- Created on Sat May 10 18:55:26 2014
- @author: rachel
- Function: convolution followed by 2*2 max pooling (downsampling)
- input: 3 feature maps (3 channels <RGB> of a picture)
- convolution: two 9*9 convolutional filters
- """
- from theano.tensor.nnet import conv
- import theano.tensor as T
- import numpy, theano
- rng = numpy.random.RandomState(23455)
- # symbolic variable
- input = T.tensor4(name = 'input')
- # initial weights
- w_shape = (2,3,9,9) #2 convolutional filters, 3 channels, filter shape: 9*9
- w_bound = numpy.sqrt(3*9*9)
- W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound,size = w_shape),
- dtype = input.dtype),name = 'W')
- b_shape = (2,)
- b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
- dtype = input.dtype),name = 'b')
- conv_out = conv.conv2d(input,W)
- #T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
- #dimshuffle(self,*pattern)
- # >>>b1 = b.dimshuffle('x',0,'x','x')
- # >>>b1.shape.eval()
- # array([1,2,1,1])
- output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
- f = theano.function([input],output)
- # demo
- import pylab
- from PIL import Image
- from matplotlib.pyplot import *
- # open a local image; Image.open accepts the path directly (a text-mode file object fails on Python 3)
- img = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg')
- width,height = img.size
- img = numpy.asarray(img, dtype = 'float32')/256. # (height, width, 3)
- # put image in 4D tensor of shape (1,3,height,width)
- img_rgb = img.swapaxes(0,2).swapaxes(1,2) #(3,height,width)
- minibatch_img = img_rgb.reshape(1,3,height,width)
- filtered_img = f(minibatch_img)
- # plot the original image and the two convolved results
- pylab.figure(1)
- pylab.subplot(1,3,1);pylab.axis('off');
- pylab.imshow(img)
- title('origin image')
- pylab.gray()
- pylab.subplot(2,3,2); pylab.axis("off")
- pylab.imshow(filtered_img[0,0,:,:]) # image 0 of the minibatch, 1st filter
- title('convolution 1')
- pylab.subplot(2,3,3); pylab.axis("off")
- pylab.imshow(filtered_img[0,1,:,:]) # image 0 of the minibatch, 2nd filter
- title('convolution 2')
- #pylab.show()
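- # max pooling
- # note: later Theano releases are understood to provide this as theano.tensor.signal.pool.pool_2d instead
- from theano.tensor.signal import downsample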
- input = T.tensor4('input')
- maxpool_shape = (2,2)
- pooled_img = downsample.max_pool_2d(input,maxpool_shape,ignore_border = False)
- maxpool = theano.function(inputs = [input],
- outputs = [pooled_img])
- pooled_res = numpy.squeeze(maxpool(filtered_img))
- #pylab.figure(2)
- pylab.subplot(235);pylab.axis('off');
- pylab.imshow(pooled_res[0,:,:])
- title('down sampled 1')
- pylab.subplot(236);pylab.axis('off');
- pylab.imshow(pooled_res[1,:,:])
- title('down sampled 2')
- pylab.show()
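The halving of width and height is easy to check on a tiny array. Below is a minimal NumPy sketch of non-overlapping 2*2 max pooling; the helper name `max_pool_2x2` is hypothetical, and unlike the theano call above (which uses ignore_border = False) this sketch simply drops an odd trailing row or column.

```python
import numpy as np

def max_pool_2x2(feature_maps):
    """Non-overlapping 2x2 max pooling over the last two axes.

    feature_maps: (batch, channels, height, width). An odd trailing row or
    column is dropped, a simplifying assumption of this sketch.
    """
    b, c, h, w = feature_maps.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_maps[:, :, :h2 * 2, :w2 * 2]
    # Split each spatial axis into (blocks, 2), then take the max per block
    blocks = trimmed.reshape(b, c, h2, 2, w2, 2)
    return blocks.max(axis=(3, 5))

x = np.arange(16, dtype="float32").reshape(1, 1, 4, 4)
pooled = max_pool_2x2(x)
print(pooled.shape)  # (1, 1, 2, 2): both sides halved, a quarter of the pixels
print(pooled[0, 0])  # the block maxima 5, 7, 13, 15
```

The 4*4 input has 16 values and the pooled output has 4, which is the "quarter of the pixels" effect that the stretched plots hide.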
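3. CNN structure
CNN diagrams are a dime a dozen on Google, so here is one I drew back when I was learning CNNs; with some explanation alongside, I think it is fairly easy to follow.
Enough talk, here is the LeNet architecture diagram (read it from bottom to top following the arrows; the bottom layer is the original input):
4. CNN code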
- rng = numpy.random.RandomState(23455)
- # transform x from (batchsize, 28*28) to (batchsize, feature, 28, 28)
- # I_shape = (28,28), F_shape = (5,5)
- N_filters_0 = 20
- D_features_0= 1
- layer0_input = x.reshape((batch_size,D_features_0,28,28))
- layer0 = LeNetConvPoolLayer(rng, input = layer0_input, filter_shape = (N_filters_0,D_features_0,5,5),
- image_shape = (batch_size,1,28,28))
- # layer0.output: (batch_size, N_filters_0, (28-5+1)/2, (28-5+1)/2) -> (20,20,12,12)
- N_filters_1 = 50
- D_features_1 = N_filters_0
- layer1 = LeNetConvPoolLayer(rng,input = layer0.output, filter_shape = (N_filters_1,D_features_1,5,5),
- image_shape = (batch_size,N_filters_0,12,12))
- # layer1.output: (batch_size, N_filters_1, (12-5+1)/2, (12-5+1)/2) -> (20,50,4,4)
- layer2_input = layer1.output.flatten(2) # (20,50,4,4)->(20,(50*4*4))
- layer2 = HiddenLayer(rng,layer2_input,n_in = 50*4*4,n_out = 500, activation = T.tanh)
- layer3 = LogisticRegression(input = layer2.output, n_in = 500, n_out = 10)
layer0, layer1: each is a convolution followed by downsampling (pooling)
layer2 + layer3: together they form an MLP (ANN)
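The shape comments in the code can be reproduced with a few lines of arithmetic. The sketch below traces the feature-map size through the two convolution + pooling layers and derives the n_in of the hidden layer. The helper `conv_pool_output_size` is hypothetical, and batch_size = 20 is an assumption read off the (20,50,4,4) comment rather than something the post states.

```python
def conv_pool_output_size(size, filter_size, pool_size=2):
    """Side length after a 'valid' convolution and non-overlapping pooling."""
    return (size - filter_size + 1) // pool_size

batch_size = 20        # assumed from the (20,50,4,4) comment in the code
side = 28              # MNIST images are 28*28
n_filters = [20, 50]   # N_filters_0, N_filters_1

shapes = []
for n in n_filters:
    side = conv_pool_output_size(side, filter_size=5)
    shapes.append((batch_size, n, side, side))

# flatten(2) keeps the batch axis and merges the rest into one feature axis
flat_features = n_filters[-1] * side * side

print(shapes)         # [(20, 20, 12, 12), (20, 50, 4, 4)]
print(flat_features)  # 800, i.e. n_in = 50*4*4 for the hidden layer
```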
Training the model:
- cost = layer3.negative_log_likelihood(y)
- params = layer3.params + layer2.params + layer1.params + layer0.params
- gparams = T.grad(cost,params)
- updates = []
- for par,gpar in zip(params,gparams):
-     updates.append((par, par - learning_rate * gpar))
- train_model = theano.function(inputs = [minibatch_index],
- outputs = [cost],
- updates = updates,
- givens = {x: train_set_x[minibatch_index * batch_size : (minibatch_index+1) * batch_size],
- y: train_set_y[minibatch_index * batch_size : (minibatch_index+1) * batch_size]})
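The parameters of all layers are trained against the cost, which is the NLL (negative log-likelihood) output by the topmost MLP layer.
See the code and comments for the rest.
PS: the data is the full MNIST dataset.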
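The update rule itself is plain gradient descent on the NLL. As a rough illustration, the NumPy sketch below applies the same `par - learning_rate * gpar` step to a toy softmax classifier, with the gradient written out by hand instead of obtained from T.grad. The toy data, the sizes, and the helper names `softmax` and `nll` are all assumptions made for this example, not part of the original code.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(p, y):
    """Mean negative log-likelihood of the correct classes."""
    return -np.mean(np.log(p[np.arange(len(y)), y]))

rng = np.random.RandomState(0)
x = rng.randn(20, 5)            # toy minibatch: 20 samples, 5 features
y = rng.randint(0, 3, size=20)  # 3 classes
W = np.zeros((5, 3))
b = np.zeros(3)
learning_rate = 0.1

losses = []
for step in range(50):
    p = softmax(x @ W + b)
    losses.append(nll(p, y))
    # Gradient of the mean NLL w.r.t. the logits: (p - one_hot(y)) / batch
    grad_logits = p.copy()
    grad_logits[np.arange(len(y)), y] -= 1.0
    grad_logits /= len(y)
    gparams = [x.T @ grad_logits, grad_logits.sum(axis=0)]
    # Same rule as the updates list: par <- par - learning_rate * gpar
    for par, gpar in zip([W, b], gparams):
        par -= learning_rate * gpar

print(losses[0], losses[-1])  # the cost starts at log(3) and goes down
```

In the theano version the gradients for the convolutional layers come from the same cost through automatic differentiation, so no per-layer gradient code is needed.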
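Output of the run (the percentages look like error rates rather than accuracies, assuming the script logs results the way the usual Theano tutorial code does):
Optimization complete. Best validation score of 0.990000 % obtained at iteration 122500, with test performance 0.950000 %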