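There is no need to repeat how the convolution operation itself works. Instead, let's walk through the code step by step to see how a convolutional layer is implemented.

Code source: https://github.com/eriklindernoren/ML-From-Scratch

Start with the basic helper functions. The first is determine_padding(filter_shape, output_shape="same"):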

import math

def determine_padding(filter_shape, output_shape="same"):
    # No padding
    if output_shape == "valid":
        return (0, 0), (0, 0)
    # Pad so that the output shape is the same as input shape (given that stride=1)
    elif output_shape == "same":
        filter_height, filter_width = filter_shape

        # Derived from:
        # output_height = (height + pad_h - filter_height) / stride + 1
        # In this case output_height = height and stride = 1. This gives the
        # expression for the padding below.
        pad_h1 = int(math.floor((filter_height - 1)/2))
        pad_h2 = int(math.ceil((filter_height - 1)/2))
        pad_w1 = int(math.floor((filter_width - 1)/2))
        pad_w2 = int(math.ceil((filter_width - 1)/2))

        return (pad_h1, pad_h2), (pad_w1, pad_w2)

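Note: this computes the padding from the filter shape and the padding mode, returning the (top, bottom) and (left, right) amounts. output_shape="valid" means no padding.

Supplement:

  • math.floor(x) returns the largest integer less than or equal to x.
  • math.ceil(x) returns the smallest integer greater than or equal to x.

Plugging in actual arguments:

pad_h,pad_w=determine_padding((3,3), output_shape="same")

Output: (1,1),(1,1)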
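To see what the floor/ceil split does for other filter sizes, here is a small check. It assumes Python 3 (where `/` is true division) and restates the "same" branch in a hypothetical helper, `same_padding`, so the snippet runs on its own. For odd filter sizes the padding is symmetric; for even sizes the extra row or column goes on the second (bottom or right) side.

```python
import math

def same_padding(filter_size):
    """Hypothetical helper: (before, after) padding for one axis, stride 1."""
    total = filter_size - 1
    return int(math.floor(total / 2)), int(math.ceil(total / 2))

for size in (1, 2, 3, 4, 5):
    before, after = same_padding(size)
    # before + after always equals filter_size - 1,
    # which is what keeps the output size equal to the input size.
    print(size, (before, after))
```

A 3×3 filter gives (1, 1), matching the output above, while a 4×4 filter gives (1, 2).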

Next is the image_to_column(images, filter_shape, stride, output_shape='same') function:

import numpy as np

def image_to_column(images, filter_shape, stride, output_shape='same'):
    filter_height, filter_width = filter_shape
    pad_h, pad_w = determine_padding(filter_shape, output_shape)
    # Add padding to the image
    images_padded = np.pad(images, ((0, 0), (0, 0), pad_h, pad_w), mode='constant')
    # Calculate the indices where the dot products are to be applied between weights
    # and the image
    k, i, j = get_im2col_indices(images.shape, filter_shape, (pad_h, pad_w), stride)
    # Get content from image at those indices
    cols = images_padded[:, k, i, j]
    channels = images.shape[1]
    # Reshape content into column shape
    cols = cols.transpose(1, 2, 0).reshape(filter_height * filter_width * channels, -1)
    return cols

Note: the input images has shape [batch_size, channels, height, width], the same layout PyTorch uses for image input. images_padded is therefore padded only along height and width. The function calls get_im2col_indices(), so let's look at that next:

def get_im2col_indices(images_shape, filter_shape, padding, stride=1):
    # First figure out what the size of the output should be
    batch_size, channels, height, width = images_shape
    filter_height, filter_width = filter_shape
    pad_h, pad_w = padding
    out_height = int((height + np.sum(pad_h) - filter_height) / stride + 1)
    out_width = int((width + np.sum(pad_w) - filter_width) / stride + 1)

    i0 = np.repeat(np.arange(filter_height), filter_width)
    i0 = np.tile(i0, channels)
    i1 = stride * np.repeat(np.arange(out_height), out_width)
    j0 = np.tile(np.arange(filter_width), filter_height * channels)
    j1 = stride * np.tile(np.arange(out_width), out_height)
    i = i0.reshape(-1, 1) + i1.reshape(1, -1)
    j = j0.reshape(-1, 1) + j1.reshape(1, -1)
    k = np.repeat(np.arange(channels), filter_height * filter_width).reshape(-1, 1)

    return (k, i, j)

Note: this is hard to follow in isolation, so let's again step through it with concrete arguments.

get_im2col_indices((1,3,32,32), (3,3), ((1,1),(1,1)), stride=1)

Note: look at how each variable evolves. out_height and out_width need little explanation: they are the height and width of the feature map produced by the convolution.

  • i0: np.repeat(np.arange(3),3): [0,0,0,1,1,1,2,2,2]
  • i0: np.tile([0,0,0,1,1,1,2,2,2],3): [0,0,0,1,1,1,2,2,2,0,0,0,1,1,1,2,2,2,0,0,0,1,1,1,2,2,2], shape (27,)
  • i1: 1*np.repeat(np.arange(32),32): [0,0,0,......,31,31,31], shape (1024,)
  • j0: np.tile(np.arange(3),3*3): [0,1,2,0,1,2,......], shape (27,)
  • j1: 1*np.tile(np.arange(32),32): [0,1,2,3,......,0,1,2,......,29,30,31], shape (1024,)
  • i: i0.reshape(-1,1)+i1.reshape(1,-1): shape (27,1024)
  • j: j0.reshape(-1,1)+j1.reshape(1,-1): shape (27,1024)
  • k: np.repeat(np.arange(3),3*3).reshape(-1,1): shape (27,1)
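The shapes are easier to interpret on a case small enough to print. The sketch below uses a toy setting chosen only for illustration (one 3×3 single-channel image, a 2×2 filter, no padding, stride 1) and builds i and j the same way as get_im2col_indices.

```python
import numpy as np

# Toy setting (an assumption for illustration): one 3x3 single-channel image,
# a 2x2 filter, no padding, stride 1, so the output is 2x2.
image = np.arange(9).reshape(3, 3)
filter_height = filter_width = 2
out_height = out_width = 2

i0 = np.repeat(np.arange(filter_height), filter_width)   # row offset inside the patch
j0 = np.tile(np.arange(filter_width), filter_height)     # column offset inside the patch
i1 = np.repeat(np.arange(out_height), out_width)         # top row of each patch
j1 = np.tile(np.arange(out_width), out_height)           # left column of each patch

i = i0.reshape(-1, 1) + i1.reshape(1, -1)   # shape (4, 4)
j = j0.reshape(-1, 1) + j1.reshape(1, -1)   # shape (4, 4)

cols = image[i, j]
print(cols)
# Column n holds the flattened 2x2 patch at output position n,
# e.g. cols[:, 0] is [0, 1, 3, 4], the top-left patch of the image.
```

Rows of i and j enumerate positions inside the filter window (i0, j0), and columns enumerate where the window is placed (i1, j1); adding them by broadcasting gives every (row, column) pair the filter ever touches.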

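Supplement:

  • numpy.pad(array, pad_width, mode='constant', **kwargs): array is the data to pad; pad_width gives the number of values to add before and after each axis; mode selects how the padding is filled. The default mode is 'constant', which fills with constant_values (0 by default).
  • numpy.arange(start, stop, step, dtype=None): for example, numpy.arange(3) gives [0,1,2]
  • numpy.repeat(array, repeats, axis=None): for example, numpy.repeat([0,1,2],3) gives [0,0,0,1,1,1,2,2,2]
  • numpy.tile(array, reps): for example, numpy.tile([0,1,2],3) gives [0,1,2,0,1,2,0,1,2]
  • For more advanced usage, consult the NumPy documentation; only what this code needs is listed here.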
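As a quick illustration of the numpy.pad call used in image_to_column, the sketch below pads a batch of all-ones images. The pad_h and pad_w values are hard-coded to what determine_padding returns for a 3×3 filter with "same" padding.

```python
import numpy as np

images = np.ones((1, 3, 32, 32))
pad_h, pad_w = (1, 1), (1, 1)   # what determine_padding returns for a 3x3 filter

# One (before, after) pair per axis: batch and channel axes get no padding.
images_padded = np.pad(images, ((0, 0), (0, 0), pad_h, pad_w), mode='constant')

print(images_padded.shape)          # (1, 3, 34, 34)
print(images_padded[0, 0, 0, :3])   # first row is all padding: [0. 0. 0.]
print(images_padded[0, 0, 1, :3])   # second row starts with one padded zero: [0. 1. 1.]
```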

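Shapes alone still don't make this easy to understand, so let's continue. What needs to be clear is that k indexes the channel, i indexes the height (row), and j indexes the width (column). A 3×3 filter covers 3×3=9 pixels in one channel at each position; with 3 channels, each position touches 9×3=27 pixels. With "same" padding and stride 1 the output is 32×32, so the filter is applied at 1024 positions. Now look at these lines again: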

cols = images_padded[:, k, i, j]
channels = images.shape[1]
# Reshape content into column shape
cols = cols.transpose(1, 2, 0).reshape(filter_height * filter_width * channels, -1)

images_padded has shape (1,3,34,34), so cols = images_padded[:, k, i, j] has shape (1,27,1024).

channels is 3.

Finally, cols = cols.transpose(1,2,0).reshape(3*3*3,-1) has shape (27,1024).

When the batch size is not 1, say 64, the final cols has shape (27,1024×64)=(27,65536).
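The shape bookkeeping can be checked directly. The sketch below rebuilds k, i and j inline for stride 1 (the same construction as get_im2col_indices) and uses a batch of 2 zero images, an arbitrary choice to keep it small, to show how the index arrays broadcast.

```python
import numpy as np

batch_size, channels, height, width = 2, 3, 32, 32
fh = fw = 3
# Stand-in for the padded input: 'same' padding adds one pixel on each side.
images_padded = np.zeros((batch_size, channels, height + 2, width + 2))

# Same index construction as get_im2col_indices with stride 1
i0 = np.tile(np.repeat(np.arange(fh), fw), channels)
j0 = np.tile(np.arange(fw), fh * channels)
i1 = np.repeat(np.arange(height), width)
j1 = np.tile(np.arange(width), height)
i = i0.reshape(-1, 1) + i1.reshape(1, -1)                    # (27, 1024)
j = j0.reshape(-1, 1) + j1.reshape(1, -1)                    # (27, 1024)
k = np.repeat(np.arange(channels), fh * fw).reshape(-1, 1)   # (27, 1)

# k broadcasts against i and j to (27, 1024); the batch slice stays in front.
patches = images_padded[:, k, i, j]
print(patches.shape)                                         # (2, 27, 1024)

cols = patches.transpose(1, 2, 0).reshape(fh * fw * channels, -1)
print(cols.shape)                                            # (27, 2048)
```

Because the batch axis is moved last before the reshape, the columns for one output position sit next to each other across the batch. That is why forward_pass later reshapes its result with batch_size as the trailing dimension.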

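Finally, the convolutional layer itself.

There is a generic Layer base class. Different layers, such as convolutional, pooling and batch normalization layers, are implemented by inheriting from it: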

class Layer(object):

    def set_input_shape(self, shape):
        """ Sets the shape that the layer expects of the input in the forward
        pass method """
        self.input_shape = shape

    def layer_name(self):
        """ The name of the layer. Used in model summary. """
        return self.__class__.__name__

    def parameters(self):
        """ The number of trainable parameters used by the layer """
        return 0

    def forward_pass(self, X, training):
        """ Propagates the signal forward in the network """
        raise NotImplementedError()

    def backward_pass(self, accum_grad):
        """ Propagates the accumulated gradient backwards in the network.
        If the layer has trainable weights then these weights are also tuned in this method.
        As input (accum_grad) it receives the gradient with respect to the output of the layer and
        returns the gradient with respect to the output of the previous layer. """
        raise NotImplementedError()

    def output_shape(self):
        """ The shape of the output produced by forward_pass """
        raise NotImplementedError()

Methods that a subclass must implement raise NotImplementedError() in the base class, so calling one that has not been overridden fails with an exception.
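A minimal sketch of this pattern follows. It uses a trimmed copy of the base class and a hypothetical Identity layer that is not part of the source repository and exists only to illustrate the mechanism.

```python
class Layer(object):
    """Trimmed copy of the base class, just enough for this sketch."""

    def parameters(self):
        return 0

    def forward_pass(self, X, training):
        raise NotImplementedError()


class Identity(Layer):
    """Hypothetical layer that passes its input through unchanged."""

    def forward_pass(self, X, training=True):
        return X


print(Identity().forward_pass([1, 2, 3]))   # [1, 2, 3]
print(Identity().parameters())              # 0, the inherited default

try:
    Layer().forward_pass([1, 2, 3], training=True)
except NotImplementedError:
    print("forward_pass must be overridden")
```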

With this base class in place, Conv2D can be implemented:

import copy
import math
import numpy as np

class Conv2D(Layer):
    """A 2D Convolution Layer.
    Parameters:
    -----------
    n_filters: int
        The number of filters that will convolve over the input matrix. The number of channels
        of the output shape.
    filter_shape: tuple
        A tuple (filter_height, filter_width).
    input_shape: tuple
        The shape of the expected input of the layer, without the batch
        dimension: (channels, height, width).
        Only needs to be specified for first layer in the network.
    padding: string
        Either 'same' or 'valid'. 'same' results in padding being added so that the output height and width
        matches the input height and width. For 'valid' no padding is added.
    stride: int
        The stride length of the filters during the convolution over the input.
    """
    def __init__(self, n_filters, filter_shape, input_shape=None, padding='same', stride=1):
        self.n_filters = n_filters
        self.filter_shape = filter_shape
        self.padding = padding
        self.stride = stride
        self.input_shape = input_shape
        self.trainable = True

    def initialize(self, optimizer):
        # Initialize the weights
        filter_height, filter_width = self.filter_shape
        channels = self.input_shape[0]
        limit = 1 / math.sqrt(np.prod(self.filter_shape))
        self.W = np.random.uniform(-limit, limit, size=(self.n_filters, channels, filter_height, filter_width))
        self.w0 = np.zeros((self.n_filters, 1))
        # Weight optimizers
        self.W_opt = copy.copy(optimizer)
        self.w0_opt = copy.copy(optimizer)

    def parameters(self):
        return np.prod(self.W.shape) + np.prod(self.w0.shape)

    def forward_pass(self, X, training=True):
        batch_size, channels, height, width = X.shape
        self.layer_input = X
        # Turn image shape into column shape
        # (enables dot product between input and weights)
        self.X_col = image_to_column(X, self.filter_shape, stride=self.stride, output_shape=self.padding)
        # Turn weights into column shape
        self.W_col = self.W.reshape((self.n_filters, -1))
        # Calculate output
        output = self.W_col.dot(self.X_col) + self.w0
        # Reshape into (n_filters, out_height, out_width, batch_size)
        output = output.reshape(self.output_shape() + (batch_size, ))
        # Redistribute axes so that batch size comes first
        return output.transpose(3,0,1,2)

    def backward_pass(self, accum_grad):
        # Reshape accumulated gradient into column shape
        accum_grad = accum_grad.transpose(1, 2, 3, 0).reshape(self.n_filters, -1)

        if self.trainable:
            # Take dot product between column shaped accum. gradient and column shape
            # layer input to determine the gradient at the layer with respect to layer weights
            grad_w = accum_grad.dot(self.X_col.T).reshape(self.W.shape)
            # The gradient with respect to bias terms is the sum similarly to in Dense layer
            grad_w0 = np.sum(accum_grad, axis=1, keepdims=True)

            # Update the layers weights
            self.W = self.W_opt.update(self.W, grad_w)
            self.w0 = self.w0_opt.update(self.w0, grad_w0)

        # Recalculate the gradient which will be propagated back to prev. layer
        accum_grad = self.W_col.T.dot(accum_grad)
        # Reshape from column shape to image shape
        # (column_to_image is not shown in this post)
        accum_grad = column_to_image(accum_grad,
                                     self.layer_input.shape,
                                     self.filter_shape,
                                     stride=self.stride,
                                     output_shape=self.padding)

        return accum_grad

    def output_shape(self):
        channels, height, width = self.input_shape
        pad_h, pad_w = determine_padding(self.filter_shape, output_shape=self.padding)
        output_height = (height + np.sum(pad_h) - self.filter_shape[0]) / self.stride + 1
        output_width = (width + np.sum(pad_w) - self.filter_shape[1]) / self.stride + 1
        return self.n_filters, int(output_height), int(output_width)

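Suppose the input still has shape (1,3,32,32) and we convolve it with 16 filters of size 3×3. Then self.W has shape (16,3,3,3) and self.w0 has shape (16,1).

self.X_col has shape (27,1024) and self.W_col has shape (16,27), so output = self.W_col.dot(self.X_col) + self.w0 has shape (16,1024).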
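To make it concrete that this matrix product is the convolution, here is a small sanity check. It is a sketch under simplifying assumptions ('valid' padding, stride 1, small random inputs) with a compact restatement of image_to_column named im2col_valid (a hypothetical name), compared against a direct sliding-window loop.

```python
import numpy as np

def im2col_valid(images, fh, fw):
    """Compact restatement of image_to_column for 'valid' padding, stride 1."""
    n, c, h, w = images.shape
    oh, ow = h - fh + 1, w - fw + 1
    i0 = np.tile(np.repeat(np.arange(fh), fw), c)
    j0 = np.tile(np.arange(fw), fh * c)
    i1 = np.repeat(np.arange(oh), ow)
    j1 = np.tile(np.arange(ow), oh)
    i = i0.reshape(-1, 1) + i1.reshape(1, -1)
    j = j0.reshape(-1, 1) + j1.reshape(1, -1)
    k = np.repeat(np.arange(c), fh * fw).reshape(-1, 1)
    cols = images[:, k, i, j]
    return cols.transpose(1, 2, 0).reshape(fh * fw * c, -1)

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 3, 5, 5))    # (batch, channels, height, width)
W = rng.normal(size=(4, 3, 3, 3))    # (n_filters, channels, fh, fw)
w0 = rng.normal(size=(4, 1))         # one bias per filter

# Matrix-product route, following the same steps as forward_pass
out = W.reshape(4, -1).dot(im2col_valid(X, 3, 3)) + w0
out = out.reshape(4, 3, 3, 2).transpose(3, 0, 1, 2)   # (batch, n_filters, 3, 3)

# Direct route: slide every filter over every image
ref = np.zeros((2, 4, 3, 3))
for b in range(2):
    for f in range(4):
        for y in range(3):
            for x in range(3):
                patch = X[b, :, y:y + 3, x:x + 3]
                ref[b, f, y, x] = np.sum(patch * W[f]) + w0[f, 0]

print(np.allclose(out, ref))   # True
```

Each row of W_col is one flattened filter and each column of X_col is one flattened input patch in the same (channel, row, column) order, so every entry of the product is exactly one filter response.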

Finally, this is how it is used:

image = np.random.randint(0,255,size=(1,3,32,32)).astype(np.uint8)
input_shape=image.squeeze().shape
conv2d = Conv2D(16, (3,3), input_shape=input_shape, padding='same', stride=1)
conv2d.initialize(None)
output=conv2d.forward_pass(image,training=True)
print(output.shape)

Output: (1,16,32,32)

Counting the parameters:

print(conv2d.parameters())

Output: 448

That is, 448=3×3×3×16+16

Next, one with padding='valid':

image = np.random.randint(0,255,size=(1,3,32,32)).astype(np.uint8)
input_shape=image.squeeze().shape
conv2d = Conv2D(16, (3,3), input_shape=input_shape, padding='valid', stride=1)
conv2d.initialize(None)
output=conv2d.forward_pass(image,training=True)
print(output.shape)
print(conv2d.parameters())

Note that the shape of cols changes, because the output of the convolution is now (1,16,30,30).

Output:

Shape of cols: (27,900)

(1,16,30,30)

448

Finally, one with a stride:

image = np.random.randint(0,255,size=(1,3,32,32)).astype(np.uint8)
input_shape=image.squeeze().shape
conv2d = Conv2D(16, (3,3), input_shape=input_shape, padding='valid', stride=2)
conv2d.initialize(None)
output=conv2d.forward_pass(image,training=True)
print(output.shape)
print(conv2d.parameters())

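Output:

Shape of cols: (27,225)

(1,16,15,15)

448

A final supplement:

Parameter count of a convolutional layer: params = filter height × filter width × number of input channels × number of filters + biases (one per filter)

Output size after convolution:

output height = (input height + padding (height) × 2 - filter height) / stride + 1

output width = (input width + padding (width) × 2 - filter width) / stride + 1

Here padding × 2 assumes the same padding on both sides. The code adds the two sides separately (np.sum(pad_h), np.sum(pad_w)) and truncates the result to an integer when the division is not exact, which is how a 32×32 input with a 3×3 filter and stride 2 gives 15 rather than 15.5.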
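Both formulas are easy to check against the three runs above. The helper names below are hypothetical and not part of the source repository; the output-size helper takes the two padding amounts separately and uses integer division to mirror the truncation in the code.

```python
def conv_output_size(size, pad_before, pad_after, filter_size, stride):
    """Hypothetical helper: output length along one axis, rounded down."""
    return (size + pad_before + pad_after - filter_size) // stride + 1

def conv2d_param_count(n_filters, channels, filter_height, filter_width):
    """Hypothetical helper: weights plus one bias per filter."""
    return n_filters * channels * filter_height * filter_width + n_filters

# The three configurations used above, for a 32x32 input and a 3x3 filter
print(conv_output_size(32, 1, 1, 3, 1))   # 'same',  stride 1 -> 32
print(conv_output_size(32, 0, 0, 3, 1))   # 'valid', stride 1 -> 30
print(conv_output_size(32, 0, 0, 3, 2))   # 'valid', stride 2 -> 15
print(conv2d_param_count(16, 3, 3, 3))    # 448
```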

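The transformations inside get_im2col_indices() are now clear; why the indices are arranged this way still deserves careful thought. Backpropagation and the optimizer will be covered in a later update once I have studied them.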
