MobileNet V1
Reference blog: https://cuijiahua.com/blog/2018/02/dl_6.html
1. Depthwise Separable Convolution
A standard convolution both filters and combines inputs into a new set of outputs in one step. The depthwise separable convolution splits this into two layers, a separate layer for filtering and a separate layer for combining.
The computation cost of applying one convolution kernel to the input (with padding) is:
\(D_K \cdot D_K \cdot M \cdot D_F \cdot D_F\)
where M is the number of input channels,
\(D_K\) is the width and height of the kernel, and
\(D_F\) is the width and height of the input feature map.
If a layer uses N such kernels, the total computation of that convolutional layer is:
\(D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F\)
With depthwise convolutional filters, the standard convolution kernels:
[Figure omitted: \(N\) standard kernels of shape \(D_K \times D_K \times M\)]
are factored into a depthwise stage and a pointwise stage:
[Figure omitted: \(M\) depthwise kernels of shape \(D_K \times D_K \times 1\), followed by \(N\) pointwise kernels of shape \(1 \times 1 \times M\)]
The depthwise stage, a set of 2D kernels matching the number of input channels, costs:
\(D_K \cdot D_K \cdot M \cdot D_F \cdot D_F\)
The pointwise stage of N 3D 1x1 kernels costs:
\(N \cdot M \cdot D_F \cdot D_F\)
So the cost of the combined scheme is:
\(D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + N \cdot M \cdot D_F \cdot D_F\)
Compared with standard convolution, the cost ratio of the depthwise separable convolution is:
\(\frac{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + N \cdot M \cdot D_F \cdot D_F}{D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F} = \frac{1}{N} + \frac{1}{D_K^2}\)
As a concrete example, given a 3-channel 224x224 input image, the third convolution layer of VGG16, conv2_1, receives a 112x112 feature map with 64 channels and applies 128 kernels of size 3. The traditional convolution cost is:
\(3 \times 3 \times 64 \times 128 \times 112 \times 112 \approx 9.25 \times 10^8\)
Replacing this traditional 3D convolution with the depthwise-plus-1x1 scheme, the cost becomes:
\(3 \times 3 \times 64 \times 112 \times 112 + 64 \times 128 \times 112 \times 112 \approx 1.10 \times 10^8\)
For this layer, the computation ratio is therefore:
\(\frac{1}{N} + \frac{1}{D_K^2} = \frac{1}{128} + \frac{1}{9} \approx 0.119\)
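These numbers can be reproduced with a few lines of Python (variable names are ours, purely illustrative):

k, m, n, f = 3, 64, 128, 112                    # kernel size, in-channels, out-channels, feature-map size
standard = k * k * m * n * f * f                # 924,844,032 multiply-adds
separable = k * k * m * f * f + m * n * f * f   # 109,985,792 multiply-adds
print(separable / standard)                     # ~0.119, i.e. 1/128 + 1/9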
2. Network Architecture
The usual arrangement of a traditional 3D convolution block and the depthwise block are compared side by side in the paper (figure omitted): the left stacks 3x3 Conv → BN → ReLU, the right stacks 3x3 Depthwise Conv → BN → ReLU followed by 1x1 Conv → BN → ReLU.
- The depthwise convolution and the subsequent 1x1 convolution are treated as two independent modules; each adds Batch Normalization and a nonlinear activation to its output.
Replacing traditional convolution with the depthwise-plus-1x1 scheme is not only more efficient in theory; because 1x1 convolutions dominate, highly optimized math libraries can execute them directly. Taking Caffe as an example, using such libraries normally requires first rearranging the data with im2col so that it matches the library's expected input layout, but a 1x1 convolution needs no such preprocessing.
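To illustrate the im2col point, here is a minimal numpy sketch (our own toy version, not Caffe's implementation): im2col unrolls every k x k patch into a column so the convolution becomes one big matrix multiply, while a 1x1 convolution is already a plain GEMM over the channel axis:

import numpy as np

def im2col(x, k):
    """Unroll every k x k patch of x (H, W, C) into a column; stride 1, no padding."""
    h, w, c = x.shape
    cols = [x[i:i + k, j:j + k, :].reshape(-1)
            for i in range(h - k + 1) for j in range(w - k + 1)]
    return np.stack(cols, axis=1)            # shape: (k*k*C, out_h*out_w)

x = np.random.rand(5, 5, 4)
w3 = np.random.rand(8, 3 * 3 * 4)            # 8 standard 3x3 kernels
y3 = w3 @ im2col(x, 3)                       # 3x3 conv needs the rearrangement first
w1 = np.random.rand(8, 4)                    # 8 pointwise 1x1 kernels
y1 = w1 @ x.reshape(-1, 4).T                 # 1x1 conv: no im2col needed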
In MobileNet, about 95% of the computation and 75% of the parameters belong to 1x1 convolutions (the per-layer-type breakdown table from the paper is omitted here; a quick verification script appears at the end of this post).
3. Width Multiplier and Resolution Multiplier
Width multiplier
The width multiplier α is a number in (0, 1] applied to the network's channel counts: each module in the new network uses α times as many kernels as the standard MobileNet. For the depthwise-plus-1x1 convolution, the cost is then:
\(D_K \cdot D_K \cdot \alpha M \cdot D_F \cdot D_F + \alpha N \cdot \alpha M \cdot D_F \cdot D_F\)
Resolution multiplier
The resolution multiplier β also takes values in (0, 1] and scales the input size of each module: the input data, and hence the feature maps produced by every module, shrink accordingly. Combined with the width multiplier α, the depthwise-plus-1x1 cost becomes:
\(D_K \cdot D_K \cdot \alpha M \cdot \beta D_F \cdot \beta D_F + \alpha N \cdot \alpha M \cdot \beta D_F \cdot \beta D_F\)
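A quick sanity check in Python (the helper name separable_cost is ours): with α = β = 0.5 the cost falls close to α²β² of the original, since the pointwise term dominates:

def separable_cost(d_k, m, n, d_f, alpha=1.0, beta=1.0):
    m, n, d_f = int(alpha * m), int(alpha * n), int(beta * d_f)
    return d_k * d_k * m * d_f * d_f + m * n * d_f * d_f

base = separable_cost(3, 64, 128, 112)
scaled = separable_cost(3, 64, 128, 112, alpha=0.5, beta=0.5)
print(scaled / base)  # ~0.067, close to 0.5**2 * 0.5**2 = 0.0625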
4. Code Implementation
The code below does not implement the resolution multiplier; instead it exposes an extra depth_multiplier parameter:
"""MobileNet v1 models for Keras.
MobileNet is a general architecture and can be used for multiple use cases.
Depending on the use case, it can use different input layer size and
different width factors. This allows different width models to reduce
the number of multiply-adds and thereby
reduce inference cost on mobile devices.
MobileNets support any input size greater than 32 x 32, with larger image sizes
offering better performance.
The number of parameters and number of multiply-adds
can be modified by using the `alpha` parameter,
which increases/decreases the number of filters in each layer.
By altering the image size and `alpha` parameter,
all 16 models from the paper can be built, with ImageNet weights provided.
The paper demonstrates the performance of MobileNets using `alpha` values of
1.0 (also called 100 % MobileNet), 0.75, 0.5 and 0.25.
For each of these `alpha` values, weights for 4 different input image sizes
are provided (224, 192, 160, 128).
The following table describes the size and accuracy of the 100% MobileNet
on size 224 x 224:
----------------------------------------------------------------------------
Width Multiplier (alpha) | ImageNet Acc |  Multiply-Adds (M) |  Params (M)
----------------------------------------------------------------------------
|   1.0 MobileNet-224    |    70.6 %     |        529        |     4.2     |
|   0.75 MobileNet-224   |    68.4 %     |        325        |     2.6     |
|   0.50 MobileNet-224   |    63.7 %     |        149        |     1.3     |
|   0.25 MobileNet-224   |    50.6 %     |        41         |     0.5     |
----------------------------------------------------------------------------
The following table describes the performance of
the 100 % MobileNet on various input sizes:
------------------------------------------------------------------------
      Resolution      | ImageNet Acc | Multiply-Adds (M) | Params (M)
------------------------------------------------------------------------
|  1.0 MobileNet-224  |    70.6 %    |        529        |     4.2     |
|  1.0 MobileNet-192  |    69.1 %    |        529        |     4.2     |
|  1.0 MobileNet-160  |    67.2 %    |        529        |     4.2     |
|  1.0 MobileNet-128  |    64.4 %    |        529        |     4.2     |
------------------------------------------------------------------------
The weights for all 16 models are obtained and translated
from Tensorflow checkpoints found at
https://github.com/tensorflow/models/blob/master/slim/nets/mobilenet_v1.md
# Reference
- [MobileNets: Efficient Convolutional Neural Networks for
   Mobile Vision Applications](https://arxiv.org/pdf/1704.04861.pdf)
"""
from keras.models import Model
from keras.layers import Input, Activation, Dropout, Reshape, BatchNormalization, GlobalAveragePooling2D
from keras.layers import Conv2D, DepthwiseConv2D
from keras.utils import plot_model
from keras import backend as K
def relu6(x):
    return K.relu(x, max_value=6)
def _make_divisible(v, divisor=8, min_value=8):
    """Round v to the nearest multiple of divisor, keeping at least min_value
    and never dropping more than 10% below v."""
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor/2) // divisor * divisor)
    # Make sure that round down does not go down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v
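# Illustrative values (ours):
#   _make_divisible(32 * 0.75) -> 24
#   _make_divisible(64 * 0.25) -> 16
#   _make_divisible(10)        -> 16  (8 would drop more than 10% below 10, so the guard rounds up)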
def _conv_block(inputs, filters, alpha, kernel=(3, 3), strides=(1, 1), bn_epsilon=1e-3,
                bn_momentum=0.99, block_id=1):
    """ Adds an initial convolution layer (with batch normalization and relu6).
    Args:
        inputs: Input tensor of shape `(rows, cols, 3)` (with `channels_last` data format)
                or (3, rows, cols) (with `channels_first` data format).
                It should have exactly 3 inputs channels, and width and height should be no smaller than 32.
                E.g. `(224, 224, 3)` would be one valid value.
        filters: Integer, the dimensionality of the output space.
                (i.e. the number output of filters in the convolution).
        alpha: controls the width of the network.
                - If `alpha` < 1.0, proportionally decreases the number of filters in each layer.
                - If `alpha` > 1.0, proportionally increases the number of filters in each layer.
                - If `alpha` = 1, default number of filters from the paper are used at each layer.
        kernel: An integer or tuple/list of 2 integers, specifying the width and height of the 2D convolution window.
                Can be a single integer to specify the same value for all spatial dimensions.
        strides: An integer or tuple/list of 2 integers, specifying the strides of the convolution along the width and height.
                 Can be a single integer to specify the same value for all spatial dimensions.
                 Specifying any stride value != 1 is incompatible with specifying any `dilation_rate` value != 1.
        bn_epsilon: Epsilon value for BatchNormalization
        bn_momentum: Momentum value for BatchNormalization
        block_id: Integer, a unique identification designating the block number.
    Returns:
        Output tensor of block
    Input shape:
        4D tensor with shape: `(samples, channels, rows, cols)` if data_format='channels_first'
                           or `(samples, rows, cols, channels)` if data_format='channels_last'.
    Output shape:
        4D tensor with shape: `(samples, filters, new_rows, new_cols)` if data_format='channels_first'
                           or  `(samples, new_rows, new_cols, filters)` if data_format='channels_last'.
                          `rows` and `cols` values might have changed due to stride.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    filters = _make_divisible(filters * alpha)  # kernel count scaled by the width multiplier, rounded to a multiple of divisor=8
    x = Conv2D(filters, kernel, padding='same', use_bias=False, strides=strides, name='conv{}'.format(block_id))(inputs)  # 'same' padding keeps the paper's 112x112 first feature map
    x = BatchNormalization(axis=channel_axis, momentum=bn_momentum, epsilon=bn_epsilon, name='conv{}_bn'.format(block_id))(x)
    return Activation(relu6, name='conv{}_relu'.format(block_id))(x)
def _depthwise_conv_block(inputs, pointwise_conv_filters, alpha, depth_multiplier=1,
                          strides=(1, 1), bn_epsilon=1e-3, block_id=1):
    """Adds a depthwise convolution block.
    A depthwise convolution block consists of
    a depthwise conv, batch normalization, relu6,
    pointwise convolution, batch normalization and relu6
    Args:
        inputs: Input tensor of shape `(rows, cols, channels)`(with `channels_last` data format)
                or (channels, rows, cols)(with `channels_first` data format)
        pointwise_conv_filters: Integer, the dimensionality of the output space
                                (i.e. the number output of filters in the pointwise convolution).
        alpha: controls the width of the network.
            - If `alpha` < 1.0, proportionally decreases the number of filters in each layer.
            - If `alpha` > 1.0, proportionally increases the number of filters in each layer.
            - If `alpha` = 1, default number of filters from the paper are used at each layer.
        depth_multiplier: The number of depthwise convolution output channels for each input channel.
                        The total number of depthwise convolution output channels
                        will be equal to `filters_in * depth_multiplier`.
        strides:  An integer or tuple/list of 2 integers,
                specifying the strides of the convolution along the width and height.
                Can be a single integer to specify the same value for all spatial dimensions.
                Specifying any stride value != 1 is incompatible with specifying any `dilation_rate` value != 1.
        bn_epsilon: Epsilon value for BatchNormalization
        block_id: Integer, a unique identification designating the block number.
    Returns:
        Output tensor of block
    Input shape:
         4D tensor with shape: `(batch, channels, rows, cols)` if data_format='channels_first'
                                or `(batch, rows, cols, channels)` if data_format='channels_last'.
    Output shape:
        4D tensor with shape: `(batch, filters, new_rows, new_cols)` if data_format='channels_first'
                                or `(batch, new_rows, new_cols, filters)` if data_format='channels_last'.
         `rows` and `cols` values might have changed due to stride.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    pointwise_conv_filters = _make_divisible(pointwise_conv_filters * alpha)
    # Depthwise Conv2D: each input channel is convolved separately with depth_multiplier
    # kernels, so the effective kernel shape is 3 x 3 x input_channels x depth_multiplier
    # and the output tensor shape is (batch, rows, cols, input_channels * depth_multiplier).
    x = DepthwiseConv2D(kernel_size=(3, 3),
                        padding='same',
                        depth_multiplier=depth_multiplier,
                        strides=strides,
                        use_bias=False,
                        name='conv_dw_{}'.format(block_id))(inputs)
    x = BatchNormalization(axis=channel_axis, epsilon=bn_epsilon, name='conv_dw_{}_bn'.format(block_id))(x)
    x = Activation(relu6, name='conv_dw_{}_relu'.format(block_id))(x)
    # Pointwise Conv2D: pointwise_conv_filters controls the final number of output channels
    x = Conv2D(pointwise_conv_filters,
               kernel_size=(1, 1),
               padding='same',
               use_bias=False,
               strides=(1, 1),
               name='conv_pw_{}'.format(block_id))(x)
    x = BatchNormalization(axis=channel_axis, epsilon=bn_epsilon, name='conv_pw_{}_bn'.format(block_id))(x)
    return Activation(relu6, name='conv_pw_{}_relu'.format(block_id))(x)
def mobilenetv1(input_shape,
                alpha=1.0,
                depth_multiplier=1,
                dropout=1e-3,
                classes=1000):
    """Instantiates the MobileNet architecture.
    Args:
        input_shape: optional shape tuple, only to be specified if `include_top` is False.
                    (otherwise the input shape has to be `(224, 224, 3)` (with `channels_last` data format)
                    or (3, 224, 224) (with `channels_first` data format).
                    It should have exactly 3 inputs channels, and width and height should be no smaller than 32.
                    E.g. `(200, 200, 3)` would be one valid value.
        alpha: controls the width of the network.
                - If `alpha` < 1.0, proportionally decreases the number of filters in each layer.
                - If `alpha` > 1.0, proportionally increases the number of filters in each layer.
                - If `alpha` = 1, default number of filters from the paper are used at each layer.
        depth_multiplier: depth multiplier for depthwise convolution
        dropout: dropout rate
        classes: optional number of classes to classify images into
    Returns:
        A Keras model instance.
    Raises:
        ValueError: in case of invalid argument for `weights`, or invalid input shape.
        RuntimeError: If attempting to run this model with a backend that does not support separable convolutions.
    """
    x_input = Input(shape=input_shape)
    x = _conv_block(x_input, 32, alpha, strides=(2, 2))
    x = _depthwise_conv_block(x, 64, alpha, depth_multiplier,
                              block_id=1)
    x = _depthwise_conv_block(x, 128, alpha, depth_multiplier,
                              strides=(2, 2), block_id=2)
    x = _depthwise_conv_block(x, 128, alpha, depth_multiplier,
                              block_id=3)
    x = _depthwise_conv_block(x, 256, alpha, depth_multiplier,
                              strides=(2, 2), block_id=4)
    x = _depthwise_conv_block(x, 256, alpha, depth_multiplier,
                              block_id=5)
    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                              strides=(2, 2), block_id=6)
    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                              block_id=7)
    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                              block_id=8)
    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                              block_id=9)
    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                              block_id=10)
    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                              block_id=11)
    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                              strides=(2, 2), block_id=12)
    x = _depthwise_conv_block(x, 1024, alpha, depth_multiplier, block_id=13)
    shape = (1, 1, _make_divisible(1024 * alpha))  # channel count of the last block, matching the width-multiplier rounding
    x = GlobalAveragePooling2D()(x)
    x = Reshape(shape, name='reshape_1')(x)
    x = Dropout(dropout, name='dropout')(x)
    x = Conv2D(classes, (1, 1), padding='same', name='conv_preds')(x)
    x = Activation('softmax', name='act_softmax')(x)
    x = Reshape((classes,), name='reshape_2')(x)
    return Model(x_input, x)
if __name__ == '__main__':
    alpha = 1
    depth_multiplier = 1
    mobilenet = mobilenetv1(input_shape=(224, 224, 3), alpha=alpha, depth_multiplier=depth_multiplier)
    mobilenet.summary()
    plot_model(mobilenet, show_shapes=True, to_file='mobilenet_alpha{}_depth_multiplier_{}.png'.format(alpha, depth_multiplier))
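To sanity-check the earlier claim that roughly 75% of the parameters live in 1x1 convolutions, the pointwise share can be tallied in the same session (a rough check of ours: it also counts the final 1x1 classifier layer, and the kernel_size filter excludes the depthwise layers):

pw_params = sum(layer.count_params() for layer in mobilenet.layers
                if isinstance(layer, Conv2D) and layer.kernel_size == (1, 1))
print('1x1 conv parameter share: {:.1%}'.format(pw_params / mobilenet.count_params()))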