https://blog.csdn.net/pandav5/article/details/53993684

(1)Mxnet的数据格式为NDArray,当需要读取可观看的数据,就要调用:

numpy_d = d.asnumpy()
converts it to a Numpy array.

(2)list_arguments (给出当前符号d的输入变量)与list_outputs(给出符号d的输出变量)的说明

import mxnet as mx
a = mx.sym.Variable("A") # represent a placeholder. These can be inputs, weights, or anything else.
b = mx.sym.Variable("B")
c = (a + b) / 10
d = c + 1
调用list_arguments 得到的一定就是用于d计算的所有symbol
d.list_arguments()
# ['A', 'B']
调用list_outputs()得到的就是输出的名字:

d.list_outputs()
# ['_plusscalar0_output'] This is the default name from adding to scalar.,
上面在查看名称,下面教你如何查看各个层的大小

# define input shapes
inp_shapes = {'A':(10,), 'B':(10,)}
arg_shapes, out_shapes, aux_shapes = d.infer_shape(**inp_shapes)

arg_shapes # the shapes of all the inputs to the graph. Order matches d.list_arguments()
# [(10, ), (10, )]

out_shapes # the shapes of all outputs. Order matches d.list_outputs()
# [(10, )]

aux_shapes # the shapes of auxiliary variables. These are variables that are not trainable such as batch normalization population statistics. For now, they are save to ignore.
# []

关于Grad_req的使用,符号描述完后,需要bind,得到一个executor

在使用bing进行绑定,且不需要做反向递归时:

input_arguments = {}
input_arguments['A'] = mx.nd.ones((10, ), ctx=mx.cpu())
input_arguments['B'] = mx.nd.ones((10, ), ctx=mx.cpu())
executor = d.bind(ctx=mx.cpu(),
args=input_arguments, # this can be a list or a dictionary mapping names of inputs to NDArray
grad_req='null') # don't request gradients
args :指出输入的符号以及大小,以词典类型传入
grad_req : 设置为Null,说明不需要进行gradient计算

bind完之后,还需要调用一个forward(),就可以运算整个过程。当然,还可以通过executor,对输入的

变量再次进行相关的赋值。

import numpy as np
# The executor
executor.arg_dict
# {'A': NDArray, 'B': NDArray}

executor.arg_dict['A'][:] = np.random.rand(10,) # Note the [:]. This sets the contents of the array instead of setting the array to a new value instead of overwriting the variable.
executor.arg_dict['B'][:] = np.random.rand(10,)
executor.forward()
executor.outputs
# [NDArray]
output_value = executor.outputs[0].asnumpy()
executor.arg_dict['A']是NDArray类型,再使用executor.arg_dict['A'][:]=赋值,表示以numpy的值覆盖NDArray类型的值,类型依旧是NDArray;如果不加[:],表示以numpy值的array类型直接覆盖。但运算的结果却仍然是以mx.nd.ones(10,)得到的.

获取输出的结果:excutor.outputs[0].asnumpy()

本章最重要的一个环节出现了:与上面的例子的区别在于,添加了一个后向传播过程。那么就需要对grad_req = 'write' ,同时调用backforwad.

# allocate space for inputs
input_arguments = {}
input_arguments['A'] = mx.nd.ones((10, ), ctx=mx.cpu())
input_arguments['B'] = mx.nd.ones((10, ), ctx=mx.cpu())
# allocate space for gradients
grad_arguments = {}
grad_arguments['A'] = mx.nd.ones((10, ), ctx=mx.cpu())
grad_arguments['B'] = mx.nd.ones((10, ), ctx=mx.cpu())

executor = d.bind(ctx=mx.cpu(),
args=input_arguments, # this can be a list or a dictionary mapping names of inputs to NDArray
args_grad=grad_arguments, # this can be a list or a dictionary mapping names of inputs to NDArray
grad_req='write') # instead of null, tell the executor to write gradients. This replaces the contents of grad_arguments with the gradients computed.

executor.arg_dict['A'][:] = np.random.rand(10,)
executor.arg_dict['B'][:] = np.random.rand(10,)

executor.forward()
# in this particular example, the output symbol is not a scalar or loss symbol.
# Thus taking its gradient is not possible.
# What is commonly done instead is to feed in the gradient from a future computation.
# this is essentially how backpropagation works.
out_grad = mx.nd.ones((10,), ctx=mx.cpu())
executor.backward([out_grad]) # because the graph only has one output, only one output grad is needed.

executor.grad_arrays
# [NDarray, NDArray]
在调用Bind时,需要提前手动为gradient分配一个空间args_grad并且传入,同时grad_req 设置为 write。
再调用executor.forward()前向运行。

再调用excutor.backward()后向运行。输出的symbol既不是一个单量,也不是loss symbol。需要手动传入梯度。

与bind 相对的是 simple_bind,他有一个好处:不需要手动分配计算的梯度空间大小。

input_shapes = {'A': (10,), 'B': (10, )}
executor = d.simple_bind(ctx=mx.cpu(),
grad_req='write', # instead of null, tell the executor to write gradients
**input_shapes)
executor.arg_dict['A'][:] = np.random.rand(10,)
executor.arg_dict['B'][:] = np.random.rand(10,)

executor.forward()
out_grad = mx.nd.ones((10,), ctx=mx.cpu())
executor.backward([out_grad])
只需要为simple_bind 设定 输入的大小,它会自动推断梯度所需的空间大小。

一套清晰简单的网络流程就为你摆放在面前了:

import mxnet as mx
import numpy as np
# First, the symbol needs to be defined
data = mx.sym.Variable("data") # input features, mxnet commonly calls this 'data'
label = mx.sym.Variable("softmax_label")

# One can either manually specify all the inputs to ops (data, weight and bias)
w1 = mx.sym.Variable("weight1")
b1 = mx.sym.Variable("bias1")
l1 = mx.sym.FullyConnected(data=data, num_hidden=128, name="layer1", weight=w1, bias=b1)
a1 = mx.sym.Activation(data=l1, act_type="relu", name="act1")

# Or let MXNet automatically create the needed arguments to ops
l2 = mx.sym.FullyConnected(data=a1, num_hidden=10, name="layer2")

# Create some loss symbol
cost_classification = mx.sym.SoftmaxOutput(data=l2, label=label)

# Bind an executor of a given batch size to do forward pass and get gradients
batch_size = 128
input_shapes = {"data": (batch_size, 28*28), "softmax_label": (batch_size, )}
executor = cost_classification.simple_bind(ctx=mx.gpu(0),
grad_req='write',
**input_shapes)
此时executor是训练时用

# The above executor computes gradients. When evaluating test data we don't need this.
# We want this executor to share weights with the above one, so we will use bind
# (instead of simple_bind) and use the other executor's arguments.
executor_test = cost_classification.bind(ctx=mx.gpu(0),
grad_req='null',
args=executor.arg_arrays)
executor_test 是测试时用
# executor 里含有arg_dict表示每层的名称
:bias1,data,layer2_bias,layer2_weight...
#executor 里含有 arg_arrays对应每层的具体数(诀窍:带arrays的表示数值)

# initialize the weights
for r in executor.arg_arrays:
r[:] = np.random.randn(*r.shape)*0.02

# Using skdata to get mnist data. This is for portability. Can sub in any data loading you like.
from skdata.mnist.views import OfficialVectorClassification

data = OfficialVectorClassification()
trIdx = data.sel_idxs[:]
teIdx = data.val_idxs[:]
for epoch in range(10):
print "Starting epoch", epoch
np.random.shuffle(trIdx)
#每128个样本,作为一个batchsize
for x in range(0, len(trIdx), batch_size):
# extract a batch from mnist
batchX = data.all_vectors[trIdx[x:x+batch_size]]
batchY = data.all_labels[trIdx[x:x+batch_size]]

# our executor was bound to 128 size. Throw out non matching batches.
if batchX.shape[0] != batch_size:
continue
# Store batch in executor 'data'
#通过executor的 arg_dict 给予“名称”,就能获取该层的数值信息,例如设置'data',也就是赋予
#输入数据信息。一定要加上[:] ,表示overwritting
executor.arg_dict['data'][:] = batchX / 255.
# Store label's in 'softmax_label'
executor.arg_dict['softmax_label'][:] = batchY
executor.forward()
executor.backward()

#进行一次forward以及一次backward之后,需要对权值进行一次更新。
#pname表示
# do weight updates in imperative
for pname, W, G in zip(cost_classification.list_arguments(), executor.arg_arrays, executor.grad_arrays):
# Don't update inputs
# MXNet makes no distinction between weights and data.
if pname in ['data', 'softmax_label']:
continue
# what ever fancy update to modify the parameters
W[:] = W - G * .001

# Evaluation at each epoch
num_correct = 0
num_total = 0
for x in range(0, len(teIdx), batch_size):
batchX = data.all_vectors[teIdx[x:x+batch_size]]
batchY = data.all_labels[teIdx[x:x+batch_size]]
if batchX.shape[0] != batch_size:
continue
# use the test executor as we don't care about gradients
executor_test.arg_dict['data'][:] = batchX / 255.
executor_test.forward()
num_correct += sum(batchY == np.argmax(executor_test.outputs[0].asnumpy(), axis=1))
num_total += len(batchY)
print "Accuracy thus far", num_correct / float(num_total)
---------------------
作者:不良CV研究生
来源:CSDN
原文:https://blog.csdn.net/pandav5/article/details/53993684
版权声明:本文为博主原创文章,转载请附上博文链接!

固定权重 关于Mxnet的一些基础知识理解(1)的更多相关文章

  1. python基础知识理解

    一.概述 看了一天的python基础语法,基本对python语法有了一个大概的了解(其实之前断断续续也看过python),学习网址:Python 基础教程.因为之前我学过C++,因此在学习python ...

  2. nacos基础知识理解

    概念 Nacos是阿里巴巴开源的一款支持服务注册与发现,配置管理以及微服务管理的组件.用来取代以前常用的注册中心(zookeeper , eureka等等),以及配置中心(spring cloud c ...

  3. rman基础知识理解(一)

    rman用于对数据库的备份和恢复. 他的命令主要分成两大类:独立命令和批处理命令: 独立命令只能在rman的提示符下执行,主要的命令有: CONNECT CONFIGURE CREATE CATALO ...

  4. [C# 基础知识系列]专题一:深入解析委托——C#中为什么要引入委托

    转自http://www.cnblogs.com/zhili/archive/2012/10/22/Delegate.html 引言: 对于一些刚接触C# 不久的朋友可能会对C#中一些基本特性理解的不 ...

  5. Windows核心编程 第六章 线程基础知识 (上)

    第6章 线程的基础知识 理解线程是非常关键的,因为每个进程至少需要一个线程.本章将更加详细地介绍线程的知识.尤其是要讲述进程与线程之间存在多大的差别,它们各自具有什么作用.还要介绍系统如何使用线程内核 ...

  6. css+js+html基础知识总结

    css+js+html基础知识总结 一.CSS相关 1.css的盒子模型:IE盒子模型.标准W3C盒子模型: 2.CSS优先级机制: 选择器的优先权:!important>style(内联样式) ...

  7. 〖前端开发〗HTML/CSS基础知识学习笔记

    经过一天的学习,把慕课网的HTML/CSS基础知识学完了,笔记整理: 1. 文件结构: HTML文件的固定结构: <html> <head>...</head> & ...

  8. elasticsearch基础知识杂记

    日常工作中用到的ES相关基础知识和总结.不足之处请指正,会持续更新. 1.集群的健康状况为 yellow 则表示全部主分片都正常运行(集群可以正常服务所有请求),但是 副本 分片没有全部处在正常状态. ...

  9. 前端基础知识之html和css全解

    前端回顾 目录 前端回顾 基础知识 HTTP协议 认识HTML HTML组成 HTML标签 div和span标签 特殊的属性 常用标签 认识css 选择器 属性 前端就是展示给用户并且与用户进行交互的 ...

随机推荐

  1. php 执行大量sql语句 MySQL server has gone away

    php 设置超时时间单位秒 set_time_limit(3600);   php 设置内存限制ini_set('memory_limit', '1024M');   mysql服务端接收到的包的大小 ...

  2. 190919 centos系统中python2卸载重装

    问题:某些原因卸载了python2,连带卸载了yum工具. 解决思路: 如果服务器没有什么东西,重装系统最省事.但是如果不允许重装,那就只能按部就班的恢复python2和yum. 步骤: 删除pyth ...

  3. mysql 压力测试工具sysbench

    2.1 只读示例 ./bin/sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=1 ...

  4. Linux 中 /proc/kcore为啥如此之大

    What Is /proc/kcore?None of the files in /proc are really there--they're all, "pretend," f ...

  5. opencv摄像头读取图片

    # 摄像头捕获图像或视频import numpy as np import cv2 # 创建相机的对象 cap = cv2.VideoCapture(0) while(True): # 读取相机所拍到 ...

  6. Celebrate it, this is my first time on this blog.

    After hovered around the technology edge many years(so ashamed), I want to collected all the points ...

  7. WPF系列——简单绑定学习

    1. 绑定到元素对象.(实际项目中用处不大) 界面上两个关联的控件之间绑定,比如一个TextBlock 的FontSize和一个Slider 的Value绑定: <Slider Name=&qu ...

  8. springboot 之 使用poi进行数据的导出(一)

    使用的是idea+restful风格 第一:引入依赖为: <!--poi--> <dependency> <groupId>org.apache.xmlbeans& ...

  9. memcached的缺点

    上篇博客说了为什么引入memcached,主要讲述了memcached的优点,接下来就是我们在使用中必须要注意的内容,memcached的缺点,只有正确认识它,才能运用自如,接下来先看一下memcac ...

  10. ORACLE对字符串去空格处理(trim)

    首先便是这Trim函数.Trim 函数具有删除任意指定字符的功能,而去除字符串首尾空格则是trim函数被使用频率最高的一种.语法Trim ( string ) ,参数string:string类型,指 ...