TensorFlow Study Notes: 002 - TensorFlow Basics

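Preface:

Before using TensorFlow, you need to understand the following basics:

Represent computations as graphs.
Execute graphs in the context of sessions.
Represent data as tensors.
Maintain state with variables.
Use feeds and fetches to pass data into or out of any operation.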

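Overview

TensorFlow is a programming system that represents computations as graphs. Nodes in the graph are called ops (short for operations). An op takes zero or more tensors, performs some computation, and produces zero or more tensors. A tensor is a typed multi-dimensional array. For example, you can represent a mini-batch of images as a 4-D array of floating point numbers with dimensions [batch, height, width, channels].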
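To make the [batch, height, width, channels] layout concrete, here is a minimal sketch that uses plain nested Python lists rather than TensorFlow tensors. The sizes and the variable names are assumptions chosen only for illustration.

```python
# Hypothetical sizes, chosen only for illustration.
BATCH, HEIGHT, WIDTH, CHANNELS = 2, 4, 3, 3

# Nested lists indexed as images[batch][row][column][channel], all zeros.
images = [
    [[[0.0] * CHANNELS for _ in range(WIDTH)] for _ in range(HEIGHT)]
    for _ in range(BATCH)
]

# Set channel 0 of the pixel at row 1, column 2 in the first image.
images[0][1][2][0] = 1.0

# Read the size of each dimension from the outermost level inwards.
dims = [len(images), len(images[0]), len(images[0][0]), len(images[0][0][0])]
print(dims)  # [2, 4, 3, 3]
```

Four indices are needed to reach a single number, which is what "4-D" means here.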

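A TensorFlow graph is an abstract description of computations. Before anything can be computed, the graph must be launched in a session (Session). The session places the graph's ops onto devices (Devices) such as CPUs or GPUs, and provides methods to execute them.

These methods return the tensors produced by the ops: as numpy ndarray objects in Python, and as tensorflow::Tensor instances in C and C++.

TensorFlow can be used from C, C++, and Python. At present the Python library is the easiest to use: it provides a large set of helper functions that simplify building graphs, which the C and C++ libraries do not yet support.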

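The computation graph

TensorFlow programs are usually structured into two phases: a construction phase, which assembles the graph, and an execution phase, which uses a session to execute the ops in the graph.

For example, it is common to create a graph that represents and trains a neural network in the construction phase, and then to execute a set of training ops in the graph repeatedly in the execution phase.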
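The split between the two phases can be illustrated without TensorFlow at all. The sketch below is a toy model, not TensorFlow's implementation: the Node class, the run() function, and the executed list are hypothetical names introduced here to show that wiring nodes together computes nothing until the graph is run.

```python
class Node:
    """A graph node: an op name, a Python function, and input nodes."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = list(inputs)


executed = []  # records the order in which ops actually run


def run(node):
    """Execution phase: evaluate the inputs first, then the node itself."""
    args = [run(n) for n in node.inputs]
    executed.append(node.name)
    return node.fn(*args)


# Construction phase: nothing is computed here, we only wire nodes together.
a = Node("a", lambda: 3.0)
b = Node("b", lambda: 2.0)
total = Node("add", lambda x, y: x + y, [a, b])
print(executed)  # []

# Execution phase: running 'total' triggers a, b, and add.
result = run(total)
print(result)    # 5.0
print(executed)  # ['a', 'b', 'add']
```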

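1. Building the graph

To build a graph, start with ops that do not need any input (source ops), such as a constant (Constant), and pass their output to other ops that do the computation.

The op constructors in the Python library return objects that stand for the output of the constructed ops. You can pass these objects to other op constructors to use as inputs.

The TensorFlow Python library has a default graph to which op constructors add nodes. The default graph is sufficient for most applications.

See the Graph class documentation for how to explicitly manage multiple graphs.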

import tensorflow as tf

# Note: these examples use the TensorFlow 1.x graph and session API (tf.Session).

# Create a Constant op that produces a 1x2 matrix.
# The op is added as a node to the default graph.
# The value returned by the constructor represents the output of the Constant op.
matrix1 = tf.constant([[3., 3.]])

# Create another Constant that produces a 2x1 matrix.
matrix2 = tf.constant([[2.], [2.]])

# Create a MatMul op that takes 'matrix1' and 'matrix2' as inputs.
# The returned value, 'product', represents the result of the matrix multiplication.
product = tf.matmul(matrix1, matrix2)

# The default graph now has three nodes: two constant() ops and one matmul() op.
# To actually multiply the matrices and get the result, you must launch
# the graph in a session.
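As a check on what the session is expected to return for this graph, the same product can be worked out by hand. The matmul() helper below is a hypothetical pure-Python function written for this note, not the TensorFlow op.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


matrix1 = [[3.0, 3.0]]      # shape [1, 2]
matrix2 = [[2.0], [2.0]]    # shape [2, 1]

# 3*2 + 3*2 = 12, so the product has shape [1, 1].
print(matmul(matrix1, matrix2))  # [[12.0]]
```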

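2. Launching the graph in a session

Execution follows construction. To launch the graph built above, first create a Session object.

Without arguments, the session constructor launches the default graph.

See the Session class for the complete session API.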

# Launch the default graph.
sess = tf.Session()

# To run the matmul op we call the session 'run()' method, passing 'product',
# which represents the output of the matmul op. This indicates to the call
# that we want to get the output of the matmul op back.
#
# All inputs needed by the op are run automatically by the session. They
# typically are run in parallel.
#
# The call 'run(product)' thus causes the execution of three ops in the graph:
# the two constants and matmul.
#
# The output of the op is returned in 'result' as a numpy `ndarray` object.
result = sess.run(product)
print(result)
# output: [[12.]]

# Close the Session when we're done.
sess.close()

# The 'with' block below is equivalent to the three statements above;
# the session is closed automatically when the block exits.
# Passing a list, as in sess.run([product]), returns a one-element list instead.
with tf.Session() as sess:
    result = sess.run(product)
    print(result)
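The reason the with block can replace an explicit close() call is Python's context manager protocol. The following is a minimal sketch of that protocol with a hypothetical ToySession class that only tracks whether it has been closed; it is not how tf.Session is implemented.

```python
class ToySession:
    """Hypothetical stand-in for a session; it only tracks open/closed state."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()      # runs even if the block raised
        return False      # do not swallow exceptions


# Explicit style: the caller must remember to call close().
explicit = ToySession()
explicit.close()

# 'with' style: close() is called automatically on exit.
with ToySession() as managed:
    inside = managed.closed

print(explicit.closed, inside, managed.closed)  # True False True
```

The with form is usually preferable because the cleanup also happens when the body raises an exception.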

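Specifying a device (optional)

TensorFlow translates the graph definition into executable operations distributed across the available compute resources, such as the CPU or a GPU. You usually do not have to specify CPUs or GPUs explicitly: TensorFlow detects them automatically and uses the first GPU it finds for as many operations as possible.

If your machine has more than one GPU, you must assign ops to the other devices explicitly to use them. Use a with tf.device(...) statement to specify which CPU or GPU runs the ops: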

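# Complete example based on the code above
import tensorflow as tf

with tf.Session() as sess:
    # Devices are specified with strings. The currently supported devices are:
    #   "/cpu:0": the CPU of your machine;
    #   "/gpu:0": the first GPU of your machine, if you have one;
    #   "/gpu:1": the second GPU of your machine, and so on.
    with tf.device("/cpu:0"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.], [2.]])
        product = tf.matmul(matrix1, matrix2)
        # 'result' is a one-element list because a list is passed to run().
        result = sess.run([product])
        print(result)

For more information about using GPUs, see Using GPUs.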

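Interactive usage

The Python examples in the documentation launch the graph with a Session and use the Session.run() method to execute operations. For ease of use in interactive Python environments such as IPython, you can instead use the InteractiveSession class together with the Tensor.eval() and Operation.run() methods. This avoids having to keep a variable holding the session.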

import tensorflow as tf

# Enter an interactive TensorFlow Session.
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()

# Add an op to subtract 'a' from 'x'. Run it and print the result.
sub = tf.subtract(x, a)
print(sub.eval())
# output: [-2. -1.]

# Close the Session when we're done.
sess.close()

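Tensors

TensorFlow programs use a tensor data structure to represent all data; only tensors are passed between operations in the computation graph.

You can think of a TensorFlow tensor as an n-dimensional array or list.

A tensor has a static type, a rank, and a shape.

To learn how TensorFlow handles these concepts, see Rank, Shape, and Type.

Variables

Variables maintain state across executions of the graph.

The following example shows a variable serving as a simple counter. See the Variables chapter for more details.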

import tensorflow as tf

# Create a Variable that will be initialized to the scalar value 0.
state = tf.Variable(0, name="counter")

# Create an op to add one to 'state'.
one = tf.constant(1)
new_value = tf.add(state, one)

# The assign() operation is part of the expression the graph describes,
# just like the add() operation, so it does not actually perform the
# assignment until run() executes the expression.
update = tf.assign(state, new_value)

# Variables must be initialized by running an `init` op after having
# launched the graph. We first have to add the `init` op to the graph.
# (tf.global_variables_initializer() replaces the deprecated
# tf.initialize_all_variables().)
init_op = tf.global_variables_initializer()

# Launch the graph and run the ops.
with tf.Session() as sess:
    # Run the `init` op.
    sess.run(init_op)
    # Print the initial value of 'state'.
    print(sess.run(state))
    # Run the op that updates 'state' and print 'state'.
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))

# output:
# 0
# 1
# 2
# 3

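You typically represent the parameters of a statistical model as a set of variables. For example, you would store the weights of a neural network as a tensor in a variable. During training you update this tensor by running the training graph repeatedly.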
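The counter behaviour, where state persists between runs and an update op changes nothing until it is run, can be mimicked in plain Python. This is a toy sketch under stated assumptions: ToyVariable and make_update are hypothetical names introduced here, not TensorFlow classes or functions.

```python
class ToyVariable:
    """Hypothetical stand-in for a variable: a value that persists across runs."""

    def __init__(self, initial):
        self.initial = initial
        self.value = None          # unset until the initializer runs

    def initializer(self):
        self.value = self.initial


def make_update(variable, delta):
    """Build (but do not run) an op that adds 'delta' to 'variable'."""
    def update():
        variable.value = variable.value + delta
        return variable.value
    return update


state = ToyVariable(0)
update = make_update(state, 1)   # building the op changes nothing
print(state.value)               # None

state.initializer()              # like running the init op
history = [state.value]
for _ in range(3):
    update()                     # like sess.run(update)
    history.append(state.value)

print(history)                   # [0, 1, 2, 3]
```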

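Fetches

To fetch the outputs of operations, execute the graph with a run() call on the Session object and pass in the tensors to retrieve.

In the previous example we fetched only the single node state, but you can also fetch multiple tensors: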

import tensorflow as tf

input1 = tf.constant(3.0)
input2 = tf.constant(2.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.multiply(input1, intermed)

with tf.Session() as sess:
    # All requested tensor values are produced in a single run of the ops,
    # not by fetching the tensors one at a time.
    result = sess.run([mul, intermed])
    print(result)
    # output: [21.0, 7.0]
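Fetching several tensors in one run means a shared intermediate op only has to execute once. The sketch below models that idea with a per-run cache. Op, run(), and the calls counter are hypothetical names for this illustration and say nothing about TensorFlow's internals.

```python
class Op:
    """Hypothetical graph node holding a function and its input nodes."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs
        self.calls = 0             # how many times this op actually ran


def run(fetches):
    """Evaluate every fetch in one pass, computing each op at most once."""
    cache = {}

    def evaluate(op):
        if op not in cache:
            args = [evaluate(i) for i in op.inputs]
            op.calls += 1
            cache[op] = op.fn(*args)
        return cache[op]

    return [evaluate(op) for op in fetches]


input1 = Op(lambda: 3.0)
input2 = Op(lambda: 2.0)
input3 = Op(lambda: 5.0)
intermed = Op(lambda a, b: a + b, input2, input3)
mul = Op(lambda a, b: a * b, input1, intermed)

result = run([mul, intermed])
print(result)           # [21.0, 7.0]
print(intermed.calls)   # 1
```

Fetching mul and intermed in two separate run() calls would instead execute intermed twice.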

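Feeds

The examples above introduce tensors into the computation graph by storing them in constants (Constants) and variables (Variables).

TensorFlow also provides a feed mechanism for patching a tensor directly into any operation in the graph.

A feed temporarily replaces the output of an operation with a tensor value. You supply feed data as an argument to a run() call.

The feed is only used for the run() call to which it is passed; when the call returns, the feed is gone.

The most common use case is to designate specific operations as "feed" operations by creating them with tf.placeholder().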

import tensorflow as tf

input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.multiply(input1, input2)

with tf.Session() as sess:
    print(sess.run([output], feed_dict={input1: [7.], input2: [2.]}))
    # output: [array([14.], dtype=float32)]

A placeholder() operation generates an error if you do not supply a feed for it.
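The feed rules described in this section (a fed value replaces a node's output, it lasts for one call only, and an unfed placeholder is an error) can be sketched in a few lines of plain Python. Placeholder, Multiply, and run() below are hypothetical toy definitions, not the TensorFlow API, and the exception type is an assumption of this sketch.

```python
class Placeholder:
    """Hypothetical node with no value of its own; it must be fed."""

    def __init__(self, name):
        self.name = name


class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right


def run(node, feed_dict=None):
    """Evaluate 'node', taking placeholder values from 'feed_dict'."""
    feed_dict = feed_dict or {}
    if node in feed_dict:
        return feed_dict[node]       # a fed value replaces the node's output
    if isinstance(node, Placeholder):
        raise ValueError("no value fed for placeholder " + node.name)
    return run(node.left, feed_dict) * run(node.right, feed_dict)


input1 = Placeholder("input1")
input2 = Placeholder("input2")
output = Multiply(input1, input2)

print(run(output, feed_dict={input1: 7.0, input2: 2.0}))  # 14.0

try:
    run(output, feed_dict={input1: 7.0})
except ValueError as error:
    print(error)  # no value fed for placeholder input2
```

Because the feed dictionary is only an argument of the call, nothing carries over to the next run().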

For a larger-scale example of feeds, see the MNIST fully-connected feed tutorial and its source code.

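Appendix: supplementary background

These basic concepts supplement the TensorFlow material above.

TensorFlow uses the tensor data structure to represent all data. You can think of a tensor as an n-dimensional array or list. A tensor has a static type and dynamic dimensions. Tensors can flow between the nodes of the graph.

In TensorFlow, the dimensionality of a tensor is described by its rank. Tensor rank is not the same as matrix rank. Tensor rank (sometimes called order, degree, or n-dimension) is the number of dimensions of the tensor. For example, the following tensor (defined as a Python list) has rank 2: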

    t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

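You can think of a rank-2 tensor as what we ordinarily call a matrix, and a rank-1 tensor as a vector. For a rank-2 tensor you can access any element with the syntax t[i, j]; for a rank-3 tensor you need t[i, j, k]. (A plain nested Python list such as the one above is indexed as t[i][j] instead.)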

Rank	Math entity	Python example
0	Scalar (magnitude only)	`s = 483`
1	Vector (magnitude and direction)	`v = [1.1, 2.2, 3.3]`
2	Matrix (table of numbers)	`m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]`
3	3-Tensor (cube of numbers)	`t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]`
n	n-Tensor (you get the idea)	`....`
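For values written as nested Python lists, the rank in the table equals the nesting depth. The rank() helper below is a hypothetical function written for this note; it assumes rectangular, non-empty lists.

```python
def rank(value):
    """Count nesting levels of a rectangular, non-empty nested list."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0]
    return depth


s = 483
v = [1.1, 2.2, 3.3]
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]

print([rank(x) for x in (s, v, m, t)])  # [0, 1, 2, 3]

# A rank-n value needs n indices to reach a single element.
# Plain nested lists chain the indices: m[i][j] rather than m[i, j].
print(m[1][2])     # 6
print(t[2][0][0])  # 14
```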

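Shape

The TensorFlow documentation uses three notational conventions to describe tensor dimensionality: rank, shape, and dimension number. The following table shows how they relate to one another: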

Rank	Shape	Dimension number	Example
0	[]	0-D	A 0-D tensor. A scalar.
1	[D0]	1-D	A 1-D tensor with shape [5].
2	[D0, D1]	2-D	A 2-D tensor with shape [3, 4].
3	[D0, D1, D2]	3-D	A 3-D tensor with shape [1, 4, 3].
n	[D0, D1, ... Dn-1]	n-D	A tensor with shape [D0, D1, ... Dn-1].

Shapes can be represented as Python lists or tuples of ints, or with the TensorShape class. For example, the shape [2, 3] means the first dimension has two elements and the second has three, as in [[1, 2, 3], [4, 5, 6]].
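The relationship between a nested list and its shape can be checked mechanically. The shape() and size() helpers below are hypothetical pure-Python functions for illustration; they are not the TensorShape class.

```python
def shape(value):
    """Return the shape of a nested list, checking that it is rectangular."""
    if not isinstance(value, list):
        return []                      # a scalar has shape []
    inner = [shape(item) for item in value]
    if any(s != inner[0] for s in inner):
        raise ValueError("ragged nested list has no well-defined shape")
    return [len(value)] + (inner[0] if inner else [])


def size(dims):
    """Number of elements: the product of all dimension sizes."""
    total = 1
    for d in dims:
        total *= d
    return total


data = [[1, 2, 3], [4, 5, 6]]
dims = shape(data)
print(dims)        # [2, 3]
print(len(dims))   # 2, the rank
print(size(dims))  # 6
```

The length of the shape is the rank, and the product of its entries is the number of elements.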

Data types

In addition to dimensionality, tensors have a data type. You can assign any one of the following data types to a tensor:

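Data type	Python type	Description
DT_FLOAT	tf.float32	32-bit floating point.
DT_DOUBLE	tf.float64	64-bit floating point.
DT_INT64	tf.int64	64-bit signed integer.
DT_INT32	tf.int32	32-bit signed integer.
DT_INT16	tf.int16	16-bit signed integer.
DT_INT8	tf.int8	8-bit signed integer.
DT_UINT8	tf.uint8	8-bit unsigned integer.
DT_STRING	tf.string	Variable-length byte array. Each element of the tensor is a byte array.
DT_BOOL	tf.bool	Boolean.
DT_COMPLEX64	tf.complex64	Complex number made of two 32-bit floating point values: real and imaginary parts.
DT_QINT32	tf.qint32	32-bit signed integer used in quantized ops.
DT_QINT8	tf.qint8	8-bit signed integer used in quantized ops.
DT_QUINT8	tf.quint8	8-bit unsigned integer used in quantized ops.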
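The bit widths in the table determine how much storage one element needs and, for the integer types, which values fit. As a rough sketch, Python's standard struct module can confirm the sizes. The pairing of dtype names with struct format codes is an assumption of this example, and int_range() is a hypothetical helper.

```python
import struct

# struct format codes with '<' use standard sizes, independent of platform.
# The pairing of dtype names with format codes is this sketch's assumption.
FORMATS = {
    "float32": "<f",
    "float64": "<d",
    "int64": "<q",
    "int32": "<i",
    "int16": "<h",
    "int8": "<b",
    "uint8": "<B",
}

sizes = {name: struct.calcsize(fmt) * 8 for name, fmt in FORMATS.items()}
print(sizes["float32"], sizes["int64"], sizes["uint8"])  # 32 64 8


def int_range(bits, signed):
    """Smallest and largest value of a two's-complement integer type."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


print(int_range(8, signed=True))    # (-128, 127)
print(int_range(8, signed=False))   # (0, 255)

# A value outside the range cannot be packed into the type.
try:
    struct.pack("<b", 200)
except struct.error:
    print("200 does not fit in int8")
```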