TensorFlow-GPU Environment Setup: Win10 + Python + Anaconda + CUDA

References: http://blog.csdn.net/sb19931201/article/details/53648615

https://segmentfault.com/a/1190000009803319

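The Python distribution of TensorFlow comes in a CPU build and a GPU build. NVIDIA GPUs are well suited to training machine learning models.

Installing Python and TensorFlow is fairly simple; see the links above. Both are managed mainly through Anaconda.

Using an NVIDIA GPU requires installing CUDA and cuDNN.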

Points to note:

1. Whether the graphics card supports GPU acceleration

2. The software versions

The combination used here: Windows 10, Python 3.5, tensorflow-gpu 1.4.0, CUDA cuda_8.0.61_win10, cuDNN cudnn-8.0-windows10-x64-v6.0
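A version mismatch can be hard to spot from an error message alone, so it may help to compare the environment against the combination listed above. The following is a minimal sketch under stated assumptions: `TESTED`, `check_python`, `check_tensorflow`, and `report` are hypothetical helper names, not part of TensorFlow, and only the Python and TensorFlow versions are inspected (the CUDA and cuDNN entries are recorded for reference only).

```python
import sys

# Versions the article reports as working together (copied from the text above).
TESTED = {
    "python": (3, 5),
    "tensorflow-gpu": "1.4.0",
    "cuda": "8.0",
    "cudnn": "6.0",
}


def check_python(version_info=sys.version_info, expected=TESTED["python"]):
    """Return True when the interpreter's major.minor matches the tested one."""
    return tuple(version_info[:2]) == tuple(expected)


def check_tensorflow(installed, expected=TESTED["tensorflow-gpu"]):
    """Compare an installed TensorFlow version string with the tested one."""
    return installed == expected


def report():
    """Print a short comparison for the current environment (hypothetical helper)."""
    print("Python matches tested version:", check_python())
    try:
        import tensorflow as tf  # only importable once tensorflow-gpu is installed
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print("TensorFlow", tf.__version__,
              "matches tested version:", check_tensorflow(tf.__version__))
```

Calling `report()` inside the Anaconda environment prints the comparison. Other version combinations may work as well; this only checks against the one the article used.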

CUDA

The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler and a runtime library to deploy your application.

Overview and download of the latest version: https://developer.nvidia.com/cuda-toolkit

Download archive for all CUDA versions: https://developer.nvidia.com/cuda-toolkit-archive. Follow the installer prompts.

cuDNN

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.

cuDNN is a DLL file that needs to be copied into the bin folder of the CUDA installation directory.
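The copy step can also be scripted. The sketch below assumes the cuDNN archive has been unpacked to a folder containing a `bin` subfolder with the DLL, and that the caller passes both locations explicitly; `copy_cudnn_dlls` is a hypothetical helper name and nothing is auto-detected.

```python
import glob
import os
import shutil


def copy_cudnn_dlls(cudnn_root, cuda_root):
    """Copy every cudnn*.dll from <cudnn_root>/bin into <cuda_root>/bin.

    Both paths are supplied by the caller (assumed layout, see lead-in).
    Returns the list of destination paths that were written.
    """
    src_bin = os.path.join(cudnn_root, "bin")
    dst_bin = os.path.join(cuda_root, "bin")
    if not os.path.isdir(dst_bin):
        raise FileNotFoundError("CUDA bin folder not found: " + dst_bin)

    copied = []
    for dll in sorted(glob.glob(os.path.join(src_bin, "cudnn*.dll"))):
        target = os.path.join(dst_bin, os.path.basename(dll))
        shutil.copy2(dll, target)  # copies file contents and timestamps
        copied.append(target)
    return copied
```

A call might look like `copy_cudnn_dlls(r"D:\Downloads\cudnn\cuda", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0")`. Both paths are examples only and depend on where the files were unpacked and installed; writing under Program Files usually needs an elevated prompt.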

Test code, taken from the official TensorFlow website:

import tensorflow as tf
import numpy as np

# Generate phony data with NumPy: 100 points in total.
x_data = np.float32(np.random.rand(2, 100))  # random inputs
y_data = np.dot([0.100, 0.200], x_data) + 0.300

# Build a linear model.
b = tf.Variable(tf.zeros([1]))
W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
y = tf.matmul(W, x_data) + b

# Minimize the mean squared error.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Initialize the variables.
# initialize_all_variables is deprecated; tf.global_variables_initializer is its replacement.
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the plane.
for step in range(0, 201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# The best fit is W: [[0.100 0.200]], b: [0.300]
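As a cross-check that does not need TensorFlow or a GPU, the same plane can be recovered in closed form with ordinary least squares. This is a separate sketch, not part of the official test code: it rebuilds the phony data with a fixed seed (an assumption made for repeatability) and solves for the weights and the bias in one step.

```python
import numpy as np

rng = np.random.RandomState(0)            # fixed seed so the check is repeatable
x_data = rng.rand(2, 100).astype(np.float32)
y_data = np.dot([0.100, 0.200], x_data) + 0.300

# Append a row of ones so the bias is estimated together with the two weights.
A = np.vstack([x_data, np.ones(100)]).T   # shape (100, 3)
solution = np.linalg.lstsq(A, y_data, rcond=-1)[0]
W_fit, b_fit = solution[:2], solution[2]
print(W_fit, b_fit)
```

The values printed here should agree with the W and b that gradient descent approaches after 200 steps.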

Output of the TensorFlow test script:

The log shows that the graphics card has compute capability 6.1.

D:\Tools\Anaconda35\python.exe D:/PythonProj/tensorFlow/tensor8.py

WARNING:tensorflow:From D:\Tools\Anaconda35\lib\site-packages\tensorflow\python\util\tf_should_use.py:107: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.

Instructions for updating:

Use `tf.global_variables_initializer` instead.

2017-11-19 17:08:40.225423: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

2017-11-19 17:08:40.882335: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:

name: GeForce GTX 1060 3GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085

pciBusID: 0000:01:00.0

totalMemory: 3.00GiB freeMemory: 254.16MiB

2017-11-19 17:08:40.883414: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 3GB, pci bus id: 0000:01:00.0, compute capability: 6.1)

0 [[ 0.29419887 -0.23337287]] [ 1.0515306]

20 [[ 0.00030054  0.03563837]] [ 0.44433528]

40 [[ 0.04815638  0.14494912]] [ 0.35854429]

60 [[ 0.07746208  0.17898612]] [ 0.32386735]

80 [[ 0.09062619  0.19159497]] [ 0.30974501]

100 [[ 0.09614999  0.19658807]] [ 0.30398068]

120 [[ 0.09842454  0.1986087 ]] [ 0.30162627]

140 [[ 0.09935603  0.1994319 ]] [ 0.3006644]

160 [[ 0.09973686  0.19976793]] [ 0.30027145]

180 [[ 0.09989249  0.1999052 ]] [ 0.30011091]

200 [[ 0.09995609  0.19996127]] [ 0.30004531]

Process finished with exit code 0
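The GPU name and compute capability can also be pulled out of such a log automatically, for example when checking several machines. The sketch below assumes the line format shown in the output above; `parse_gpu_line` is a hypothetical helper, and other TensorFlow versions may word the message differently.

```python
import re

# Matches the "name: ..., ..., compute capability: X.Y" part of the log line.
_GPU_LINE = re.compile(
    r"name: (?P<name>[^,]+), .*compute capability: (?P<major>\d+)\.(?P<minor>\d+)"
)


def parse_gpu_line(line):
    """Return (gpu_name, (major, minor)) from a device-creation log line, or None."""
    match = _GPU_LINE.search(line)
    if match is None:
        return None
    capability = (int(match.group("major")), int(match.group("minor")))
    return match.group("name"), capability


log_line = ("Creating TensorFlow device (/device:GPU:0) -> (device: 0, "
            "name: GeForce GTX 1060 3GB, pci bus id: 0000:01:00.0, "
            "compute capability: 6.1)")
print(parse_gpu_line(log_line))
```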

MNIST tutorial. Training ran roughly a hundred times faster than with the CPU version.

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# Load the training data.
MNIST_data_folder = r"D:\WorkSpace\tensorFlow\data"
mnist = input_data.read_data_sets(MNIST_data_folder, one_hot=True)
print(mnist.train.next_batch(1))

# Build the abstract model.
# W, b and y form the plain softmax model; the CNN below does not use them.
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder("float", [None, 10])

# Weight initialization
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

# Convolution and pooling
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

# First convolutional layer
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Densely connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Train and evaluate the model
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i % 100 == 0:
    # Report accuracy on the current batch with dropout disabled.
    train_accuracy = accuracy.eval(feed_dict={
        x: batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g" % (i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g" % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
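The 7 * 7 * 64 size of the densely connected layer follows from the layer shapes. With 'SAME' padding, a convolution or pooling step with stride s maps a spatial size n to ceil(n / s). The sketch below traces the 28x28 input through the four layers; `same_padding_out` is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
import math


def same_padding_out(size, stride):
    """Spatial output size of a 'SAME'-padded conv or pool: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


size = 28                                   # MNIST images are 28x28
for layer in ("conv1", "pool1", "conv2", "pool2"):
    stride = 2 if layer.startswith("pool") else 1   # convs use stride 1, pools stride 2
    size = same_padding_out(size, stride)
    print(layer, "->", size, "x", size)

flat_features = size * size * 64            # 64 channels after the second convolution
print("flattened features:", flat_features)
```

The two stride-1 convolutions keep the spatial size and each 2x2 pooling halves it, so the image shrinks from 28 to 14 to 7 before being flattened for the dense layer.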
