Installing CUDA and cuDNN

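GPU vs. CPU

1. A CPU mainly handles general-purpose logic and services interrupts.

2. A GPU is mainly used for compute-intensive programs and can run many operations in parallel.

NVIDIA's GeForce graphics cards use the GPU for fast computation when rendering what appears on screen, for example in large games and image processing.
Training a deep learning model involves a large amount of repetitive computation. The GPU's compute power and parallelism make training more efficient, so a graphics card with a capable GPU comes in handy.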

Some preparation is needed before computing on the GPU. The steps below use Windows 7 x64 as the example.

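I. Check the graphics card model and compute capability
1. Find the laptop's graphics card model
Download a GPU viewer such as GPU-Z: https://www.techpowerup.com/download/gpu-z/

My laptop's graphics card: NVIDIA GeForce 940M

2. Look up the compute capability of that GPU
Check the NVIDIA website: https://developer.nvidia.com/cuda-gpus

The NVIDIA GeForce 940M has Compute Capability 5.0.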

TensorFlow 1.3 requires a GPU with compute capability 3.0 or higher:
https://www.tensorflow.org/versions/r1.3/install/install_windows

GPU card with CUDA Compute Capability 3.0 or higher. See NVIDIA documentation for a list of supported GPU cards.

If the compute capability is too low, TensorFlow reports a message like this at run time:

Ignoring visible gpu device (device: 0, name: GeForce GT 630M, pci bus id: 0000:01:00.0) with Cuda compute capability 2.1. The minimum required Cuda capability is 3.0.
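
The capability check can be sketched in a few lines of Python. This is only an illustration, not TensorFlow's own logic: the helper names parse_capability and is_supported are hypothetical, and the minimum of 3.0 is taken from the install guide quoted above.

```python
import re

# Minimum compute capability stated in the TensorFlow 1.3 install guide.
MIN_CAPABILITY = (3, 0)


def parse_capability(text):
    """Return (major, minor) from the first 'compute capability X.Y' in text, or None."""
    match = re.search(r"compute capability\s+(\d+)\.(\d+)", text, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(capability, minimum=MIN_CAPABILITY):
    """Compare (major, minor) tuples; tuples compare element by element."""
    return capability >= minimum


# The message quoted above, from a GeForce GT 630M.
log_line = (
    "Ignoring visible gpu device (device: 0, name: GeForce GT 630M, "
    "pci bus id: 0000:01:00.0) with Cuda compute capability 2.1. "
    "The minimum required Cuda capability is 3.0."
)

capability = parse_capability(log_line)
print(capability, is_supported(capability))  # the GT 630M is below the minimum
print((5, 0), is_supported((5, 0)))          # the GeForce 940M meets it
```

Comparing tuples rather than floats avoids mistakes with values such as 3.10 versus 3.9.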

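II. Install the graphics driver, CUDA, and cuDNN

1. Install the graphics driver
A driver is usually already installed, but its version may not match the CUDA version.
Download a driver from the NVIDIA website, http://www.nvidia.com/Download/index.aspx, or use Driver Genius (驱动精灵).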

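2. Install CUDA
CUDA connects applications to the GPU: programs call the CUDA API to schedule computation on the GPU.

Installation guide: http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/
Choose the build for your system, here 64-bit Windows 7. TensorFlow 1.3 currently uses CUDA 8.0.

The CUDA installer asks whether to install a graphics driver, because the installer bundles one.
I suggest skipping the bundled driver at first so that the existing driver is not overwritten. Once CUDA is installed, run the sample program; if it fails, then check whether the driver is correct.

After installation, run nvcc -V to verify: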

C:\Users\yunhuichen>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Mon_Jan__9_17:32:33_CST_2017
Cuda compilation tools, release 8.0, V8.0.60
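
If you want to check the toolkit version from a script, the release number can be pulled out of the nvcc -V text. The following is a minimal sketch; parse_nvcc_release is a hypothetical helper, and the sample text is the output shown above.

```python
import re


def parse_nvcc_release(output):
    """Return (major, minor) from the 'release X.Y' field of nvcc -V output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Output of nvcc -V as shown above.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Mon_Jan__9_17:32:33_CST_2017
Cuda compilation tools, release 8.0, V8.0.60
"""

EXPECTED = (8, 0)  # CUDA 8.0, the version this walkthrough uses with TensorFlow 1.3

release = parse_nvcc_release(SAMPLE)
print("release:", release, "matches expected:", release == EXPECTED)
```

On a real machine the text could come from subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout (Python 3.7 or later), assuming nvcc is on PATH.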

Run a sample program:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\extras\demo_suite>deviceQuery.exe
deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce 940M"
  CUDA Driver Version / Runtime Version          9.0 / 8.0
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 1024 MBytes (1073741824 bytes)
  ( 3) Multiprocessors, (128) CUDA Cores/MP:     384 CUDA Cores
  GPU Max Clock rate:                            980 MHz (0.98 GHz)
  Memory Clock rate:                             1001 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 1048576 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 4 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce 940M
Result = PASS

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\extras\demo_suite>

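CUDA is now installed successfully.
You may run into the following problem along the way:

The NVIDIA Geforce GTX 940M device is not removable and cannot be ejected or unplugged

This happens because the graphics driver and the CUDA version do not match. Try installing the driver bundled with the CUDA installer.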
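
One way to spot a driver/toolkit mismatch is to compare the two versions that deviceQuery prints: the CUDA version supported by the driver should be at least as new as the runtime version. The sketch below assumes the output format shown earlier; check_device_query is a hypothetical helper, not part of the CUDA toolkit.

```python
import re


def check_device_query(output):
    """Extract driver/runtime versions and the PASS/FAIL result from deviceQuery output."""
    versions = re.search(
        r"CUDA Driver Version / Runtime Version\s+(\d+)\.(\d+)\s*/\s*(\d+)\.(\d+)",
        output,
    )
    result = re.search(r"Result\s*=\s*(\w+)", output)
    if versions is None or result is None:
        return None
    driver = (int(versions.group(1)), int(versions.group(2)))
    runtime = (int(versions.group(3)), int(versions.group(4)))
    return {
        "driver": driver,
        "runtime": runtime,
        # The driver must support a CUDA version at least as new as the runtime.
        "driver_ok": driver >= runtime,
        "passed": result.group(1) == "PASS",
    }


# Two lines taken from the deviceQuery output in this walkthrough.
SAMPLE = """  CUDA Driver Version / Runtime Version          9.0 / 8.0
Result = PASS
"""

print(check_device_query(SAMPLE))
```

If driver_ok came back False, updating the graphics driver would be the first thing to try.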

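3. Install cuDNN
cuDNN is a library that optimizes deep learning computation: it optimizes the computations used in model training and then runs them on the GPU through CUDA.
You can also use CUDA directly without cuDNN, but computation will be much less efficient.

cuDNN download: https://developer.nvidia.com/cudnn

Choose the version that matches CUDA 8.0: cuDNN 6.1

It is really just a few library files. Extract the archive and add the extracted path to PATH, or copy all the library files into the corresponding CUDA directories.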
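
To confirm that the cuDNN library can be found after editing PATH, a small script can walk the PATH entries and look for the file. This is a sketch under assumptions: find_on_path is a hypothetical helper, and the file name cudnn64_6.dll is an assumed name for the cuDNN 6 DLL on 64-bit Windows, so check the name in your extracted archive.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the first PATH directory entry containing filename, as a full path, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for directory in path_value.split(os.pathsep):
        directory = directory.strip('"')  # PATH entries are sometimes quoted on Windows
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Assumed DLL name for cuDNN 6 on 64-bit Windows; adjust to match your download.
CUDNN_DLL = "cudnn64_6.dll"

print(find_on_path(CUDNN_DLL))
```

A result of None suggests that PATH does not yet include the directory holding the DLL, or that a new command prompt is needed for the PATH change to take effect.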
