Installing the GeForce GTX 1080 driver on Ubuntu 16.04: problems encountered and how to fix them
1. After plugging the GPU into the host, check the device:
$ nvidia-smi
Tue Dec ::
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66 Driver Version: 375.66 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| GeForce GTX Off | ::00.0 On | N/A |
| % 34C P8 8W / 200W | 284MiB / 8112MiB | % Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| G /usr/lib/xorg/Xorg 117MiB |
| G compiz 155MiB |
| G fcitx-qimpanel 9MiB |
+-----------------------------------------------------------------------------+
The output shows that the system has detected the GeForce GTX 1080.
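The banner of the nvidia-smi output above also carries the driver version, and it can be read out by script when comparing the state before and after a driver change. This is only a minimal sketch: the function name `driver_version_from_smi` is made up for illustration, and it assumes the banner keeps the `Driver Version: X.Y` form shown above.

```shell
# Hypothetical helper: read nvidia-smi output on stdin and print the
# driver version found in the banner line.
driver_version_from_smi() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only call nvidia-smi if it is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi | driver_version_from_smi
fi
```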
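In addition, this machine previously carried a GTX 1060, and the output above shows that its driver, NVIDIA 375.66, is still installed. For the GTX 1080 the driver to install is NVIDIA 367.27, so first add the graphics-drivers PPA and refresh the package index: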
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt-get update
When prompted with Y/n along the way, just press Enter to continue.
Then install the nvidia-367 driver:
$ sudo apt-get install nvidia-367
This step fails with an error because of a conflict with the previously installed nvidia-375 driver:
Building initial module for 4.10.0-32-generic
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/nvidia-384.0.crash'
Error! Bad return status for module build on kernel: 4.10.0-32-generic (x86_64)
Consult /var/lib/dkms/nvidia-384/384.98/build/make.log for more information.
dpkg: error processing package nvidia-384 (--configure):
subprocess installed post-installation script returned error exit status 10
dpkg: dependency problems prevent configuration of libcuda1-384:
libcuda1-384 depends on nvidia-384 (>= 384.98); however:
Package nvidia-384 is not configured yet.
dpkg: error processing package libcuda1-384 (--configure):
dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of nvidia-367:
nvidia-367 depends on nvidia-384; however:
Package nvidia-384 is not configured yet.
dpkg: error processing package nvidia-367 (--configure):
dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of nvidia-opencl-icd-384:
nvidia-opencl-icd-384 depends on nvidia-384 (>= 384.98); however:
Package nvidia-384 is not configured yet.
dpkg: error processing package nvidia-opencl-icd-384 (--configure):
dependency problems - leaving unconfigured
Setting up nvidia-prime (0.8.2) ...
No apport report written because the error message indicates its a followup error from a previous failure.
No apport report written because the error message indicates its a followup error from a previous failure.
No apport report written because MaxReports is reached already
Processing triggers for libc-bin (2.23-0ubuntu9) ...
Processing triggers for initramfs-tools (0.122ubuntu8.8) ...
update-initramfs: Generating /boot/initrd.img-4.10.0-32-generic
Errors were encountered while processing:
nvidia-384
nvidia-375
libcuda1-384
libcuda1-375
nvidia-367
nvidia-opencl-icd-384
nvidia-opencl-icd-375
E: Sub-process /usr/bin/dpkg returned an error code (1)
To deal with this, first uninstall the previous driver:
$ sudo apt-get remove --purge nvidia-375
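Before and after the purge it can help to see which NVIDIA packages dpkg still knows about and in what state they are. The sketch below is an illustration only: `nvidia_package_states` is a made-up name, and it simply filters `dpkg -l` output. In that listing the two-letter prefix is the package state, for example `ii` for fully installed and `iF` for a package whose configuration failed.

```shell
# Hypothetical helper: filter `dpkg -l` output (stdin) down to NVIDIA driver
# and libcuda packages, printing the dpkg state code and the package name.
nvidia_package_states() {
    awk '$2 ~ /^(nvidia|libcuda)/ { print $1, $2 }'
}

# Only query dpkg where it exists (Debian/Ubuntu systems).
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | nvidia_package_states
fi
```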
Then look at the log file to see why the kernel module build failed:
$ vim /var/lib/dkms/nvidia-/384.98/build/make.log
......
CONFTEST: drm_atomic_available
CONFTEST: drm_atomic_modeset_nonblocking_commit_available
CONFTEST: is_export_symbol_gpl_refcount_inc
CONFTEST: is_export_symbol_gpl_refcount_dec_and_test
CC [M] /var/lib/dkms/nvidia-/384.98/build/nvidia/nv-instance.o
CC [M] /var/lib/dkms/nvidia-/384.98/build/nvidia/nv-gpu-numa.o
cc: error: unrecognized command line option ‘-fstack-protector-strong’
scripts/Makefile.build:: recipe for target '/var/lib/dkms/nvidia-384/384.98/build/nvidia/nv-instance.o' failed
make[]: *** [/var/lib/dkms/nvidia-/384.98/build/nvidia/nv-instance.o] Error
make[]: *** Waiting for unfinished jobs....
CC [M] /var/lib/dkms/nvidia-/384.98/build/nvidia/nv.o
CC [M] /var/lib/dkms/nvidia-/384.98/build/nvidia/nv-frontend.o
cc: error: unrecognized command line option ‘-fstack-protector-strong’
scripts/Makefile.build:: recipe for target '/var/lib/dkms/nvidia-384/384.98/build/nvidia/nv-gpu-numa.o' failed
make[]: *** [/var/lib/dkms/nvidia-/384.98/build/nvidia/nv-gpu-numa.o] Error
cc: error: unrecognized command line option ‘-fstack-protector-strong’
cc: error: unrecognized command line option ‘-fstack-protector-strong’
scripts/Makefile.build:: recipe for target '/var/lib/dkms/nvidia-384/384.98/build/nvidia/nv-frontend.o' failed
make[]: *** [/var/lib/dkms/nvidia-/384.98/build/nvidia/nv-frontend.o] Error
scripts/Makefile.build:: recipe for target '/var/lib/dkms/nvidia-384/384.98/build/nvidia/nv.o' failed
make[]: *** [/var/lib/dkms/nvidia-/384.98/build/nvidia/nv.o] Error
Makefile:: recipe for target '_module_/var/lib/dkms/nvidia-384/384.98/build' failed
make[]: *** [_module_/var/lib/dkms/nvidia-/384.98/build] Error
make[]: Leaving directory '/usr/src/linux-headers-4.10.0-32-generic'
Makefile:: recipe for target 'modules' failed
make: *** [modules] Error
A web search showed that the ‘-fstack-protector-strong’ option was only added in GCC 4.9, so GCC 4.9 or later has to be installed for the build to succeed.
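In a long make.log the actual compiler errors are easy to miss among the CONFTEST and CC lines. One way to pull them out is sketched below; `summarize_build_errors` is a hypothetical helper, and the log path is the one named in the dpkg message earlier in this post.

```shell
# Hypothetical helper: print each distinct compiler error in a build log
# (stdin) together with the number of times it occurs, most frequent first.
summarize_build_errors() {
    grep 'error:' | sort | uniq -c | sort -rn
}

# Path taken from the dpkg output above; skipped if the log is not readable.
log=/var/lib/dkms/nvidia-384/384.98/build/make.log
if [ -r "$log" ]; then
    summarize_build_errors < "$log"
fi
```

For the log shown above this would collapse the repeated `unrecognized command line option` lines into a single counted entry.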
Running gcc -v shows that the gcc on this machine is version 4.8, which confirms that the gcc version is the problem, so upgrade gcc to 4.9:
$ sudo apt-get install gcc-4.9
$ cd /usr/bin/
$ sudo ln -s /usr/bin/gcc-4.9 /usr/bin/gcc -f
$ gcc -v
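The version check can also be done by script instead of reading the `gcc -v` output by eye. A minimal sketch, assuming GNU coreutils (for `sort -V`); the helper name `version_ge` is made up for illustration.

```shell
# Hypothetical helper: succeed when version $1 >= version $2.
# Relies on the version-sort option (-V) of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the default gcc against 4.9, the release that added
# -fstack-protector-strong.
if command -v gcc >/dev/null 2>&1; then
    if version_ge "$(gcc -dumpversion)" 4.9; then
        echo "gcc is new enough for -fstack-protector-strong"
    else
        echo "gcc is older than 4.9; the module build is likely to fail"
    fi
fi
```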
Then continue with the driver installation:
$ sudo apt-get install nvidia-
$ sudo apt-get install mesa-common-dev
$ sudo apt-get install freeglut3-dev
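Afterwards, reboot the system so that the GTX 1080 driver takes effect.
2. Downloading and installing CUDA 8 (which supports the GTX 1080)
(CUDA was already installed on this machine, so I go straight to testing here. When I have time I will set the machine up from scratch, run into the pitfalls again, and update this section.)
3. Testing
nvidia-smi now reports the driver as nvidia-384. (Some people see nvidia-367 here. The displayed version differs, but the installation log above shows that the nvidia-367 package depends on nvidia-384, and the later tests and everyday use work fine, so it makes no difference.)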
$ nvidia-smi
Tue Dec ::
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.98 Driver Version: 384.98 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| GeForce GTX Off | ::00.0 On | N/A |
| % 62C P2 139W / 200W | 7898MiB / 8112MiB | % Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| G /usr/lib/xorg/Xorg 188MiB |
| G compiz 110MiB |
| C python 7587MiB |
+-----------------------------------------------------------------------------+
Sample test 1:
$ cd NVIDIA_CUDA-.0_Samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected CUDA Capable device(s)
Device : "GeForce GTX 1080"
CUDA Driver Version / Runtime Version 9.0 / 8.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: MBytes ( bytes)
() Multiprocessors, () CUDA Cores/MP: CUDA Cores
GPU Max Clock rate: MHz (1.85 GHz)
Memory Clock rate: Mhz
Memory Bus Width: -bit
L2 Cache Size: bytes
Maximum Texture Dimension Size (x,y,z) 1D=(), 2D=(, ), 3D=(, , )
Maximum Layered 1D Texture Size, (num) layers 1D=(), layers
Maximum Layered 2D Texture Size, (num) layers 2D=(, ), layers
Total amount of constant memory: bytes
Total amount of shared memory per block: bytes
Total number of registers available per block:
Warp size:
Maximum number of threads per multiprocessor:
Maximum number of threads per block:
Max dimension size of a thread block (x,y,z): (, , )
Max dimension size of a grid size (x,y,z): (, , )
Maximum memory pitch: bytes
Texture alignment: bytes
Concurrent copy and kernel execution: Yes with copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: / /
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = , Device0 = GeForce GTX
Result = PASS
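The only line that really matters in the deviceQuery output is the final `Result = PASS`, so the check is easy to script. The snippet below is a sketch under that assumption; `device_query_passed` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if deviceQuery output (stdin) reports a pass.
device_query_passed() {
    grep -q '^Result = PASS'
}

# Meant to be run from the deviceQuery sample directory after `make`;
# does nothing if the binary is not there.
if [ -x ./deviceQuery ]; then
    if ./deviceQuery | device_query_passed; then
        echo "CUDA runtime can see the GPU"
    else
        echo "deviceQuery did not pass" >&2
    fi
fi
```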
Sample test 2:
$ cd NVIDIA_CUDA-.0_Samples/5_Simulations/nbody
$ make
$ ./nbody -benchmark -numbodies= -device=
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= ) to run in simulation)
-device=<d> (where d=,,.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > ) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> Devices used for simulation
gpuDeviceInit() CUDA Device []: "GeForce GTX 1080
> Compute 6.1 CUDA device: [GeForce GTX ]
number of bodies =
bodies, total time for iterations: 2981.761 ms
= 219.790 billion interactions per second
= 4395.792 single-precision GFLOP/s at flops per interaction
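The last two figures of the benchmark are related by simple arithmetic: the GFLOP/s value is the interaction rate multiplied by the number of flops counted per interaction. Dividing one by the other recovers that factor from the numbers reported above. The helper name `flops_per_interaction` is made up for this illustration.

```shell
# Hypothetical helper: GFLOP/s divided by billions of interactions per second
# gives the number of flops counted per interaction.
flops_per_interaction() {
    awk -v gflops="$1" -v ginter="$2" 'BEGIN { printf "%.1f\n", gflops / ginter }'
}

# Figures taken from the nbody output above.
flops_per_interaction 4395.792 219.790
# prints 20.0
```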
4. Checking the GPU status
The nvidia-smi command is all that is needed.
To refresh the display periodically, for example to show the GPU status every 10 s:
$ watch -n 10 nvidia-smi
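The fields that matter most are the temperature, the memory usage, and the GPU utilization (marked with red boxes in the original screenshot).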
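When only those three values are of interest, nvidia-smi can print them as CSV through its `--query-gpu` option, which is easier to log than the full table. The sketch below reformats one such CSV line; `format_gpu_status` is a hypothetical helper, and the field order is simply the one requested on the command line.

```shell
# Hypothetical helper: turn one CSV line "temp, used, total, util" (stdin)
# into a compact status line.
format_gpu_status() {
    awk -F', *' '{ printf "temp=%sC mem=%s/%sMiB util=%s%%\n", $1, $2, $3, $4 }'
}

# One-shot query; skipped when nvidia-smi is not installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=temperature.gpu,memory.used,memory.total,utilization.gpu \
               --format=csv,noheader,nounits | format_gpu_status
fi
```

Adding `-l 10` to the nvidia-smi call should repeat the query every 10 seconds, much like the `watch` command above.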

======================================================================================
Addendum (2018.2.3)
A problem I ran into recently after installing a GTX 1060 in another server:
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Cause: CUDA 8.0 and the cuDNN build for CUDA 8.0 were installed, whereas tensorflow-gpu 1.5 requires CUDA 9.0.
Fix: roll tensorflow-gpu back to version 1.4:
pip install tensorflow-gpu==1.4 -i https://pypi.tuna.tsinghua.edu.cn/simple gevent
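The library name in the ImportError already says which CUDA version the installed TensorFlow build expects. A small sketch that extracts it from the message; `cuda_version_from_import_error` is a made-up helper name.

```shell
# Hypothetical helper: pull the CUDA version out of a
# "libcublas.so.X.Y: cannot open shared object file" message (stdin).
cuda_version_from_import_error() {
    sed -n 's/.*libcublas\.so\.\([0-9][0-9.]*\):.*/\1/p' | head -n 1
}

# Error message copied from above.
echo 'ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory' \
    | cuda_version_from_import_error
# prints 9.0
```

Comparing that value with the libcublas files actually present (for example under /usr/local/cuda/lib64, assuming CUDA was installed to its default location) shows the mismatch directly.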
References:
Deep learning host environment setup: Ubuntu16.04+Nvidia GTX 1080+CUDA8.0
http://blog.csdn.net/w5688414/article/details/79187499