Docker容器中使用GPU
背景
容器封装了应用程序的依赖项,以提供可重复和可靠的应用程序和服务执行,而无需整个虚拟机的开销。如果您曾经花了一天的时间为一个科学或 深度学习 应用程序提供一个包含大量软件包的服务器,或者已经花费数周的时间来确保您的应用程序可以在多个 linux 环境中构建和部署,那么 Docker 容器非常值得您花费时间。

安装添加docker源
[root@localhost ~]# sudo yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
Loaded plugins: fastestmirror, langpacks
adding repo from: https://download.docker.com/linux/centos/docker-ce.repo
grabbing file https://download.docker.com/linux/centos/docker-ce.repo to /etc/yum.repos.d/docker-ce.repo
repo saved to /etc/yum.repos.d/docker-ce.repo
[root@localhost ~]#
[root@localhost ~]# cat /etc/yum.repos.d/docker-ce.repo
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-stable-debuginfo]
name=Docker CE Stable - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/debug-$basearch/stable
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-stable-source]
name=Docker CE Stable - Sources
baseurl=https://download.docker.com/linux/centos/$releasever/source/stable
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-test]
name=Docker CE Test - $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/$basearch/test
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-test-debuginfo]
name=Docker CE Test - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/debug-$basearch/test
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-test-source]
name=Docker CE Test - Sources
baseurl=https://download.docker.com/linux/centos/$releasever/source/test
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-nightly]
name=Docker CE Nightly - $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/$basearch/nightly
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-nightly-debuginfo]
name=Docker CE Nightly - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/$releasever/debug-$basearch/nightly
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[docker-ce-nightly-source]
name=Docker CE Nightly - Sources
baseurl=https://download.docker.com/linux/centos/$releasever/source/nightly
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
[root@localhost ~]#
下载安装包
[root@localhost ~]# cd docker
[root@localhost docker]#
[root@localhost docker]# repotrack docker-ce
安装docker 并设置开机自启
[root@localhost docker]# yum install ./*
[root@localhost docker]# systemctl start docker
[root@localhost docker]#
[root@localhost docker]# systemctl enable docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
[root@localhost docker]#
配置nvidia-docker的源
[root@localhost docker]# distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
> && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
[root@localhost docker]# cat /etc/yum.repos.d/nvidia-docker.repo
[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/stable/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[libnvidia-container-experimental]
name=libnvidia-container-experimental
baseurl=https://nvidia.github.io/libnvidia-container/experimental/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=0
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[nvidia-container-runtime]
name=nvidia-container-runtime
baseurl=https://nvidia.github.io/nvidia-container-runtime/stable/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/nvidia-container-runtime/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[nvidia-container-runtime-experimental]
name=nvidia-container-runtime-experimental
baseurl=https://nvidia.github.io/nvidia-container-runtime/experimental/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=0
gpgkey=https://nvidia.github.io/nvidia-container-runtime/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[nvidia-docker]
name=nvidia-docker
baseurl=https://nvidia.github.io/nvidia-docker/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/nvidia-docker/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[root@localhost docker]#
安装下载nvidia-docker
[root@localhost ~]# mkdir nvidia-docker2
[root@localhost ~]# cd nvidia-docker2
[root@localhost nvidia-docker2]# yum update -y
[root@localhost nvidia-docker2]# repotrack nvidia-docker2
[root@localhost nvidia-docker2]# yum install ./*
[root@localhost ~]# mkdir nvidia-container-toolkit
[root@localhost ~]# cd nvidia-container-toolkit
[root@localhost nvidia-container-toolkit]# repotrack nvidia-container-toolkit
[root@ai-rd nvidia-container-toolkit]# yum install ./*
下载镜像,并保存
[root@localhost ~]# docker pull nvidia/cuda:11.0-base
11.0-base: Pulling from nvidia/cuda
54ee1f796a1e: Pull complete
f7bfea53ad12: Pull complete
46d371e02073: Pull complete
b66c17bbf772: Pull complete
3642f1a6dfb3: Pull complete
e5ce55b8b4b9: Pull complete
155bc0332b0a: Pull complete
Digest: sha256:774ca3d612de15213102c2dbbba55df44dc5cf9870ca2be6c6e9c627fa63d67a
Status: Downloaded newer image for nvidia/cuda:11.0-base
docker.io/nvidia/cuda:11.0-base
[root@localhost ~]#
[root@localhost ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nvidia/cuda 11.0-base 2ec708416bb8 15 months ago 122MB
[root@localhost ~]#
[root@localhost ~]# docker save -o cuda-11.0.tar nvidia/cuda:11.0-base
[root@localhost ~]#
[root@localhost ~]# ls cuda-11.0.tar
cuda-11.0.tar
[root@localhost ~]#
在要测试的服务器上导入镜像
[root@ai-rd cby]# docker load -i cuda-11.0.tar
2ce3c188c38d: Loading layer [==================================================>] 75.23MB/75.23MB
ad44aa179b33: Loading layer [==================================================>] 1.011MB/1.011MB
35a91a75d24b: Loading layer [==================================================>] 15.36kB/15.36kB
a4399aeb9a0e: Loading layer [==================================================>] 3.072kB/3.072kB
fa39d0e9f3dc: Loading layer [==================================================>] 18.84MB/18.84MB
232fb43df6ad: Loading layer [==================================================>] 30.08MB/30.08MB
0da51e35db05: Loading layer [==================================================>] 22.53kB/22.53kB
Loaded image: nvidia/cuda:11.0-base
[root@ai-rd cby]#
[root@ai-rd cby]# docker images | grep cuda
nvidia/cuda 11.0-base 2ec708416bb8 15 months ago 122MB
[root@ai-rd cby]#
安装升级内核
[root@ai-rd cby]# yum install kernel-headers
[root@ai-rd cby]# yum install kernel-devel
[root@ai-rd cby]# yum update kernel*
禁用模块,并升级boot
[root@ai-rd cby]# vim /etc/modprobe.d/blacklist-nouveau.conf
[root@ai-rd cby]# cat /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
[root@ai-rd cby]#
[root@ai-rd cby]# mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
[root@ai-rd cby]# sudo dracut -v /boot/initramfs-$(uname -r).img $(uname -r)
下载驱动并安装
[root@localhost ~]# wget https://cn.download.nvidia.cn/tesla/450.156.00/NVIDIA-Linux-x86_64-450.156.00.run
[root@ai-rd cby]# chmod +x NVIDIA-Linux-x86_64-450.156.00.run
[root@ai-rd cby]# ./NVIDIA-Linux-x86_64-450.156.00.run
配置docker
[root@ai-rd ~]# vim /etc/docker/daemon.json
[root@ai-rd ~]# cat /etc/docker/daemon.json
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
[root@ai-rd ~]#
[root@ai-rd ~]# systemctl daemon-reload
[root@ai-rd ~]#
[root@ai-rd ~]#
[root@ai-rd ~]#
[root@ai-rd ~]# systemctl restart docker
[root@ai-rd ~]#
测试docker中的调用情况
[root@ai-rd ~]#
[root@ai-rd ~]# sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Tue Nov 23 06:03:04 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.156.00 Driver Version: 450.156.00 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 Off | 00000000:86:00.0 Off | 0 |
| N/A 90C P0 34W / 70W | 0MiB / 15109MiB | 6% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
[root@ai-rd ~]#

https://blog.csdn.net/qq_33921750
https://my.oschina.net/u/3981543
https://www.zhihu.com/people/chen-bu-yun-2
https://segmentfault.com/u/hppyvyv6/articles
https://juejin.cn/user/3315782802482007
https://space.bilibili.com/352476552/article
https://cloud.tencent.com/developer/column/93230
知乎、CSDN、开源中国、思否、掘金、哔哩哔哩、腾讯云
Docker容器中使用GPU的更多相关文章
- 在Docker容器中搭建MXNet/Gluon开发环境
在这篇文章中没有直接使用MXNet官方提供的docker image,而是从一个干净的nvidia/cuda镜像开始,一步一步部署mxnet需要的相关软件环境,这样做是为了更加细致的了解mxnet的运 ...
- Kubernetes 教程:在 Containerd 容器中使用 GPU
原文链接:https://fuckcloudnative.io/posts/add-nvidia-gpu-support-to-k8s-with-containerd/ 前两天闹得沸沸扬扬的事件不知道 ...
- Docker容器中运行ASP.NET Core
在Linux和Windows的Docker容器中运行ASP.NET Core 译者序:其实过去这周我都在研究这方面的内容,结果周末有事没有来得及总结为文章,Scott Hanselman就捷足先登了. ...
- 在 docker 容器中捕获信号
我们可能都使用过 docker stop 命令来停止正在运行的容器,有时可能会使用 docker kill 命令强行关闭容器或者把某个信号传递给容器中的进程.这些操作的本质都是通过从主机向容器发送信号 ...
- Docker容器中开始.NETCore之路
一.引言 开始写这篇博客前,已经尝试练习过好多次Docker环境安装,.Net Core环境安装了,在这里替腾讯云做一个推广,假如我们想学习.练手.net core 或是Docker却苦于没有开发环境 ...
- 隔离 docker 容器中的用户
笔者在前文<理解 docker 容器中的 uid 和 gid>介绍了 docker 容器中的用户与宿主机上用户的关系,得出的结论是:docker 默认没有隔离宿主机用户和容器中的用户.如果 ...
- Docker容器中开始.Net Core之路
开始写这篇博客前,已经尝试练习过好多次Docker环境安装,.Net Core环境安装了,在这里替腾讯云做一个推广,假如我们想学习.练手.net core 或是Docker却苦于没有开发环境,服务器也 ...
- Docker容器中找不到vim命令
docker容器中,有的并未安装vi和vim,输入命令vim,会提示vim: command not found(如下图).此时我们就要安装vi命令 执行命令:apt-get update apt-g ...
- docker容器中Postgresql 数据库备份
查看运行的容器: docker ps 进入目标容器: docker exec -u root -it 容器名 /bin/bash docker 中,以root用户,创建备份目录,直接执行如下命令, p ...
- .NetCore下使用IdentityServer4 & JwtBearer认证授权在CentOS Docker容器中运行遇到的坑及填坑
今天我把WebAPI部署到CentOS Docker容器中运行,发现原有在Windows下允许的JWTBearer配置出现了问题 在Window下我一直使用这个配置,没有问题 services.Add ...
随机推荐
- 虚拟机安装windows 7 32位 + sqlserver 2000
安装包网盘地址:(https://pan.baidu.com/s/1ZoC-cTafBi8zZbCkvvmvNA?pwd=x1y2 提取码:x1y2 ) VMware 安装win7 32 位 http ...
- eset node32卸载记录
安装的是这个东西,卸载麻烦 1.一般的卸载软件比如wise program uninstall无论是普通卸载还是强制卸载都是实现不了的,火绒自带的文件粉碎是可以使用的,有两个目录要进行粉碎C:\Pro ...
- NOI 顺序查找——查找特定的值
描述 在一个序列(下标从1开始)中查找一个给定的值,输出第一次出现的位置. 输入 第一行包含一个正整数n,表示序列中元素个数.1 <= n <= 10000.第二行包含n个整数,依次给出序 ...
- python高阶编程(二)
1.迭代器 迭代是访问集合元素的一种方式.迭代器是一个可以记住遍历的位置的对象.迭代器对象从集合的第一个元素开始访问,知道所有的元素被访问结束.迭代器只能往下不会后退. 我们已经知道,可以直接作用于f ...
- Mysql_5.7编译部署
自述 - 概述:数据库是"按照数据结构来组织.存储和管理数据的仓库".是一个长期存储在计算机内的.有组织的.可共享的.统一管理的大量数据的集合:本文主要介绍mysql_5.7的部署 ...
- 变量调用分析——这个ball到底是那个ball?
public class Ball implements Rollable{ public static void main(String[] args) { Ball ball = new Ball ...
- Day06 ServletContext
ServletContext的介绍与用法 1.什么是ServletContext 1.1 SevrvletContext:Servlet上下文 服务器会为每一个Web工程创建一个ServletCont ...
- JavaScript 字符串和正则相关的方法
<!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <title> ...
- (Linux服务器)git添加SSH公钥后本地验证失败
环境:腾讯云Ubuntu x86_64 问题说明: 在配置了公钥后,一直提示我git@github.com: Permission denied (publickey). 解决办法: 先查看root/ ...
- python xlwings实用操作记录
import xlwings as xw tfile="test.xlsx" newfile="new.xlsx" app = xw.App(visible=F ...