Installing nvidia-docker2 on k8s

sudo yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
curl -s -L https://nvidia.github.io/nvidia-docker/centos7/x86_64/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo

0. Configure daemon.json

(base) [xinchenTest@master docker]$ cat /etc/docker/daemon.json 
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "exec-opts": ["native.cgroupdriver=systemd"],
    "registry-mirrors": ["http://7e61f7f9.m.daocloud.io"],
    "live-restore": true,
    "graph": "/data/docker"
}
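
A syntax error in daemon.json keeps dockerd from starting, so it can be worth checking the file before restarting Docker. The following is a minimal sketch that assumes python3 is installed on the host; validate_json is a hypothetical helper name and not part of Docker.

```shell
#!/usr/bin/env bash
# validate_json FILE (hypothetical helper): returns 0 if FILE parses as JSON.
# Assumes python3 is on PATH; json.tool exits non-zero on a parse error.
validate_json() {
    python3 -m json.tool "$1" >/dev/null 2>&1
}

config=/etc/docker/daemon.json
if [ -f "$config" ]; then
    if validate_json "$config"; then
        echo "$config: valid JSON"
    else
        echo "$config: invalid JSON, fix it before restarting Docker" >&2
    fi
fi
```

This only checks JSON syntax. It does not tell you whether Docker accepts the individual keys.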

After changing the configuration, remember to reload and restart Docker:

sudo systemctl daemon-reload
sudo systemctl restart docker
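
After the restart, one way to confirm that the daemon picked up the new default runtime is to ask docker info for it. This is a sketch rather than a required step; is_nvidia_runtime is a hypothetical helper name, and the script assumes the current user is allowed to talk to the Docker daemon.

```shell
#!/usr/bin/env bash
# is_nvidia_runtime NAME (hypothetical helper): returns 0 only when NAME is "nvidia".
is_nvidia_runtime() {
    [ "$1" = "nvidia" ]
}

# Query the daemon only if the docker CLI is present; errors are ignored so
# the script still finishes when the daemon is not reachable.
if command -v docker >/dev/null 2>&1; then
    runtime="$(docker info --format '{{.DefaultRuntime}}' 2>/dev/null || true)"
    if is_nvidia_runtime "$runtime"; then
        echo "default runtime is nvidia"
    else
        echo "default runtime is '${runtime:-unknown}', expected nvidia" >&2
    fi
fi
```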

1. Remove nvidia-docker 1.0

sudo docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo yum remove nvidia-docker

2. Install nvidia-docker 2.0

sudo yum install nvidia-docker2

3. Reload the Docker daemon configuration

sudo pkill -SIGHUP dockerd

4. Test whether the installation succeeded

sudo docker run --runtime=nvidia --rm nvidia/cuda:11.1.1-base-centos7 nvidia-smi
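
The test above prints the full nvidia-smi table, which is awkward to check from a script. As a rough sketch, the same container can be asked for one CSV line per GPU and the lines counted. The helper names count_gpus and gpu_smoke_test are hypothetical; the image tag is the one used in the test above.

```shell
#!/usr/bin/env bash
# count_gpus (hypothetical helper): count non-empty lines on stdin.
# nvidia-smi prints one CSV line per GPU with the query flags used below.
count_gpus() {
    grep -c . || true
}

# gpu_smoke_test (hypothetical helper): run nvidia-smi in the same image as the
# test above and print how many GPUs the container can see.
gpu_smoke_test() {
    sudo docker run --runtime=nvidia --rm nvidia/cuda:11.1.1-base-centos7 \
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | count_gpus
}
```

Calling gpu_smoke_test on a GPU host should print the number of visible GPUs; a result of 0 suggests the nvidia runtime is not being used.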

5. Deploy the NVIDIA device plugin for Kubernetes

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
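
Once the device plugin is running, nodes with working GPUs advertise the nvidia.com/gpu resource. The sketch below lists the allocatable GPU count per node and flags nodes that report none. Both helper names are hypothetical, and the script assumes kubectl is already configured for the cluster.

```shell
#!/usr/bin/env bash
# nodes_without_gpu (hypothetical helper): read "NAME GPU" rows on stdin,
# skip the header row, and print the names whose GPU column is empty,
# "<none>" or "0".
nodes_without_gpu() {
    awk 'NR > 1 && ($2 == "" || $2 == "<none>" || $2 == "0") { print $1 }'
}

# list_gpu_nodes (hypothetical helper): one row per node with its allocatable
# nvidia.com/gpu count. The dot in the resource name must be escaped.
list_gpu_nodes() {
    kubectl get nodes \
        -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}
```

Running list_gpu_nodes | nodes_without_gpu prints the nodes that do not expose any GPU, which is expected for nodes without GPU hardware such as a CPU-only master.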

Configure containerd

When running Kubernetes with containerd, edit the config file, usually located at /etc/containerd/config.toml, to set nvidia-container-runtime as the default low-level runtime:

version = 2
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".containerd]
      default_runtime_name = "nvidia"

      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
          privileged_without_host_devices = false
          runtime_engine = ""
          runtime_root = ""
          runtime_type = "io.containerd.runc.v2"
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
            BinaryName = "/usr/bin/nvidia-container-runtime"
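
To double-check the edit, the default runtime name can be read back from the config file. This is a minimal sketch: default_runtime is a hypothetical helper, and the path is the usual location mentioned above, which may differ on your system.

```shell
#!/usr/bin/env bash
# default_runtime (hypothetical helper): print the value of default_runtime_name
# found in containerd TOML read from stdin (prints nothing if the key is absent).
default_runtime() {
    sed -n 's/^[[:space:]]*default_runtime_name[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p'
}

config=/etc/containerd/config.toml
if [ -r "$config" ]; then
    echo "default runtime in $config: $(default_runtime < "$config")"
fi
```

containerd only reads the file at startup, so restart it after editing with sudo systemctl restart containerd.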

journalctl -xe

# Normally this lists the NVIDIA packages that are already installed
$ rpm -qa | grep nvidia

# Normally this lists the CUDA packages that are already installed
$ rpm -qa | grep cuda

Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting!