Using GPUs with Kubernetes

Jorge Bianquetti


One of the most interesting new features in Kubernetes is its support for Graphics Processing Units (GPUs). This makes it possible to obtain significant performance gains when deploying certain types of applications, such as machine learning frameworks like TensorFlow, in a Kubernetes cluster.

Support for scheduling GPU workloads in Kubernetes is still fairly new, so getting all the pieces working together usually involves a fair amount of trial and error. To ease this process, this blog post walks you through setting up a Kubernetes GPU cluster on Google Container Engine (GKE) and then deploying a container image with GPU support to that cluster.

NOTE: This blog post will assume that you have the gcloud and kubectl command-line tools installed and configured for use with GKE. In case you don't, check out our Kubernetes tutorial for a detailed walkthrough.

Step 1: Start a GPU-enabled cluster

The first step is to launch a GPU-enabled Kubernetes cluster. Here are a few important things to remember:

  • As of this writing, Google Compute Engine supports NVIDIA P100 and K80 GPUs only in certain zones, so remember to use a supported zone when spinning up the cluster. See the list of available zones and restrictions.
  • As of this writing, GPU support in Kubernetes is an alpha feature so your GKE cluster needs to be an "alpha cluster". Alpha clusters are different from regular GKE clusters in several respects: they are not covered by the GKE SLA, they cannot be upgraded and they only last for 30 days. Find out more about alpha clusters.
  • GPUs are quota-restricted on Google Compute Engine, so ensure that you have adequate quota available before proceeding. Learn more about quotas.
  • The NVIDIA GPU drivers need to be separately installed on each node of a Kubernetes cluster. Therefore, when launching the cluster, select an image that has the necessary build and/or packaging tools for driver installation. Find out more.
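The zone and quota checks above can be scripted with the gcloud CLI. The sketch below is a hedged example: it assumes gcloud is installed and authenticated, and that the accelerator type name (nvidia-tesla-k80) and quota metric naming are as of this writing; it falls back to a message on machines without gcloud:

```shell
# Check K80 availability and regional GPU quota before creating the cluster.
# Requires an authenticated gcloud CLI; otherwise print a fallback message.
if command -v gcloud >/dev/null 2>&1; then
  # Zones where the nvidia-tesla-k80 accelerator type is available
  gcloud compute accelerator-types list --filter="name:nvidia-tesla-k80"
  # GPU-related quota entries (limit/usage) for the target region
  gcloud compute regions describe us-east1 | grep -B 1 -A 1 GPUS
else
  echo "gcloud not installed"
fi
```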

Here's an example command to start a GPU-enabled alpha cluster on GKE. This cluster will run in the us-east1-c zone and will be composed of three hosts, each running Ubuntu and with an NVIDIA K80 GPU:

$ gcloud alpha container clusters create my-gpu-cluster \
    --enable-cloud-logging --enable-cloud-monitoring \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --zone us-east1-c --machine-type n1-standard-2 \
    --enable-kubernetes-alpha --image-type UBUNTU \
    --num-nodes 3

Step 2: Install the GPU drivers on each cluster node

Once the cluster has started, the next step is to log into each cluster node individually using SSH and install the NVIDIA CUDA libraries, which include the necessary NVIDIA GPU drivers. The Google Cloud Console offers browser-based SSH access to each node. Once logged in, run the commands below:

$ curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
$ sudo -s
$ dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
$ apt-get update && apt-get install cuda -y

Once the libraries have been installed on each host, confirm that the NVIDIA GPU has been detected by running the nvidia-smi tool:

$ nvidia-smi

GPU access on cluster node

Step 3: Configure each kubelet to use the NVIDIA GPU

Next, configure each kubelet to use the NVIDIA GPU, as shown below:

$ NVIDIA_GPU_NAME=$(nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 | sed -e 's/ /-/g')
$ source /etc/default/kubelet
$ KUBELET_OPTS="$KUBELET_OPTS --node-labels='alpha.kubernetes.io/nvidia-gpu-name=$NVIDIA_GPU_NAME'"
$ echo "KUBELET_OPTS=\"$KUBELET_OPTS\"" > /etc/default/kubelet
$ systemctl restart kubelet.service
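The sed step in the first command replaces spaces in the GPU name, since Kubernetes label values may not contain spaces. A quick local sketch of that transformation, using "Tesla K80" as a sample nvidia-smi value:

```shell
# Sample GPU name as nvidia-smi might report it (hypothetical value here)
gpu_name="Tesla K80"
# Replace spaces with hyphens to form a valid Kubernetes label value
echo "$gpu_name" | sed -e 's/ /-/g'   # prints: Tesla-K80
```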

Step 4: Deploy and test a GPU-enabled container

At this point, you're ready to deploy your GPU-enabled container. This example will use a TensorFlow image with GPU support, and the following Kubernetes deployment file (based on this example by Frederic Tausch):

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tensorflow-gpu
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: tensorflow-gpu
    spec:
      volumes:
      - hostPath:
          path: /usr/lib/nvidia-384/bin
        name: bin
      - hostPath:
          path: /usr/lib/nvidia-384
        name: lib
      - hostPath:
          path: /usr/lib/x86_64-linux-gnu/libcuda.so.1
        name: libcuda-so-1
      - hostPath:
          path: /usr/lib/x86_64-linux-gnu/libcuda.so
        name: libcuda-so
      containers:
      - name: tensorflow
        image: tensorflow/tensorflow:latest-gpu
        ports:
        - containerPort: 8888
        resources:
          limits:
            alpha.kubernetes.io/nvidia-gpu: 1
        volumeMounts:
        - mountPath: /usr/local/nvidia/bin
          name: bin
        - mountPath: /usr/local/nvidia/lib
          name: lib
        - mountPath: /usr/lib/x86_64-linux-gnu/libcuda.so.1
          name: libcuda-so-1
        - mountPath: /usr/lib/x86_64-linux-gnu/libcuda.so
          name: libcuda-so
---
apiVersion: v1
kind: Service
metadata:
  name: tensorflow-gpu-service
  labels:
    app: tensorflow-gpu
spec:
  selector:
    app: tensorflow-gpu
  ports:
  - port: 8888
    protocol: TCP
    nodePort: 30061
  type: LoadBalancer
---

There are two important points to note about the deployment above:

  • The NVIDIA libraries on the host are exposed to the Kubernetes pod using the hostPath directive. Remember to update this path if your drivers are installed to a different location.
  • NVIDIA GPU resources used by the container are specified in the containers section using the special resource name alpha.kubernetes.io/nvidia-gpu.
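Note that the "384" in the hostPath entries is the NVIDIA driver major version, which depends on which driver the CUDA package installed in Step 2. A quick way to discover the correct directory is to list the driver directories on a node; this sketch simply falls back to a message on machines without the driver:

```shell
# List NVIDIA driver directories on the node; the suffix (e.g. 384) is the
# driver major version and should match the hostPath entries in the deployment.
ls -d /usr/lib/nvidia-* 2>/dev/null || echo "no NVIDIA driver directories found"
```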

TIP: If you'd like to build your own custom GPU-enabled TensorFlow image, start with the Bitnami Docker TensorFlow Serving image and then follow these instructions to add GPU support to it.

Deploy the GPU-enabled TensorFlow container using the command below:

$ kubectl create -f deployment.yaml

You should now be able to see the running pods with kubectl get pods.

Pod listing

Executing the nvidia-smi command within a pod should display the same output as running it directly on the cluster node.

GPU access within pod

This demonstrates that the pod is able to access the GPU.
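The in-pod nvidia-smi check above can be run with kubectl exec. This sketch looks up the pod by its app label (which the deployment above sets to tensorflow-gpu) and prints a message if no pod is found, for example when kubectl is not pointed at the cluster:

```shell
# Find the first pod created by the tensorflow-gpu deployment
pod=$(kubectl get pods -l app=tensorflow-gpu -o name 2>/dev/null | head -n 1)
if [ -n "$pod" ]; then
  # Run nvidia-smi inside the container (strip the "pod/" prefix from the name)
  kubectl exec "${pod#pod/}" -- nvidia-smi
else
  echo "no tensorflow-gpu pod found (is kubectl configured for this cluster?)"
fi
```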

As this example has illustrated, deploying an application with GPU support in a Kubernetes cluster is not as simple as a typical deployment: there are a few additional hoops to jump through. But once you've cleared them, you have a scalable, flexible solution for your GPU-accelerated workloads. And since GPU support in Kubernetes is under active development, expect things to get significantly easier in the future!

Want to reach the next level in Kubernetes?

Contact us for Kubernetes training.