One of the most interesting new features in Kubernetes is its support for Graphics Processing Units (GPUs). This makes it possible to obtain significant performance benefits when deploying certain types of applications (for example, machine learning applications like TensorFlow) in a Kubernetes cluster.
Support for scheduling GPU workloads in Kubernetes is still fairly new, so getting all the pieces working together usually requires a fair amount of trial and error. To ease this process, this blog post will walk you through setting up a Kubernetes GPU cluster using Google Container Engine (GKE) and then deploying a container image with GPU support to that cluster.
NOTE: This blog post assumes that you have the gcloud and kubectl command-line tools installed and configured for use with GKE. If you don't, check out our Kubernetes tutorial for a detailed walkthrough.
The first step is to launch a GPU-enabled Kubernetes cluster. Here are a few important things to remember:

- GPU support is currently only available in Kubernetes alpha clusters, so the cluster must be created with the --enable-kubernetes-alpha flag.
- The cluster must run in a zone where GPUs are available (this example uses us-east1-c).
- The nodes must use an image type on which the NVIDIA drivers can be installed (this example uses Ubuntu).
Here's an example command to start a GPU-enabled alpha cluster on GKE. This cluster will run in the us-east1-c zone and will be composed of three hosts, each running Ubuntu and with an NVIDIA K80 GPU:
$ gcloud alpha container clusters create my-gpu-cluster \
    --enable-cloud-logging --enable-cloud-monitoring \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --zone us-east1-c --machine-type n1-standard-2 \
    --enable-kubernetes-alpha --image-type UBUNTU \
    --num-nodes 3
Once the cluster has started, the next step is to log into each cluster node individually using SSH and install the NVIDIA CUDA libraries, which include the necessary NVIDIA GPU drivers. The Google Cloud Console offers browser-based SSH access to each node. Once logged in, run the commands below:
$ curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
$ sudo -s
$ dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
$ apt-get update && apt-get install cuda -y
Once the libraries have been installed on each host, check that the NVIDIA GPU has been detected with the nvidia-smi tool:

$ nvidia-smi
Next, configure each kubelet to use the NVIDIA GPU, as shown below:
$ NVIDIA_GPU_NAME=$(nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 | sed -e 's/ /-/g')
$ source /etc/default/kubelet
$ KUBELET_OPTS="$KUBELET_OPTS --node-labels='alpha.kubernetes.io/nvidia-gpu-name=$NVIDIA_GPU_NAME'"
$ echo "KUBELET_OPTS=$KUBELET_OPTS" > /etc/default/kubelet
$ systemctl restart kubelet.service
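To see what the sed pipeline above actually produces, here is a small standalone sketch. It simulates the nvidia-smi output with echo (the "Tesla K80" name is an assumption based on the GPU type used in this cluster), since it can be run on a machine without a GPU:

```shell
# Simulate the GPU name that nvidia-smi would report (assumed: "Tesla K80")
GPU_NAME_RAW="Tesla K80"

# Same transformation as above: replace spaces with hyphens so the
# name is usable as a Kubernetes label value
NVIDIA_GPU_NAME=$(echo "$GPU_NAME_RAW" | sed -e 's/ /-/g')

# Print the node label that the kubelet will advertise
echo "alpha.kubernetes.io/nvidia-gpu-name=$NVIDIA_GPU_NAME"
# prints: alpha.kubernetes.io/nvidia-gpu-name=Tesla-K80
```

This label is what allows GPU-aware workloads to be scheduled onto nodes with a specific GPU model.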
At this point, you're ready to deploy your GPU-enabled container. This example will use a TensorFlow image with GPU support, and the following Kubernetes deployment file (based on this example by Frederic Tausch):
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tensorflow-gpu
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: tensorflow-gpu
    spec:
      volumes:
        - hostPath:
            path: /usr/lib/nvidia-384/bin
          name: bin
        - hostPath:
            path: /usr/lib/nvidia-384
          name: lib
        - hostPath:
            path: /usr/lib/x86_64-linux-gnu/libcuda.so.1
          name: libcuda-so-1
        - hostPath:
            path: /usr/lib/x86_64-linux-gnu/libcuda.so
          name: libcuda-so
      containers:
        - name: tensorflow
          image: tensorflow/tensorflow:latest-gpu
          ports:
            - containerPort: 8888
          resources:
            limits:
              alpha.kubernetes.io/nvidia-gpu: 1
          volumeMounts:
            - mountPath: /usr/local/nvidia/bin
              name: bin
            - mountPath: /usr/local/nvidia/lib
              name: lib
            - mountPath: /usr/lib/x86_64-linux-gnu/libcuda.so.1
              name: libcuda-so-1
            - mountPath: /usr/lib/x86_64-linux-gnu/libcuda.so
              name: libcuda-so
---
apiVersion: v1
kind: Service
metadata:
  name: tensorflow-gpu-service
  labels:
    app: tensorflow-gpu
spec:
  selector:
    app: tensorflow-gpu
  ports:
    - port: 8888
      protocol: TCP
      nodePort: 30061
  type: LoadBalancer
---
There are two important points to note about the deployment above:

- The alpha.kubernetes.io/nvidia-gpu: 1 resource limit requests one NVIDIA GPU for the TensorFlow container. This is what tells the Kubernetes scheduler to place the pod on a GPU-enabled node.
- The hostPath volumes and corresponding volumeMounts expose the NVIDIA driver binaries and CUDA libraries installed on the host to the container, which needs them in order to access the GPU.
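Isolated from the rest of the manifest, the key stanza for requesting a GPU looks like this (a minimal fragment, with the resource name taken from the alpha GPU scheduling support used throughout this post):

```yaml
# Request one NVIDIA GPU for a container (alpha resource name)
resources:
  limits:
    alpha.kubernetes.io/nvidia-gpu: 1
```

You can add the same stanza to any container spec that needs GPU access, as long as the driver libraries are also mounted into the container as shown in the full deployment.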
TIP: If you'd like to build your own custom GPU-enabled TensorFlow image, start with the Bitnami Docker TensorFlow Serving image and then follow these instructions to add GPU support to it.
Deploy the GPU-enabled TensorFlow container using the command below:
$ kubectl create -f deployment.yaml
You should now be able to see the running pods with kubectl get pods.
To confirm that a pod is able to access the GPU, execute the nvidia-smi command inside it:

$ kubectl exec -it <pod-name> -- nvidia-smi

The output should match what you saw when running nvidia-smi directly on the cluster node.
As this example has illustrated, deploying an application with GPU support in a Kubernetes cluster is not quite as simple as a standard deployment: there are a few additional hoops to jump through. But once you've cleared them, you have a scalable, flexible solution for all your GPU-accelerated workloads. And remember that GPU support in Kubernetes is under active development, so expect things to get significantly easier in the future!
Want to reach the next level in Kubernetes?