This guide walks through going from a basic Ubuntu 20 install to a fully functioning Kubernetes cluster running Layar. It is intended as an example installation only, and some of its configuration decisions may not suit your environment.

Hardware Requirements

RAM: 256GB

CPU: 16+ cores

GPU: 4x NVIDIA V100 32GB, or a later generation

Disk: 1TB SSD

Prerequisites

  • The system must have an external DNS entry (not relying on /etc/hosts) with the corresponding IP assigned to the host.

  • The installation system must have internet access.

  • Swap must be disabled.

  • SELinux must be disabled or set to Permissive.
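
The swap requirement above can be checked before starting. The following is a minimal sketch, assuming a Linux host that exposes /proc/swaps; the function name swap_is_off is a hypothetical helper, not part of Layar or Kubernetes.

```shell
# swap_is_off [FILE]: succeed when no swap device is active.
# FILE is a /proc/swaps-style table (default: /proc/swaps). Its first line
# is a header; every further non-empty line is an active swap device.
swap_is_off() {
    awk 'NR > 1 && NF > 0 { active = 1 } END { exit active }' "${1:-/proc/swaps}"
}

# Example use (commented out so that sourcing this file has no side effects):
# swap_is_off || echo "Swap is still enabled" >&2
```

If the check fails, swapoff -a turns swap off until the next reboot; any swap entries in /etc/fstab would also need to be removed or commented out to keep it off.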

Installation

Run all commands as root or prefix them with sudo.

Ensure you have Docker 20.10 installed

docker version

Both the Client and Server sections of the output should show version 20.10.17 or later. If they do not, follow Docker's official instructions for installing Docker on Ubuntu before proceeding.
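
The version comparison can also be scripted. This is a sketch only, assuming GNU sort (for its -V version ordering) is available, as it is on a stock Ubuntu install; version_ge and docker_is_new_enough are hypothetical helper names.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Sorts the two versions with GNU sort -V and checks that B comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# docker_is_new_enough: compare the Docker client and server versions against
# the 20.10.17 minimum named in this guide. Requires a running Docker daemon.
docker_is_new_enough() {
    local min client server
    min=20.10.17
    client=$(docker version --format '{{.Client.Version}}') || return 1
    server=$(docker version --format '{{.Server.Version}}') || return 1
    version_ge "$client" "$min" && version_ge "$server" "$min"
}

# Example use:
# docker_is_new_enough || echo "Docker is older than 20.10.17" >&2
```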

Increase operating system vm.max_map_count

echo "vm.max_map_count=262144" > /etc/sysctl.d/99-vyasa.conf 
sysctl -p /etc/sysctl.d/99-vyasa.conf
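
To confirm the new value took effect, the running kernel setting can be read back. A minimal sketch follows; max_map_count_ok is a hypothetical helper, and the optional argument exists only so the comparison can be exercised without touching the system.

```shell
# max_map_count_ok [VALUE]: succeed when VALUE (default: the live kernel
# setting reported by sysctl) is at least the 262144 set in this guide.
max_map_count_ok() {
    local current
    current="${1:-$(sysctl -n vm.max_map_count)}"
    [ "$current" -ge 262144 ]
}

# Example use:
# max_map_count_ok || echo "vm.max_map_count is below 262144" >&2
```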

Configure NetworkManager to ignore Calico interfaces

If using NetworkManager, edit the file /etc/NetworkManager/conf.d/calico.conf and set the following:

[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:wireguard.cali

Restart NetworkManager: systemctl restart NetworkManager
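
For scripted installs, the same file can be written without an editor. This is a sketch under the assumption that overwriting any existing calico.conf is acceptable; write_calico_conf is a hypothetical helper, and its directory argument defaults to the path given above.

```shell
# write_calico_conf [DIR]: write calico.conf into DIR
# (default: /etc/NetworkManager/conf.d), replacing any existing file.
write_calico_conf() {
    local dir
    dir="${1:-/etc/NetworkManager/conf.d}"
    cat > "$dir/calico.conf" <<'EOF'
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:wireguard.cali
EOF
}

# Example use (as root), followed by the restart described above:
# write_calico_conf && systemctl restart NetworkManager
```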

Install CUDA 11

# Install kernel headers matching the running kernel
apt-get install linux-headers-$(uname -r)
# Build the repository name from the OS release, e.g. ubuntu2004
distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g')
# Pin the CUDA repository so its packages take priority
wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-$distribution.pin
mv cuda-$distribution.pin /etc/apt/preferences.d/cuda-repository-pin-600
# Trust the repository signing key, then add the repository
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/3bf863cc.pub
echo "deb https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64 /" | sudo tee /etc/apt/sources.list.d/cuda.list
apt-get update
apt-get install -y cuda-runtime-11-6

Install NVIDIA Docker toolkit

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
apt update
apt-get install -y nvidia-docker2

Set docker default runtime

Edit the file /etc/docker/daemon.json and replace its contents with:

{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

Restart docker

systemctl restart docker
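
After the restart, it may be worth confirming that Docker picked up the new default runtime. The sketch below assumes the Docker daemon is running; runtime_is_nvidia is a hypothetical helper, and its optional argument exists only so the comparison can be tested without Docker.

```shell
# runtime_is_nvidia [NAME]: succeed when NAME (default: the default runtime
# reported by "docker info") is "nvidia".
runtime_is_nvidia() {
    local runtime
    runtime="${1:-$(docker info --format '{{.DefaultRuntime}}')}"
    [ "$runtime" = "nvidia" ]
}

# Example use:
# runtime_is_nvidia || echo "Docker default runtime is not nvidia" >&2
```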

Install Kubernetes

curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" >> /etc/apt/sources.list && apt update
apt install -y kubeadm=1.21.10-00 kubelet=1.21.10-00 kubectl=1.21.10-00
/usr/bin/kubeadm init --kubernetes-version=1.21.10 --token-ttl 0 --pod-network-cidr=10.17.0.0/16 -v 5
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install NGINX Ingress controller

kubectl apply -f https://vyasa-static-assets.s3.amazonaws.com/layar/nginx-ingress.yaml

Install Calico CNI

kubectl create -f https://vyasa-static-assets.s3.amazonaws.com/layar/tigera-operator.yaml
kubectl create -f https://vyasa-static-assets.s3.amazonaws.com/layar/custom-resources.yaml
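
The node typically reports Ready only once the CNI pods are running, which can take a few minutes. One way to wait for that in a script is a small retry loop around kubectl wait. This is a sketch: retry_until is a hypothetical helper, and the attempt count and delays shown are arbitrary choices.

```shell
# retry_until ATTEMPTS DELAY CMD...: run CMD until it succeeds, trying at most
# ATTEMPTS times and sleeping DELAY seconds between tries.
retry_until() {
    local attempts delay i
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example use: poll for up to roughly ten minutes until every node is Ready.
# retry_until 30 10 kubectl wait --for=condition=Ready nodes --all --timeout=10s
```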

Allow the head node to run pods

kubectl taint nodes --all node-role.kubernetes.io/master:NoSchedule-

Install local volume provisioner

/usr/bin/kubectl apply -f https://vyasa-static-assets.s3.amazonaws.com/layar/local-storage-provisioner.yml

Set default storage class

kubectl patch storageclass 'local-path' -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Install Helm

wget -P /usr/local/bin/ https://vyasa-static-assets.s3.amazonaws.com/layar/helm
chmod +x /usr/local/bin/helm

Add the Layar helm repository

helm repo add vyasa https://helm.vyasa.com/charts/ --username vyasahelm --password "sail#away()"
helm repo update

Install Layar

Note: Replace MY_URL with the IP address or DNS name of the system. Set MY_GPU_COUNT to the number of GPUs available to Layar.

helm install layar vyasa/layar --set APPURL=MY_URL --set TRITON_GPU_COUNT=MY_GPU_COUNT
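
When scripting this step, the two placeholders can be validated before Helm is invoked. The sketch below only prints the command so it can be reviewed first; layar_install_cmd is a hypothetical helper, and layar.example.com is a placeholder host name, not a real endpoint.

```shell
# layar_install_cmd URL GPU_COUNT: print the helm install command from this
# guide with both placeholders filled in. Fails if URL is empty or GPU_COUNT
# is not a non-negative integer.
layar_install_cmd() {
    local url count
    url=$1
    count=$2
    if [ -z "$url" ]; then
        echo "URL must not be empty" >&2
        return 1
    fi
    case "$count" in
        ''|*[!0-9]*)
            echo "GPU count must be a non-negative integer" >&2
            return 1
            ;;
    esac
    printf 'helm install layar vyasa/layar --set APPURL=%s --set TRITON_GPU_COUNT=%s\n' "$url" "$count"
}

# Example use, counting the GPUs visible to the NVIDIA driver:
# layar_install_cmd layar.example.com "$(nvidia-smi --list-gpus | wc -l)"
```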
