Dockerized JupyterHub Deep Learning Notebooks with GPU Access: TensorFlow 1.3, cuDNN 6.0, CUDA 8.0

When teaching or doing deep learning research with students, configuring a server or workstation is a recurring hassle, given how many dependencies and environments the modern machine learning / deep learning scene involves.  So…how can we give students access to workstations with multiple GPUs to run their deep learning code, without giving them full root/user access and without making them go through the configuration hassles themselves?

The solution I have found is to use JupyterHub, a Jupyter Notebook server that scales to classrooms of users, with the ability to spawn preconfigured Docker containers for each user.  This saves on configuration woes and also sandboxes users, while having minimal overhead compared to virtual machines.  Students/Users simply use their web browser and go to the domain which connects to your workstation, and can immediately start doing work in Python and their favorite machine learning/deep learning libraries.

JupyterHub is still early-ish software, and though it has been used in many production environments, many snags and support issues remain.  In particular, getting NVIDIA GPU access is not straightforward and there are no up-to-date tutorials.  Here I will show you how to get this to work on Ubuntu 16.04 for the workstation, with a Docker container for each user/student based on Ubuntu 16.04, CUDA 8.0, cuDNN 6.0, and TensorFlow 1.3.  You can also edit the Dockerfile I will link to for other builds (e.g., PyTorch, Keras, etc.).

I will break this up into 3 (now 4) parts:

  1. Installing and configuring JupyterHub server to spawn Docker containers with authentication through GitHub
  2. Building a custom Docker image that supports GPUs and deep learning libraries
  3. Tweaking the configuration so that spawned docker containers can communicate with the host server’s GPU(s)
  4. Placing custom CPU/RAM/GPU access restrictions on a per-user basis.

This assumes you have already configured your workstation/server to run your deep learning libraries as the root/main user.  In other words, try running,

nvidia-smi

and see if you get output describing the NVIDIA driver version, as well as the hardware you have in your workstation/server.  If that does not work, you must first install all the necessary drivers.  There are many other tutorials describing this.
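If you would rather script this check (for example, before launching the hub from a startup script), the following is a minimal Python sketch.  The function name gpu_driver_visible is my own invention for illustration; it only tests that the command exists and exits cleanly, not that your deep learning libraries work.

```python
import shutil
import subprocess


def gpu_driver_visible(command="nvidia-smi"):
    """Return True if `command` is on PATH and exits with status 0."""
    if shutil.which(command) is None:
        # The binary is not installed or not on PATH.
        return False
    try:
        result = subprocess.run(
            [command],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("NVIDIA driver visible:", gpu_driver_visible())
```

If this prints False, install the drivers first before continuing.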

1. Installing and configuring JupyterHub server to spawn Docker containers with authentication through GitHub

JupyterHub requires Python 3, not 2.7, so we will install it and the other packages with pip3.  First, change to an appropriate directory, e.g., ~/jupyterhub

sudo apt-get install npm nodejs-legacy
npm install -g configurable-http-proxy
pip3 install jupyterhub

configurable-http-proxy is the proxy that sits in front of JupyterHub.  When a user connects to our workstation, the proxy reroutes the request from the external IP address to the internal IP address of the JupyterHub server.

You will now need to create the configuration file, which the JupyterHub server reads at every start.

jupyterhub --generate-config

This creates a Python file called jupyterhub_config.py, which you will be editing extensively.  Right now it contains only the default settings, all commented out.
Importantly, we will focus on using GitHub OAuth for user authentication and DockerSpawner to launch new Docker containers.

Give it a launch without SSL just to see if it works,

sudo jupyterhub --no-ssl

If that does not work, you may need to explicitly write out the path to where your jupyterhub executable is installed.  Figure this out with,

which jupyterhub

Running the server will create a sqlite database that stores authenticated users and session info, as well as a jupyterhub_cookie_secret.  To reuse the same cookie secret rather than generating a new one on every start, copy the current one to a hidden file; we will point the configuration at it later.

cp jupyterhub_cookie_secret .jupyterhub_cookie_secret
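If you prefer to generate the secret yourself rather than copy the one JupyterHub created, here is a small sketch.  It assumes the format described in recent JupyterHub documentation (32 random bytes, hex-encoded); check the docs for your version before relying on it.  The function name write_cookie_secret is hypothetical.

```python
import os
import secrets


def write_cookie_secret(path):
    """Write 32 random bytes, hex-encoded, to `path` with owner-only access."""
    secret = secrets.token_hex(32)  # 64 hexadecimal characters
    # Create the file with mode 0o600 so other users cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret)
    # Enforce the mode even if the file already existed with looser permissions.
    os.chmod(path, 0o600)
    return path
```

For example, write_cookie_secret('.jupyterhub_cookie_secret') would create the hidden file in the current directory.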

 

Getting an SSL Certificate for secure access on port 443

We first need SSL so that our entry point is served securely.  If you have a domain name that resolves to your server (a static IP address makes this much easier), you can use Let’s Encrypt’s certbot to create an SSL certificate.  If not, you can create a self-signed certificate, but users will then always be warned about an untrusted certificate that they must “accept and trust.”  I will leave those details out because you should use a trusted certificate authority (CA).  Now put the following in your JupyterHub configuration file.

import os
c = get_config()
c.JupyterHub.ip = '<your_domain or external IP address>'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/<your_domain>/fullchain.pem'
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/<your_domain>/privkey.pem'
c.JupyterHub.cookie_secret_file = '/<jupyterhub_root>/.jupyterhub_cookie_secret'
c.JupyterHub.port = 443
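A wrong certificate path is an easy mistake to make here, so it can help to check the files before starting the hub.  The helper below is a sketch of my own (missing_files is not part of JupyterHub); it simply reports which of the given paths are missing or unreadable.

```python
import os


def missing_files(paths):
    """Return the entries of `paths` that are not readable regular files."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]


# Possible use inside jupyterhub_config.py (paths are the placeholders from above):
# problems = missing_files([
#     '/etc/letsencrypt/live/<your_domain>/fullchain.pem',
#     '/etc/letsencrypt/live/<your_domain>/privkey.pem',
# ])
# if problems:
#     raise FileNotFoundError('Cannot read: {}'.format(problems))
```

Note that the Let’s Encrypt directory is usually readable only by root, so run the check as the same user that launches the hub.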

Getting GitHub authentication to work

We don’t want random users ramping up Docker containers on our workstation, so let’s authenticate them.  JupyterHub has support for multiple authenticators, including Kerberos, which is used at many universities.  We will be using GitHub since many students already have an account.  Note that this does NOT work with GitHub Enterprise, e.g., github.mit.edu.  You must use your personal GitHub account under the github.com domain.  Just saved you a day figuring this out 🙂

Go to your personal GitHub account, and under “Settings -> Developer Settings” create a new “OAuth App.”  Give it a name that your users will recognize, and voila, you have an authentication scheme through GitHub.  The important pieces here are the “Client ID” and “Client Secret”.

You will put these in your JupyterHub configuration file as shown below.  The GitHub authenticator lives in the separate oauthenticator package, so install it first with pip3 install oauthenticator if you have not already.

c.JupyterHub.authenticator_class = 'oauthenticator.GitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = 'https://<your_domain>/hub/oauth_callback'
c.GitHubOAuthenticator.client_id = '<your_client_id>'
c.GitHubOAuthenticator.client_secret = '<your_client_secret>'

We will now “whitelist” users as a second filter for authentication, as well as designate JupyterHub admins.  The user names here should be their GitHub account names.

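# Note: create_system_users is an option of the LocalAuthenticator family, so it
# likely has no effect with the plain GitHubOAuthenticator configured above.
c.GitHubAuthenticator.create_system_users = True
c.Authenticator.whitelist = {'aburnap', '<user_1>', '<user_2>'} # TODO: Add students here
c.Authenticator.admin_users = {'aburnap'}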
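With a whole class of students, editing the set by hand gets tedious.  One option is to keep the GitHub usernames in a text file, one per line, and load it from the configuration file.  This is only a sketch: the helper load_usernames and the file name in the usage comment are my own assumptions, not part of JupyterHub.

```python
def load_usernames(path):
    """Read one GitHub username per line, skipping blanks and '#' comments."""
    names = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Drop anything after a '#' and surrounding whitespace.
            name = line.split("#", 1)[0].strip()
            if name:
                # Lowercased here on the assumption that the hub compares
                # usernames in normalized (lowercase) form.
                names.add(name.lower())
    return names


# Possible use inside jupyterhub_config.py (hypothetical file name):
# c.Authenticator.whitelist = load_usernames('/<jupyterhub_root>/students.txt')
```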

 

2. Building a custom Docker image that supports GPUs and deep learning libraries

If you are not familiar with Docker, it is basically a lightweight container system that sits somewhere between native access and a virtual machine.  Whereas a workstation could only support a few virtual machines depending on hardware resources, one can spin up hundreds of Docker containers since resources can be shared.  Docker is also very useful from this standpoint because you build images that ensure configurations are done consistently.  Ramping up a container from an image also takes much less time than starting a virtual machine.

We will first build a Docker image that has all the bells and whistles for our setup.  Much like Make has the Makefile, Docker has the Dockerfile: a set of instructions for creating a new Docker image.  Dockerfiles generally inherit from other Docker images to save a lot of work.  We will be modifying Dockerfiles from jupyter/docker-stacks.  Instead of using their images, though, we will build on an NVIDIA base image, nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04.

To save you time, I have forked the jupyter/docker-stacks and include an extra Dockerfile to build the appropriate image here.  Grab this using,

git clone https://github.com/aburnap/docker-stacks

Now, build the base image, which we will call ‘base-notebook-gpu’.  First cd into the base image directory,

cd docker-stacks/base-notebook

then build the image and tag it ‘base-notebook-gpu’ using

sudo docker build -t base-notebook-gpu . --no-cache

This will download many sources and may take a while.  Upon successful build of this image, check that it exists,

sudo docker image ls

This base notebook does a few things.  First, it creates a non-root user called “UROP_student” and makes this user the default upon login.  It also downloads and installs cuDNN 6.0 and copies the appropriate .so files to /usr/local/cuda-8.0.  This differs from the prebuilt setup you would get with apt-get install libcudnn6.  We will need this for the way we ramp up Docker containers.

Now, move to the tensorflow-notebook-gpu directory and build this image, which inherits from the base notebook image.

sudo docker build -t tensorflow-notebook-gpu . --no-cache

Again, check that this image exists with,

sudo docker image ls

Now that we have Docker images from which to spawn containers for authenticated users, we need to install dockerspawner.

pip3 install dockerspawner

We now need to add a few things to the jupyterhub_config.py file,

from dockerspawner import DockerSpawner
c.JupyterHub.spawner_class = DockerSpawner
c.DockerSpawner.image = 'tensorflow-notebook-gpu'

3. Tweaking the configuration so that spawned docker containers can communicate with the host server’s GPU(s)

We have come pretty far, but a few custom tweaks are still needed to get the Docker containers communicating correctly with the GPUs on the host server.

The first thing we need to do is allow newly spun up Docker containers to access the JupyterHub server IP.  This is done with a handy tool called netifaces, as shown in the YouTube video tutorial for JupyterHub.

pip3 install netifaces

Now add the following lines to your jupyterhub_config.py,

import netifaces
docker0 = netifaces.ifaddresses('docker0')
docker0_ipv4 = docker0[netifaces.AF_INET][0]
c.JupyterHub.hub_ip = docker0_ipv4['addr']
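If the docker0 interface does not exist yet (for instance, because the Docker daemon has not started), the lookup above raises an exception and the hub will not start.  A more forgiving variant might look like the sketch below.  The function name docker_bridge_ip is hypothetical, and the fallback of 172.17.0.1 is an assumption: it is Docker’s usual default bridge address, but your installation may be configured differently.

```python
def docker_bridge_ip(interface="docker0", default="172.17.0.1"):
    """Return the IPv4 address of `interface`, or `default` if it cannot be found."""
    try:
        import netifaces
        addresses = netifaces.ifaddresses(interface)
        return addresses[netifaces.AF_INET][0]["addr"]
    except (ImportError, ValueError, KeyError, IndexError):
        # netifaces is missing, the interface is absent, or it has no IPv4 address.
        return default


# Possible use inside jupyterhub_config.py:
# c.JupyterHub.hub_ip = docker_bridge_ip()
```

The trade-off is that a silent fallback can hide a misconfigured Docker network, so you may prefer the stricter original while debugging.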

Now about those GPUs…how do we access them?  Well, first off the image you created called tensorflow-notebook-gpu has a few commands that added some of the appropriate lines to the environment and path.  We still need some more commands for the JupyterHub server itself.  Specifically, where should DockerSpawner mount the server’s GPUs?  We will use the wrapper NVIDIA created for this mount by first installing nvidia-docker using,

sudo apt-get install nvidia-docker

Now with JupyterHub running (i.e., sudo jupyterhub), query the server’s API route using (credit goes to Andrea Zonca for this tip),

curl -s localhost:3476/docker/cli

You will get output telling you exactly what NVIDIA drivers and devices are being used.  Notice that the nvidia-docker volume driver is being used for this access…thanks NVIDIA.  We now need to tell our JupyterHub server to apply these settings explicitly by adding the following to your jupyterhub_config.py.

c.DockerSpawner.read_only_volumes = {"nvidia_driver_384.90":"/usr/local/nvidia"}
c.DockerSpawner.extra_create_kwargs = {"volume_driver":"nvidia-docker"}
c.DockerSpawner.extra_host_config = { "devices":["/dev/nvidiactl","/dev/nvidia-uvm","/dev/nvidia-uvm-tools","/dev/nvidia0","/dev/nvidia1"] }

Your exact values will likely be different depending on your NVIDIA graphics driver and the particulars of your workstation/server.  I only wanted to give access to two NVIDIA GPUs for these students, hence just adding /dev/nvidia0 and /dev/nvidia1 under the devices.
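To avoid copying these values by hand, you could parse the text returned by the curl command.  The sketch below assumes the output is a single line of docker run style flags (--volume-driver=, --volume=, --device=), which is what I would expect from nvidia-docker but have not verified for every version; the sample string is illustrative only, and the function name is my own.

```python
import shlex


def parse_nvidia_docker_cli(text):
    """Split docker-run style flags into values usable in the spawner settings."""
    settings = {"volume_driver": None, "read_only_volumes": {}, "devices": []}
    for token in shlex.split(text):
        flag, _, value = token.partition("=")
        if flag == "--volume-driver":
            settings["volume_driver"] = value
        elif flag == "--volume":
            # Expected form: <volume_name>:<mount_point>[:ro]
            parts = value.split(":")
            if len(parts) >= 2:
                settings["read_only_volumes"][parts[0]] = parts[1]
        elif flag == "--device":
            settings["devices"].append(value)
    return settings


# Illustrative output only; your driver version and device list will differ.
sample = (
    "--volume-driver=nvidia-docker "
    "--volume=nvidia_driver_384.90:/usr/local/nvidia:ro "
    "--device=/dev/nvidiactl --device=/dev/nvidia-uvm "
    "--device=/dev/nvidia-uvm-tools --device=/dev/nvidia0 --device=/dev/nvidia1"
)
parsed = parse_nvidia_docker_cli(sample)
print(parsed)
```

The resulting dictionary maps directly onto read_only_volumes, the volume_driver entry of extra_create_kwargs, and the devices entry of extra_host_config.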

Lastly, there will be some missing paths that TensorFlow needs to locate CUDA and cuDNN.  Add the following to your jupyterhub_config.py,

os.environ['LD_LIBRARY_PATH'] = '/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/lib64/libcudart.so.8.0'
# env_keep lists the hub environment variables that spawned servers inherit.
c.Spawner.env_keep.append('LD_LIBRARY_PATH')
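Once a user’s container is running, a quick way to confirm that everything is wired up is to ask TensorFlow which devices it can see from inside a notebook.  This sketch uses device_lib.list_local_devices(), which exists in TensorFlow 1.x; the wrapper function visible_gpus is my own and returns an empty list if TensorFlow is not installed.

```python
def visible_gpus():
    """Return the names of GPU devices TensorFlow can see ([] without TensorFlow)."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    # Each entry describes one local device; keep only the GPUs.
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


print(visible_gpus())
```

An empty list inside the container usually points to a missing device mapping or a wrong LD_LIBRARY_PATH.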

 

4. Placing custom CPU/RAM/GPU access restrictions on a per-user basis.

This is a new section as of May 9, 2018 (since I’ve gotten this question privately several times and the comment below).

Q: How do we take our multi-GPU machine and allow only 1 (or 2 or more) GPU per user, or maybe an upper limit on RAM and CPU usage?

A: One way is to subclass DockerSpawner and override its start method.

In other words, the spawner configured in the jupyterhub_config.py file will act differently for each user.  Here’s what that code might look like in the jupyterhub_config.py file.

 

from dockerspawner import DockerSpawner

class CustomDockerSpawner(DockerSpawner):
    def start(self):

        #----- Global Class Config --------
        username = self.user.name
        self.notebook_dir = '<YOUR_BASE_DIR>'
        self.image = '<YOUR_DOCKER_IMAGE>'
        self.remove_containers = True
        #----- Global Class Config --------

        # Shared Data Persistence (placeholder: host path -> container path pairs)
        self.volumes = { <DATA_VOLUMES_COMMON_TO_ALL_USERS> }
        # GPU Access
        self.read_only_volumes = {"nvidia_driver_384.90":"/usr/local/nvidia"}
        self.extra_create_kwargs = {"volume_driver":"nvidia-docker"}

        # Individual Data Persistence
        if username == '<USER_1>':

            self.volumes.update({'<DATA_VOLUMES_UNIQUE_TO_USER_1>': '<LOCATION_ON_DOCKER_IMAGE_FOR_DATA_VOLUMES_UNIQUE_TO_USER_1>'})
            # Give USER 1 their own GPU, 64GB of memory, and 12 cpus
            self.extra_host_config = { "devices":["/dev/nvidiactl","/dev/nvidia-uvm","/dev/nvidia-uvm-tools","/dev/nvidia0"] }
            self.mem_limit = '64G'
            self.cpu_limit = 12

        elif username == '<USER_2>':

            self.volumes.update({'<DATA_VOLUMES_UNIQUE_TO_USER_2>': '<LOCATION_ON_DOCKER_IMAGE_FOR_DATA_VOLUMES_UNIQUE_TO_USER_2>'})
            # Give USER 2 their own GPU, 32GB of memory, and 6 cpus
            self.extra_host_config = { "devices":["/dev/nvidiactl","/dev/nvidia-uvm","/dev/nvidia-uvm-tools","/dev/nvidia1"] }
            self.mem_limit = '32G'
            self.cpu_limit = 6

        return super().start()

c.JupyterHub.spawner_class = CustomDockerSpawner
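As the number of users grows, the if/elif chain gets long.  One alternative is to keep the per-user limits in a table and look them up by username.  The sketch below is only an illustration under my own assumptions: the usernames, the default limits, and the helper names (nvidia_devices, spawner_settings) are all hypothetical.

```python
def nvidia_devices(gpu_indices):
    """Device nodes a container needs in order to use the given GPU indices."""
    if not gpu_indices:
        return []  # No GPU access at all.
    control = ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia-uvm-tools"]
    return control + ["/dev/nvidia{}".format(i) for i in gpu_indices]


# Hypothetical usernames and limits, for illustration only.
USER_LIMITS = {
    "user_1": {"gpus": [0], "mem_limit": "64G", "cpu_limit": 12},
    "user_2": {"gpus": [1], "mem_limit": "32G", "cpu_limit": 6},
}
# Applied to anyone not listed above (also an assumption).
DEFAULT_LIMITS = {"gpus": [], "mem_limit": "8G", "cpu_limit": 2}


def spawner_settings(username, table=USER_LIMITS, default=DEFAULT_LIMITS):
    """Return the spawner attributes to set for `username`."""
    limits = table.get(username, default)
    return {
        "extra_host_config": {"devices": nvidia_devices(limits["gpus"])},
        "mem_limit": limits["mem_limit"],
        "cpu_limit": limits["cpu_limit"],
    }


print(spawner_settings("user_1"))
```

Inside start(), one could then apply the result with a loop such as: for name, value in spawner_settings(self.user.name).items(): setattr(self, name, value).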

 

 

That’s it!  Run sudo jupyterhub, and your users will be able to access the domain you configured and log in through GitHub authentication.

Happy coding!