Dockerized Jupyterhub Deep Learning Notebooks with GPU Access: Tensorflow 1.3, cuDNN 6.0, CUDA 8.0

For deep learning teaching/research with students, server/workstation configuration often creates hassles, given the many dependencies and environments in the modern machine learning / deep learning scene.  So how can we give students access to workstations with multiple GPUs to run their deep learning code, without giving them full root/user access or making them go through the configuration hassles themselves?

The solution I have found is to use JupyterHub, a Jupyter Notebook server that scales to classrooms of users, with the ability to spawn preconfigured Docker containers for each user.  This saves on configuration woes and also sandboxes users, while having minimal overhead compared to virtual machines.  Students/Users simply use their web browser and go to the domain which connects to your workstation, and can immediately start doing work in Python and their favorite machine learning/deep learning libraries.

JupyterHub is still early-ish software, and though it has been used in many production environments, many snags and support issues still abound.  In particular, getting NVIDIA GPU access is not straightforward and there are no up-to-date tutorials.  Here I will show you how to get this to work on Ubuntu 16.04 for the workstation, with a Docker container for each user/student based on Ubuntu 16.04, CUDA 8.0, cuDNN 6.0, and Tensorflow 1.3.  You can also edit the Dockerfile I will link to for other builds (e.g., PyTorch, Keras, etc.).

I will break this up into 3 (now 4) parts:

  1. Installing and configuring JupyterHub server to spawn Docker containers with authentication through Github
  2. Building a custom Docker image that supports GPUs and deep learning libraries
  3. Tweaking the configuration so that spawned docker containers can communicate with the host server’s GPU(s)
  4. Placing custom CPU/RAM/GPU access restrictions on a per-user basis.

This assumes you have already configured your workstation/server to run your deep learning libraries as the root/main user.  In other words, try running,

nvidia-smi

and see if you get output describing the NVIDIA driver version, as well as the hardware you have in your workstation/server.  If that does not work, you must first install all the necessary drivers.  There are many other tutorials describing this.
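If you would rather script this check (say, across several machines), here is a minimal Python sketch; nothing in it is specific to JupyterHub, and it only reports whether the driver tooling responds:

```python
import shutil
import subprocess

def nvidia_driver_ok():
    """Return True if nvidia-smi is on the PATH and exits cleanly, else False."""
    if shutil.which('nvidia-smi') is None:
        return False
    result = subprocess.run(['nvidia-smi'],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0

print(nvidia_driver_ok())
```

If this prints False, install the NVIDIA drivers before continuing.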

1. Installing and configuring JupyterHub server to spawn Docker containers with authentication through Github

JupyterHub uses Python 3, not 2.7.  To install jupyterhub and other packages, we will use pip3.  First, change to an appropriate directory, e.g., ~/jupyterhub

sudo apt-get install npm nodejs-legacy
npm install -g configurable-http-proxy
pip3 install jupyterhub

configurable-http-proxy is a program that acts as a local proxy server.  When a user connects to our workstation, they are internally rerouted from the external IP address to the internal IP address of the JupyterHub server.

You will now need to create the configuration file which the JupyterHub server will read every start.

jupyterhub --generate-config

This will create a Python file called jupyterhub_config.py, which you will be editing extensively.  Right now it is all default settings with everything commented out.
Importantly, we will be focusing on using Github OAuth for user authentication and DockerSpawner to launch new Docker containers.

Give it a launch without SSL just to see if it works,

sudo jupyterhub --no-ssl

If that does not work, you may need to explicitly write out the path to where your jupyterhub executable is installed.  Figure this out with,

which jupyterhub

Running the server will create a sqlite database that stores authenticated users and session info, as well as a jupyterhub_cookie_secret.  To keep using the same cookie secret so you do not create a new one every time, just hide the current one and we will link to it later.

cp jupyterhub_cookie_secret .jupyterhub_cookie_secret


Getting an SSL Certificate for secure access on port 443

We will first need SSL for secure server access for our entry point.  If you have a static IP address, you can use Let's Encrypt's certbot to create an SSL certificate.  If not, you can create your own self-signed certificate, but just know that users will always be prompted with an unauthorized certificate that they must "accept and trust."  I will leave those details out because you should use a trusted certificate authority (CA).  Now put the following in your JupyterHub configuration file.

import os
c = get_config()
c.JupyterHub.ip = '<your_domain or external IP address>'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/<your_domain>/fullchain.pem'
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/<your_domain>/privkey.pem'
c.JupyterHub.cookie_secret_file = '/<jupyterhub_root>/.jupyterhub_cookie_secret'
c.JupyterHub.port = 443

Getting Github authentication to work

We don’t want random users ramping up Docker containers on our workstation, so let’s authenticate them.  JupyterHub has support for multiple authenticators, including Kerberos which is used at many universities.  We will be using Github since many students already have one.  Note that this does NOT work with Github Enterprise, e.g., github.mit.edu.  You must use your personal Github account under the github.com domain.  Just saved you a day figuring this out 🙂

Go to your personal Github, and under “Settings -> Developer Settings” create a new “OAuth App.”  Make a nice name that your users will recognize, and voila, you have an authentication scheme through Github.  Important here is the “Client ID” and “Client Secret”.

You will put these in your JupyterHub configuration file as so:

c.JupyterHub.authenticator_class = 'oauthenticator.GitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = 'https://<your_domain>/hub/oauth_callback'
c.GitHubOAuthenticator.client_id = '<your_client_id>'
c.GitHubOAuthenticator.client_secret = '<your_client_secret>'

We will now “whitelist” users as a second filter for authentication, as well as designate JupyterHub admins.  The user names here should be their Github account names.

c.GitHubOAuthenticator.create_system_users = True
c.Authenticator.whitelist = {'aburnap', '<user_1>', '<user_2>'} # TODO: Add students here
c.Authenticator.admin_users = {'aburnap'}


2. Building a custom Docker image that supports GPUs and deep learning libraries

If you are not familiar with Docker, it is basically a lightweight container system that sits somewhere between native access and a virtual machine.  Whereas a workstation could only support a few virtual machines depending on hardware resources, one can spin up hundreds of Docker containers since resources can be shared.  Docker is also very useful from this standpoint because you build images that ensure configurations are done consistently.  Ramping up a container from an image takes much less time than a virtual machine as well.

We will first build a Docker image that has all the bells and whistles for our setup.  Much like a Makefile, Docker has a set of instructions for creating a new Docker image called a Dockerfile.  Dockerfiles generally inherit from other Docker images to save much work.  We will be modifying Dockerfiles from the jupyter/docker-stacks. Instead of using their images though, we will build off a NVIDIA base image of nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04.

To save you time, I have forked the jupyter/docker-stacks and include an extra Dockerfile to build the appropriate image here.  Grab this using,

git clone https://github.com/aburnap/docker-stacks

Now, first build the base image which we will call ‘base-notebook-gpu’ by cd’ing into the base image dir,

cd docker-stacks/base-notebook

build this image and call it ‘base-notebook-gpu’ using

sudo docker build -t base-notebook-gpu . --no-cache

This will download many sources and may take a while.  Upon successful build of this image, check that it exists,

sudo docker image ls

This base notebook does a few things.  First, it creates a non-root user called "UROP_student" and makes this user the default upon login.  It also downloads and installs cuDNN 6.0 and copies the appropriate .so files to /usr/local/cuda-8.0.  This differs from the prebuilt version you would get with apt-get install libcudnn6.  We will need this for the way we ramp up Docker containers.
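For orientation, those steps look roughly like the following Dockerfile fragment.  This is an illustrative sketch only (the tarball name and paths are assumptions); the actual Dockerfile is in the forked docker-stacks repo above.

```dockerfile
FROM nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04

# Non-root default user for spawned notebook containers
RUN useradd -m -s /bin/bash UROP_student

# Place the cuDNN 6.0 libraries alongside the CUDA 8.0 install
# (tarball name below is illustrative)
COPY cudnn-8.0-linux-x64-v6.0.tgz /tmp/
RUN tar -xzf /tmp/cudnn-8.0-linux-x64-v6.0.tgz -C /tmp && \
    cp -P /tmp/cuda/lib64/libcudnn* /usr/local/cuda-8.0/lib64/ && \
    cp /tmp/cuda/include/cudnn.h /usr/local/cuda-8.0/include/

USER UROP_student
```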

Now, move to the tensorflow-notebook-gpu directory and build this image, which inherits from the base notebook image.

sudo docker build -t tensorflow-notebook-gpu . --no-cache

Again, check that this image exists with,

sudo docker image ls

Now that we have Docker images to spawn containers for authenticated users entering our server, we first need to install dockerspawner.

pip3 install dockerspawner

We now need to add a few things to the jupyterhub_config.py file,

from dockerspawner import DockerSpawner
c.JupyterHub.spawner_class = DockerSpawner
c.DockerSpawner.image = 'tensorflow-notebook-gpu'

3. Tweaking the configuration so that spawned docker containers can communicate with the host server’s GPU(s)

We have come pretty far, but now come a few custom tweaks needed to get the Docker containers communicating correctly with the GPUs on the host server.

The first thing we need to do is allow newly spun-up Docker containers to access the JupyterHub server IP.  This is done with a handy tool called netifaces, as given in the YouTube video tutorial for JupyterHub.

pip3 install netifaces

Now add the following lines to your jupyterhub_config.py,

import netifaces
docker0 = netifaces.ifaddresses('docker0')
docker0_ipv4 = docker0[netifaces.AF_INET][0]
c.JupyterHub.hub_ip = docker0_ipv4['addr']

Now about those GPUs…how do we access them?  Well, first off, the image you created called tensorflow-notebook-gpu already added some of the appropriate lines to the environment and path.  We still need some more commands for the JupyterHub server itself.  Specifically, where should DockerSpawner mount the server's GPUs?  We will use the wrapper NVIDIA created for this mount by first installing nvidia-docker using,

sudo apt-get install nvidia-docker

Now with JupyterHub running (i.e., sudo jupyterhub), query the server's API route using (credit goes to Andrea Zonca for this tip),

curl -s localhost:3476/docker/cli

You will get output telling you exactly what NVIDIA drivers and devices are being used.  Notice that the nvidia-docker volume driver is being used for this access…thanks NVIDIA.  We now need to tell our JupyterHub server to run these commands explicitly by adding the following to your jupyterhub_config.py.

c.DockerSpawner.read_only_volumes = {"nvidia_driver_384.90": "/usr/local/nvidia"}
c.DockerSpawner.extra_create_kwargs = {"volume_driver": "nvidia-docker"}
c.DockerSpawner.extra_host_config = { "devices": ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia-uvm-tools", "/dev/nvidia0", "/dev/nvidia1"] }

Your exact values will likely be different depending on your NVIDIA graphics driver and the particulars of your workstation/server.  I only wanted to give access to two NVIDIA GPUs for these students, hence just adding /dev/nvidia0 and /dev/nvidia1 under the devices.
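If you would rather not hard-code those values, the curl output can be parsed into the three settings programmatically.  A sketch (the sample string below is illustrative; feed it your actual curl output):

```python
def parse_nvidia_docker_cli(cli_output):
    """Split `curl -s localhost:3476/docker/cli` output into DockerSpawner settings."""
    volumes, devices, volume_driver = {}, [], None
    for token in cli_output.split():
        if token.startswith('--volume-driver='):
            volume_driver = token.split('=', 1)[1]
        elif token.startswith('--volume='):
            name, mountpoint, _mode = token.split('=', 1)[1].split(':')
            volumes[name] = mountpoint
        elif token.startswith('--device='):
            devices.append(token.split('=', 1)[1])
    return volumes, volume_driver, devices

# Illustrative sample; yours will reflect your driver version and GPUs.
sample = ('--volume-driver=nvidia-docker '
          '--volume=nvidia_driver_384.90:/usr/local/nvidia:ro '
          '--device=/dev/nvidiactl --device=/dev/nvidia-uvm '
          '--device=/dev/nvidia-uvm-tools '
          '--device=/dev/nvidia0 --device=/dev/nvidia1')

volumes, driver, devices = parse_nvidia_docker_cli(sample)
# c.DockerSpawner.read_only_volumes = volumes
# c.DockerSpawner.extra_create_kwargs = {'volume_driver': driver}
# c.DockerSpawner.extra_host_config = {'devices': devices}
```

Trim the devices list before assigning it if you only want to expose some of the GPUs.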

Lastly, there are some missing paths that Tensorflow needs to locate CUDA and cuDNN.  Add the following to your jupyterhub_config.py,

os.environ['LD_LIBRARY_PATH'] = '/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/lib64/libcudart.so.8.0'
c.Spawner.environment = {'LD_LIBRARY_PATH': os.environ['LD_LIBRARY_PATH']}


4. Placing custom CPU/RAM/GPU access restrictions on a per-user basis.

This is a new section as of May 9, 2018 (since I’ve gotten this question privately several times and the comment below).

Q: How do we take our multi-GPU machine and allow only 1 (or 2 or more) GPU per user, or maybe an upper limit on RAM and CPU usage?

A: One way is to subclass DockerSpawner and override its start() method.

In other words, the DockerSpawner object in the jupyterhub_config.py file will act differently for each user.  Here’s what that code might look like in the jupyterhub_config.py file.


from dockerspawner import DockerSpawner

class CustomDockerSpawner(DockerSpawner):
    def start(self):
        #----- Global Class Config --------
        username = self.user.name
        self.notebook_dir = <YOUR_BASE_DIR>
        self.image = <YOUR_DOCKER_IMAGE>
        self.remove_containers = True
        #----- Global Class Config --------

        # Shared Data Persistence
        self.volumes = { <DATA_VOLUMES_COMMON_TO_ALL_USERS> }
        # GPU Access
        self.read_only_volumes = {"nvidia_driver_384.90": "/usr/local/nvidia"}
        self.extra_create_kwargs = {"volume_driver": "nvidia-docker"}

        # Individual Data Persistence
        if username == <USER_1>:
            self.volumes.update({'<DATA_VOLUMES_UNIQUE_TO_USER_1>': <LOCATION_ON_DOCKER_IMAGE_FOR_DATA_VOLUMES_UNIQUE_TO_USER_1>})
            # Give USER 1 their own GPU, 64GB of memory, and 12 CPUs
            self.extra_host_config = {"devices": ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia-uvm-tools", "/dev/nvidia0"]}
            self.mem_limit = '64G'
            self.cpu_limit = 12

        elif username == <USER_2>:
            self.volumes.update({'<DATA_VOLUMES_UNIQUE_TO_USER_2>': <LOCATION_ON_DOCKER_IMAGE_FOR_DATA_VOLUMES_UNIQUE_TO_USER_2>})
            # Give USER 2 their own GPU, 32GB of memory, and 6 CPUs
            self.extra_host_config = {"devices": ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia-uvm-tools", "/dev/nvidia1"]}
            self.mem_limit = '32G'
            self.cpu_limit = 6

        return super().start()

c.JupyterHub.spawner_class = CustomDockerSpawner
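If you have many users, writing one elif branch per user gets tedious.  The per-user device lists can instead come from a small lookup table; a hedged sketch (the usernames and GPU indices below are placeholders, not from the original setup):

```python
# Devices every GPU user needs, plus a per-user map of GPU indices.
COMMON_DEVICES = ['/dev/nvidiactl', '/dev/nvidia-uvm', '/dev/nvidia-uvm-tools']
USER_GPUS = {'alice': [0], 'bob': [1]}  # placeholder usernames

def devices_for(username):
    """Return the host device list for a user; no GPUs if the user is unlisted."""
    gpu_indices = USER_GPUS.get(username, [])
    return COMMON_DEVICES + ['/dev/nvidia%d' % i for i in gpu_indices]
```

Inside start() you would then write self.extra_host_config = {'devices': devices_for(self.user.name)}, and adding a student becomes a one-line change to the table.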


That's it! By running sudo jupyterhub, your users will now be able to access the domain you configured, with Github authentication set up.

Happy coding!



Ubuntu 14.04 with CUDA 6.0 for GPU machine learning

Canonical's recent Ubuntu LTS came out last week, and although it has some known bugs, that hasn't stopped you and me from early adoption!  Here we will talk about how to get the most recent Nvidia drivers working with your Ubuntu.  Our lab is using an Nvidia GTX Titan Black, but this walkthrough should work for any similar setup.

Mostly I’m writing this blog post because there are a few recent blog posts in other places that do not work, and will give you the dreaded black screen with blinking cursor after the GRUB menu.  More importantly, Nvidia does not officially support CUDA 6.0 for Ubuntu 14.04 (as of early July 2014 as well), so we need to be careful about which linux kernel and compilation toolchain we use.

Let’s get started!



Getting IPython Notebooks Converted to PDF

So if you're using IPython notebook, awesome.  I have recently started using it and it's definitely taken the role of my primary tool for quick plotting and data analysis.  Soon it may make sense to start blogging with it, as it outputs very friendly HTML with all the nice figures and formatted code, just like the regular IPython notebook.

But how do you get it to output PDF?  Well, there is nbconvert, which has been merged into the main IPython trunk, but you need a lot of TeX packages for it to work.  And to get all the requisite (old) TeX packages, which are not available through Canonical's centralized repo system :-(, you need to get TeXLive!  This is not in the official Canonical PPA anymore (it is apparently in 12.10), but for 12.04 you must sidestep authority and add some backdoor PPA (personal package archive).  This can be dangerous if you don't know who you're dealing with, but this one seems to be maintained by the official release manager.  Here goes:

sudo add-apt-repository ppa:texlive-backports/ppa
sudo apt-get update
sudo apt-get install texlive

sudo apt-get install texlive-latex-extra

That should give you all the TeX packages you need (there are a lot).  Next, simply run the merged IPython code in the directory where your .ipynb files are:

ipython nbconvert <your_file_name_here>.ipynb --to latex --post PDF

And voila!  Perfectly typesetted (ack) IPython notebooks.


Configuring Completely Fresh Ubuntu 12.04 Server for Django Apps

Are you configuring a completely fresh Ubuntu 12.04 server, in particular by using Amazon Web Services EC2 servers? Well here is some good flow for setting it up in a somewhat principled manner.  This info is compiled from here and here and here, as well as my own notes from setting up an earlier Ubuntu box.  Feel free to contact me (check the about)!

Considering we are already super user, start with the basics:

sudo apt-get update

sudo apt-get upgrade

Install build essential to get some good libraries for compiling other libraries:

sudo apt-get install build-essential

Amazon Web Services already locks down specific ports, and yes, it requires that key-pair file, but that SSH port will still get pinged.

So now a security daemon (pun intended) that monitors and blocks suspicious login attempts:

apt-get install fail2ban



Infer.NET and IronPython on Ubuntu 12.04

This post is to help understand how to get Infer.NET working on Linux, specifically Ubuntu 12.04.  All this stuff was found by Binging around a bit, so big thank you to the collective group I got this from.  Most helpful was a post put together a couple years ago.

To start off you need Mono, an open source .NET framework for IronPython to be built on.  Right now Mono is on version 2.10.x

That’s as simple as:

sudo apt-get install mono-complete



Hello World

This will be a public facing reflection on some of the journey into design research.  I hope you find these blog posts useful 🙂