Docker controlled from Jupyter Notebook C# with PostgresDB

Published on Tuesday, 12 November 2019

In the context of Docker and Jupyter Notebook, it's interesting to note that there exists a Nuget that allows C# to control docker. So, yes, it is possible to launch a Postgresql database, on docker, inside a Jupyter notebook!

This assumes you have Docker, Dotnet try, Jupyter notebook and follow the setup of the C# kernel for Jupyter.

If you don't want to wait, you can find my complete notebook here.

Microsoft has created a C# Client library for talking to Docker, so we will be taking advantage of it. Much of the magic docker code is pulled from the Docker.DotNet repository. I'm using the Npgsql drivers for accessing the PostgreSQL database. pgsql01

The real magic moment is when you access the Docker instance, if it is on your local machine on windows, you can use the npipe://./pipe/docker_engine Uri. If you are on Linux, use unix:///var/run/docker.sock (at this time, I haven't tried it on linux, but if you do, please tell me). pgsql02

In block 3, we select a random port to host the pgSQL database. Then list the local image names that are available (you should get postgres:latest on your machine to run this). We create and start up the container, passing the environment variables for the password, user and initial schema. Once the container is started, we detach ourselves from it, so it runs in the background. Finally, I wait until I'm pretty sure the container and database is ready (10s sleep at the end). You can reduce that sleep time. On my Surface laptop 1, I sometimes have an issue when I've got too many other containers running. pgsql03

I'm connecting to the database and validating the connection name. pgsql04

I'm creating a database schema and a user for this particular schema, then reconning with the new user. pgsql05

Creating a table and inserting two rows using direct strings and string concatenation. In production, you should never be using string concatenation for your SQL statements. Please always use Bound variables as described below. pgsql06

This is how you should interact with the PostgreSQL database if you are using direct SQL statements. You should be using Bound commands with parameters. pgsql07

Finally, I check that all three insertions were successful. pgsql08

You can watch the container run in the docker extension of Visual studio Code. It's a great way of monitoring what is currently running, as well as deleting old containers that might be still present. pgsql10

Talking of deleting old containers, this is how you shut them down and delete them at the end of the notebook. I first close the Db connection and dispose of it before asking Docker.DotNet to stop the containers. pgsql09

It would be cleaner if I knew how to enfore a Finally in Jupyter Notebook, but at this time, I don't know. If you do, drop me a line on twitter or a pull request on this blog post.