Docker Registry REST API

The Docker Registry

The Docker registry is Docker’s built-in way to share images. It is an open-source project and can be found at https://github.com/dotcloud/docker-registry in the official repository of DotCloud. You can set it up on your private server (maybe in the cloud) and push and pull your images to and from it. You can also secure it, e.g. with SSL and NGINX (maybe I will write about this later).

The REST API

Similar to Docker itself, the registry provides a REST API to interact with it. Using the REST API, you can list all images, search, or browse a certain repository. The only prerequisite is that you define a search back-end in the registry’s config.yaml.

Now you can use the REST API like this:

List a certain repository

Search

Get info on a certain image

List all images

And thanks to bwilcox from StackOverflow, this is how you can list all images:
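As a rough sketch of how the calls above map to URLs, here is a small Python helper. It assumes the registry's v1 endpoints (/v1/repositories/.../tags, /v1/search and /v1/images/.../json) and a hypothetical registry running at localhost:5000. The function names are mine, and listing everything via an empty search is my assumption of how the trick works, so check it against your registry version.

```python
from urllib.parse import quote, urlencode

# Hypothetical address of a private registry; adjust to your setup.
REGISTRY = "http://localhost:5000"


def tags_url(repository, base=REGISTRY):
    """URL that lists the tags of one repository, e.g. 'test/a'."""
    return f"{base}/v1/repositories/{quote(repository)}/tags"


def search_url(term=None, base=REGISTRY):
    """URL for a search; without a term it is assumed to list everything."""
    url = f"{base}/v1/search"
    return f"{url}?{urlencode({'q': term})}" if term else url


def image_info_url(image_id, base=REGISTRY):
    """URL that returns the JSON metadata of a single image."""
    return f"{base}/v1/images/{quote(image_id, safe='')}/json"


if __name__ == "__main__":
    print(tags_url("test/a"))
    print(search_url("ubuntu"))
    print(image_info_url("9977b78fbad7"))
    print(search_url())
```

The helpers only build the URLs. You can pass them to curl or to any HTTP client of your choice.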

Best regards,
Thomas

Cloud vendors with Windows

The cloud is built on Linux – that is my own humble opinion. But is it really? To answer this question for myself, I took a look at a bunch of cloud vendors to see what they have under the hood. Here is what I found.

But note that the list is neither complete nor representative. I am also comparing two very different things: IaaS and PaaS. While IaaS vendors like AWS provide virtual machines, PaaS vendors like Heroku provide tooling to set up complete environments.

However, the list shows that most of the vendors use Linux as their base system, and the further you move toward PaaS, the more Windows vanishes.

| Vendor | Windows | Linux | Type | Comment |
| --- | --- | --- | --- | --- |
| Microsoft Azure | yes | yes | IaaS | |
| AWS | yes | yes | IaaS | AWS has a lot of Linux distributions and Windows versions on their IaaS EC2. |
| AWS Elastic Beanstalk | yes | yes | IaaS | |
| eNlight Cloud | yes | yes | | CentOS, Red Hat Enterprise Linux, SUSE Linux, Oracle Linux, Ubuntu, Fedora, Debian, Windows Server 2003, Windows Server 2008, Windows 7. |
| Google App Engine | | | PaaS | Google App Engine has a sandbox and hides the OS. |
| Google Compute Engine | yes | yes | IaaS | Linux, FreeBSD, Microsoft Windows |
| Heroku | | yes | PaaS | Ubuntu |
| Jelastic | | yes | PaaS | |
| HP Cloud | | yes | IaaS | Based on OpenStack. |
| OpenShift | | yes | PaaS | Red Hat Enterprise Linux |
| Engine Yard | | yes | PaaS | Ubuntu, Gentoo |
| Rackspace | yes | yes | | |
| Cloud Foundry | | yes | PaaS | |

Best regards,
Thomas

How to know you are inside a Docker container

How do you know whether you are living in the Matrix? Well, I do not know, but at least I can tell you how to find out whether you are inside a Docker container.

The Docker Matrix

Docker provides virtualization based on Linux Containers (LXC). LXC is a technology that provides operating system virtualization for processes on Linux. This means that processes can be executed in isolation without starting a real and heavy virtual machine. All processes will be executed on the same Linux kernel, but will still have their own namespaces, users and file system.

An important feature of such virtualization is that applications inside a virtual environment do not know that they are not running on real hardware. An application will see the same environment, no matter if it is running on real or virtual resources.

/proc

However, there are some tricks. The /proc file system provides an interface to kernel data structures of processes. It is a pseudo file system and most of it is read-only. But every process on Linux will have an entry in this file system (named by its PID):

In this directory, we find information about the executed program, its command line arguments or working directory. And since the Linux kernel 2.6.24, we also find a file called cgroup:

This file contains information about the control group the process belongs to. Normally, it looks something like this:

But since LXC (and therefore Docker) makes use of cgroups, this file looks different inside a container:

As you can see, some resources (like the CPU) belong to a control group named after the container. We can make this a little bit easier if we use the keyword self instead of the PID. The keyword self always references the folder of the calling process:

And we can wrap this into a function (thanks to Henk Langeveld from StackOverflow):
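The same check can be sketched in Python. This is only a heuristic under my own assumptions: the function names are hypothetical, the sample cgroup contents in the usage lines are made up for illustration, and it simply looks for "/docker" or "/lxc" in the cgroup paths. On hosts that use the newer unified cgroup hierarchy the file may just contain "0::/", in which case this check will not detect the container.

```python
def in_docker(cgroup_text):
    """Return True if any cgroup path in the text mentions docker or lxc."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controllers:path".
        parts = line.split(":", 2)
        if len(parts) == 3 and ("/docker" in parts[2] or "/lxc" in parts[2]):
            return True
    return False


def running_in_docker(path="/proc/self/cgroup"):
    """Apply the heuristic to the cgroup file of the calling process."""
    try:
        with open(path) as cgroup_file:
            return in_docker(cgroup_file.read())
    except OSError:
        # No /proc file system (or no cgroup file): assume no container.
        return False


if __name__ == "__main__":
    # Made-up sample contents, only to illustrate the two cases.
    print(in_docker("3:cpu:/\n2:memory:/"))
    print(in_docker("3:cpu:/docker/abc123\n2:memory:/docker/abc123"))
    print(running_in_docker())
```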

Best regards,
Thomas

Layering of Docker images

Docker images are great! They are not only portable application containers, they are also building blocks for application stacks. Using a Docker registry or the public Docker index, you can compose setups just by downloading the right Docker image.

But Docker images are not only building blocks for applications, they also use a kind of “building block” themselves: layers. Every Docker image consists of a set of layers which make up the final image.

Layers

Let us consider the following Dockerfile to build a simple Ubuntu image with an Apache installation:

If we build the image by calling docker build -t test/a . we get an image called a, belonging to a repository called test. We can see the history of our image by calling docker history test/a:

As we can see, the final image a consists of six intermediate images. The first three layers belong to the Ubuntu base image and the rest are ours: one layer for every build instruction.

We will see the benefit of this layering if we build a slightly different image. Let’s consider this Dockerfile to build nearly the same image (only the text file in the last instruction has a different name):

When we build this file, the first thing we will notice is that the build is much faster. Since we already created intermediate images for the first three instructions (namely FROM..., RUN... and RUN...), Docker will reuse those layers for the new image. Only the last layer will be created from scratch. The history of this image will look like this:

As we see, all layers are the same as for image a, except for the topmost one, where we touch a different file!

Benefits

Those layers (or intermediate images or whatever you call them) have some benefits. Once we build them, Docker will reuse them for new builds. This makes the builds much faster. This is great for continuous integration, where we want to build an image at the end of each successful build (e.g. in Jenkins). But the build is not only faster, the images are also smaller, since intermediate images are shared between images.
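To illustrate why two builds can share layers, here is a toy model in Python. It is not how Docker actually computes its image IDs. It only assumes that a layer is identified by its parent layer plus the instruction that created it, and the Dockerfile instructions are hypothetical stand-ins for the two Dockerfiles above.

```python
import hashlib


def layer_ids(instructions):
    """Derive one toy layer ID per instruction from its parent and its text."""
    ids, parent = [], ""
    for instruction in instructions:
        digest = hashlib.sha256(f"{parent}\n{instruction}".encode()).hexdigest()[:12]
        ids.append(digest)
        parent = digest  # the next layer builds on top of this one
    return ids


# Hypothetical instructions; only the last one differs between a and b.
base = ["FROM ubuntu", "RUN apt-get update", "RUN apt-get install -y apache2"]
image_a = layer_ids(base + ["RUN touch /opt/a.txt"])
image_b = layer_ids(base + ["RUN touch /opt/b.txt"])

# Layers with identical IDs could be reused instead of being rebuilt.
shared = [a for a, b in zip(image_a, image_b) if a == b]

if __name__ == "__main__":
    print("shared layers:", len(shared))
    print("last layer differs:", image_a[-1] != image_b[-1])
```

In this model a change in an early instruction changes every ID after it, which matches the observation that only the layers after the first difference have to be built from scratch.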

But maybe the best thing is rollbacks: since every image contains all of its building steps, we can easily go back to a previous step if we want to. This can be done by tagging a certain layer. Let’s take a look at image b again:

If we want to make a rollback and remove the last layer (maybe the file should be called c.txt instead of b.txt) we can do so by tagging the layer 9977b78fbad7:

Let’s take a look at the new history:

Our last layer is gone and with the layer the text file b.txt!

Best regards,
Thomas

Docker vs. Heroku

For a couple of weeks I have been working with Docker as an application container for Amazon’s EC2. Despite my eternal fight with the Docker registry, I am absolutely amazed by Docker and have enjoyed my experience.

But sometimes it is hard to explain what Docker is and what it has to do with all these cloud, PaaS and scalability topics. So I thought a little bit about some similar concepts between Docker and Heroku, perhaps the most popular PaaS provider. But let’s start with a small…

Disclaimer

Docker and Heroku may have similar concepts (as you will see below), but they are two completely different things: while Docker is an open source software project, Heroku is a commercial service provider. You can download, build and install Docker on your own laptop or participate in its online community. On Heroku, you can create a user account, pay some money (maybe) and get a really great service and hosting experience for your applications and code. So obviously, Docker and Heroku are very different things. But some of their core concepts have at least some similarities.

Docker vs. Heroku

| Docker | Heroku |
| --- | --- |
| Dockerfile | BuildPack |
| Image | Slug |
| Container | Dyno |
| Index | Add-Ons |
| CLI | CLI |

Docker and Heroku have a lot of similarities, especially in their core concepts. This makes Docker an interesting option for people who are looking for an alternative to Heroku – maybe on their own infrastructure.

Dockerfile vs. BuildPack

Docker images can be built with a Dockerfile. A Dockerfile is a set of commands, e.g. to add files and folders or to install packages. It defines what the final image should look like. Here is an example of a Dockerfile which installs memcache from the official website:

Heroku’s counterpart is the so-called BuildPack. A BuildPack is also a set of scripts which is used to set up the final state of an image. Heroku comes with a couple of default BuildPacks, such as for Java, Python or the Play! framework. But you can also write your own. Here’s a snippet of the Heroku BuildPack for Java apps:

BTW, there are even projects to enable the usage of Heroku’s BuildPacks for Docker images (like this).

Image vs. Slug

When you run a Dockerfile, it creates a Docker image. Such an image contains all data, files, dependencies and settings you need for your application. You can exchange those images and start them right away on any machine with Docker installed.

When you run a build on Heroku, the BuildPack creates a so-called slug. Those slugs are “compressed and pre-packaged copies of your application” as Heroku says. Similar to Docker’s images, they contain all dependencies and can be deployed and started in a very short time.

Container vs. Dyno

After starting a Docker image, you have a running container of this image. You can start an image multiple times to get multiple isolated containers of the same application. This enables you to build an image once and easily start multiple instances of it.

Heroku does the very same. After you build your app with your BuildPack, you get a slug which you can run on a Dyno. Such a dyno is “a lightweight container running a single user-specified command” as Heroku describes it.

Heroku even uses LXC for virtualization of their containers (dynos), which is the same technology Docker uses at its core.

Index vs. Add Ons

Docker images can be shared with the community. This is possible by uploading them to the official Docker index. All images on this index can be downloaded and used by everyone. Most of them are documented very well and can be started with a single command. This makes it possible to run a lot of applications as building blocks. Here’s an example of how to run elasticsearch:

A similar concept applies to Heroku’s add-on market. You can use (or buy) different pre-configured add-ons for your application (e.g. for elasticsearch). This makes it possible to build a complex app with common building blocks – just as Docker does!

So both Docker’s index and Heroku’s add-ons underline a service-oriented way of developing applications and reusing components.

CLI

Although the four points mentioned before are the most important concepts of both, Docker and Heroku have one more thing in common: both have a powerful command line interface which allows you to manage containers. For example, you can run heroku ps to see all your running dynos or docker ps to see all your running containers, or you can request the log of a certain container.

Best regards,
Thomas

Development speed, the Docker remote API and a pattern of frustration

One of the challenges Docker is facing right now is its own development speed. Since its initial release in January 2013, there have been over 7,000 commits (in one year!) by more than 400 contributors. There are more than 1,800 forks on GitHub and Docker ships approximately one new release per month. Docker is developing extremely fast right now and this is really great to see!

However, this very high development speed leaves a lot of third-party tools behind. If you develop a tool for Docker, you have to keep a very high pace. If not, your tool is outdated within a month.

Docker remote API client libraries

A good example of how this development speed affects projects is the remote API libraries for Docker. Docker offers a JSON API to access Docker in a programmatic way. It enables you, for example, to list all running containers and stop a specific one. All via JSON and HTTP requests.

To use this JSON API in a convenient way, people created bindings for their favorite programming languages. As you can see below, there are bindings for JavaScript, Ruby, Java and many more. I have used some of them myself and I am really thankful for the great work their developers have done!

But many of those libraries are outdated at the time I am writing this. To be exact: all of them are outdated! The current remote API version of Docker is v1.11 (see here for more) which none of the remote API libraries supports right now. Many of them don’t even support v1.10 or v1.9.

Here is the list of remote API tools as you find it at http://docs.docker.io/reference/api/remote_api_client_libraries/.

| Language | Name | Remote API |
| --- | --- | --- |
| Python | docker-py | v1.9 |
| Ruby | docker-api | v1.10 |
| JavaScript (NodeJS) | dockerode | v1.10 |
| JavaScript (NodeJS) | docker.io | v1.7 |
| JavaScript (Angular) WebUI | dockerui | v1.8 |
| Java | docker-java | v1.8 |
| Erlang | erldocker | v1.4 |
| Go | dockerclient | v1.10 |
| PHP | Docker-PHP | v1.9 |
| Scala | reactive-docker | v1.10 |
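To make the gap visible, the list can be checked with a few lines of Python. This is just a sketch over the versions from the list above. The helper names are my own, and the versions are parsed into number pairs because a plain string comparison would wrongly rank v1.9 above v1.10.

```python
CURRENT_API = "v1.11"

# Supported remote API version per library, taken from the list above.
LIBRARIES = {
    "docker-py": "v1.9",
    "docker-api": "v1.10",
    "dockerode": "v1.10",
    "docker.io": "v1.7",
    "dockerui": "v1.8",
    "docker-java": "v1.8",
    "erldocker": "v1.4",
    "dockerclient": "v1.10",
    "Docker-PHP": "v1.9",
    "reactive-docker": "v1.10",
}


def parse_version(version):
    """Turn 'v1.10' into (1, 10) so that versions compare numerically."""
    major, minor = version.lstrip("v").split(".")
    return int(major), int(minor)


def older_than(limit):
    """Names of all libraries whose supported API is older than the limit."""
    return sorted(
        name
        for name, version in LIBRARIES.items()
        if parse_version(version) < parse_version(limit)
    )


if __name__ == "__main__":
    print(len(older_than(CURRENT_API)), "of", len(LIBRARIES), "are behind", CURRENT_API)
    print("older than v1.9:", older_than("v1.9"))
```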

How to deal with rapidly evolving APIs

How to deal with rapidly evolving APIs is a difficult question and IMHO Docker made the right decision. By solely providing a JSON API, Docker chose a modern and universal technique. A JSON API can be used in any language or even in a web browser. JSON (together with a RESTful API) is the state-of-the-art technique to interact with services. Docker even leaves the possibility to fall back to an old API version by adding a version identifier to the request. Well done.
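The version fallback can be sketched like this in Python. The host and port are hypothetical and the function name is mine. The only thing I take from the remote API itself is that the version is put in front of the path, as in /v1.10/containers/json.

```python
# Hypothetical address of a Docker daemon that listens on TCP.
DOCKER_HOST = "http://localhost:4243"


def api_url(path, version=None, base=DOCKER_HOST):
    """Build a remote API URL, optionally pinned to an older API version."""
    prefix = f"/{version}" if version else ""
    return f"{base}{prefix}{path}"


if __name__ == "__main__":
    # Whatever API version the daemon currently speaks.
    print(api_url("/containers/json"))
    # Pinned to the version a client library was written for.
    print(api_url("/containers/json", version="v1.10"))
```

A client library that always pins the version it was written against keeps working after a Docker update, as long as the daemon still serves that version.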

But the decision to stay “universal” (by solely providing a JSON API) also means not getting specific. Getting specific (which means using Docker in a certain programming language) is left to the developers of third-party tools. These tools are also evolving rapidly right now, no matter if they are remote API bindings, deployment tools (like Deis.io), or hosting solutions (like CoreOS). This enriches the Docker ecosystem and makes the project even more interesting.

Bad third-party tools will reflect on you

The problem is, even if Docker did a good job (which they did!), outdated or poorly implemented third-party tools will reflect badly on Docker, too. If you use a third-party library (which you maybe found via the official website) and it works fine, you will be happy with Docker and the third-party library. But if the library doesn’t work next month because you updated Docker and the library doesn’t take care of the API version, you will be frustrated with the tool and with Docker.

Pattern of frustration

This pattern of frustration occurs a lot in software development. Bad libraries cause frustration with the tool itself. Let’s take Java as an example. A lot of people complain that Java is verbose, uses class explosions as a pattern and makes things much more complicated than they should be. The famous AbstractSingletonProxyFactoryBean class of the Spring framework is just such an example (see +Paul Lewis). Another example is reading a file in Java, which used to be an awful pain:

And even the new NIO API which came with Java 7 is not as easy as it could be:

You need to put a String into a Path to pass it into a static method whose output you need to put into a String again. Great idea! But what about something like this:

However, it is not the fault of Java, but of a poorly implemented third-party tool. If you need to put a File into a FileReader which you need to put into a BufferedReader to be able to read a file line by line into a StringBuilder, you are using a terrible I/O library! But anyway, you will be frustrated with Java and how verbose it is (and maybe also with the API itself).

This pattern applies to many other things: You are angry about your smartphone, because of a poorly coded app. You are angry about Eclipse because it crashes with a newly installed plugin. And so on…

I hope this pattern of frustration will not apply to Docker and the community will develop a stable ecosystem of tools to provide a solid basis for development and deployment with Docker. A tool like Docker lives through its ecosystem. If the tools are buggy or outdated, people will be frustrated with Docker – and that would be a shame, because Docker is really great!

Best regards,
Thomas