Resizing Vagrant box disk space

Vagrant is a great tool to provision virtual machines! As I’m a passionated Windows user, Vagrant is the weapon of my choice whenever I need to use some Linux-only tools such as Docker. I spinn up a new Linux VM, already configured with the things I need and start working. However, when it comes to resizing a disk, Vagrant is not nice to you…

The problem

Vagrant doesn’t provide any out-of-the-box option to configure or to change the disk size. The disk size of a VM totally depends on the base image used for the VM. There are base images with a 10 GB disk, some with a 20 GB disk and some other with a 40 GB disk. There is no Vagrant option to change this – and even worse: most Vagrant boxes use VMDK disks which cannot be resized!

Resizing (manually) with VirtualBox

As Vagrant doesn’t provide any out-of-the-box functionality, we need to do the resizing “manually”. Of course, we can write a script for this, too, but for now we keep it simple and do it by hand.

  1. First we need to convert the VMDK disk to a VDI disk which can be resized. We do this with the VBoxManage tool which comes with the VirtualBox installation:
  2. Now we can easily resize this VDI disk, e.g. to 50 GB:
  3. The last step we need to do is just to use the new disk instead of the old one. We can do this by cloning the VDI disk back to the original VMDK disk or within a view clicks in VirtualBox:

    2016-01-06 16_32_02-Oracle VM VirtualBox Manager

That’s it! Now start your VM with vagrant up and check the disk space. It’s at 50 GB and we have new free space again!

Best regards,
Thomas

More

VirtualBox crashes with STATUS_OBJECT_NAME_NOT_FOUND

As I’m a passionate Windows user (sorry…), I often use VirtualBox (with Vagrant) to pull up a Linux box to use Docker or some other “Linux-only” stuff. Usually, this works really fine, but today my VirtualBox crashed with a STATUS_OBJECT_NAME_NOT_FOUND error:

MXhqz

One of those mystic error where everything worked like a charm yesterday at 6 p.m., but when you start your machine today at 9 a.m. it’s broken. Damn.

The solution was (and still is) a small Windows patch you need to install on your machine:

https://support.microsoft.com/en-us/kb/2628582

Install, restart, back at work. I hope this will help you, too.

Best regards,
Thomas

Mount Windows folder to Boot2Docker VM

I just stumbled over a post on Stackoverflow (http://stackoverflow.com/questions/30864466/whats-the-best-way-to-share-files-from-windows-to-boot2docker-vm) with the question how to mount a Windows folder to a Boot2Docker VM. Although the steps are a little bit confusing, in the end it is not difficult to do.

Boot2Docker

Boot2Docker is a simple VM to run Docker. The VM will run on VirtualBox and Boot2Docker is just a tool to provision this VM (very similar to Vagrant, but smaller and customized for using Docker). You simply download and install Boot2Docker and run boot2docker up to start the VM. After the VM is up, you can run boot2docker ssh to login. Now, we can start to get our Windows folder.

Mounting the folder

To use one of your Windows folders in your Boot2Docker VM, you need to mount it. To do so, you mount your Windows folder to the VM:

Now you login to your VM via SSH (with boot2docker ssh) and do the following:

Make a folder inside your VM:

Mount your stuff from Windows:

After that, you can access c:/my/folder/with/code inside your Boot2Docker VM:

Now, that your code is present inside your VM, you can use it with Docker. Either by mounting it as a volume to the container:

Or by using it while building your Docker image:

Best regards,
Thomas

DeployMan (command line tool to deploy Docker images to AWS)

DeployMan

2014-07-29 11_34_11-Java EE - Eclipse

Yesterday, I published a tool called DeployMan on GitHub. DeployMan is a command line tool to deploy Docker images to AWS and was the software prototype for my master thesis. I wrote my thesis at Informatica in Stuttgart-Weilimdorf, so first of all, I want to say thank you to Thomas Kasemir for the opportunity to put this online!

Disclaimer

At the time I am writing this post, DeployMan is a pure prototype. It was created for academic research and as a demo for my thesis. It is not ready ready for production. If you need a solid tool to deploy Docker images (to AWS), have a look at Puppet, CloudFormation (for AWS), Terraform, Vagrant, fig (for Docker) or any other orchestration tool that came up in the last couple of years.

What DeployMan does

DeployMan can create new AWS EC2 instances and deploy a predefined stack of Docker images on it. To do so, DeployMan takes a configuration file called a formation. A formation specifies how the EC2 machine should look like and which Docker images (and which configurations) should be deployed. Docker images can either be deployed from a Docker registry (the public one or a private) or a tarballs from a S3 storage. Together with each image, a configuration folder will pulled from a S3 storage and mounted to the running container.

Here is an example of a formation which deploys a Nginx server with a static HTML page:

Interfaces

DeployMan provides a command line interface to start instances and do some basic monitoring of the deployment process. Here is a screenshot which shows some formations (which can be started) and the output of a started Logstash server:

Run_Logstash_Server

To keep track of the deployment process in a more pleasant way, DeployMan has a web interface. The web interface shows details to machines, such as the deployed images and which containers are running. Here is how a Logstash server would look like:

Machine_Details

The project

GitHub-Mark

You can find the project on GitHub at https://github.com/tuhrig/DeployMan. I wrote a detailed README.md which explains how to build and use DeployMan. To test DeployMan, you need an AWS account (there are also free accounts).

The project is made with Java 8, Maven, the AWS Java API, the Docker Java API and a lot of small stuff like Apache Commons. The web interface is based on Spark (for the server), Google’s AngularJS and Twitter’s Bootstrap CSS.

Best regards,
Thomas

Presentation of my master thesis

Over the last six months, I wrote my master thesis about porting an enterprise OSGi application to a PaaS. Last Monday, the 21th Juli 2014, I presented the main results of my thesis to my professor (best greetings to you, Mr. Goik!) and to my colleges (thanks to all of you!) at Informatica in Stuttgart-Weilimdorf, Germany (where I had written my thesis based on one of their product information management applications, called Informatica PIM).

Here are the slides of my presentation.

While my master thesis also covers topics like OSGi, VMs and JEE application servers, the presentation focuses on my final solution of a deployment process for the cloud. Based on Docker, the complete software stack used for the Informatica PIM server was packaged into separate, self-contained images. Those images have been stored in a repository and were used to automatically setup cloud instances on Amazon Web Services (AWS).

The presentation gives answers to the following questions:

  • What is cloud computing and what is AWS?
  • What are containers and what is Docker?
  • How can we deploy containers?

To automate the deployment process of Docker images, I implemented my own little tool called DeployMan. It will show up at the end of my slides and I will write about it in a couple of days here. Although there are a lot of tools out there to automate Docker deployments (e.g. fig or Maestro), I wanted to do my own experiments and to create a prototype for my thesis.

Enjoy!

Best regards,
Thomas

Docker Registry Rest API

The Docker Registry

The Docker registry is Docker’s in-build way to share images. It is an open-source project and can be found at https://github.com/dotcloud/docker-registry in the official repository of DotCloud. You can set it up on your private server (maybe in the cloud) at push and pull your images to it. You can also secure it, e.g. with SSL and a NGINX (maybe I will write about this later).

The Rest API

Similar to Docker itself, the registry provides a Rest API to interact with it. Using the Rest API, you can list all images, search or brows a certain repository. The only prerequisite is that you define a search back-end in the registry’s config.yaml:

Now you can use the Rest API like this:

List a certain repository

Search

Get info to a certain image

List all image

And thanks to bwilcox from StackOverflow, this is how you can list all images:

More

Best regards,
Thomas

How to know you are inside a Docker container

How to know that you are living in the Matrix? Well, I do not know, but at least I know how to tell you if you are inside a Docker container or not.

The Docker Matrix

Docker provides virtualization based on Linux Containers (LXC). LXC is a technology to provide operating system virtualization for processes on Linux. This means, that processes can be executed in isolation without starting a real and heavy virtual machine. All processes will be executed on the same Linux kernel, but will still have their own namespaces, users and file system.

An important feature of such virtualization is that applications inside a virtual environment do not know that they are not running on real hardware. An application will see the same environment, no matter if it is running on real or virtual resources.

/proc

However, there are some tricks. The /proc file system provides an interface to kernel data structures of processes. It is a pseudo file system and most of it is read-only. But every process on Linux will have an entry in this file system (named by its PID):

In this directory, we find information about the executed program, its command line arguments or working directory. And since the Linux kernel 2.6.24, we also find a file called cgroup:

This file contains information about the control group the process belongs to. Normally, it looks something like this:

But since LXC (and therefore Docker) makes use of cgroups, this file looks different inside a container:

As you can see, some resources (like the CPU) are belonging to a control group with the name of the container. We can make this a little bit easier if we use the keyword self instead of the PID. The keyword self will always reference the folder of the calling process:

And we can wrap this into a function (thanks to Henk Langeveld from StackOverflow):

More

Best regards,
Thomas

Layering of Docker images

Docker images are great! They are not only portable application containers, they are also building blocks for application stacks. Using a Docker registry or the public Docker index, you can compose setups just by downloading the right Docker image.

But Docker images are not only building blocks for applications, they also use a kind of “build block” themselves: layers. Every Docker image consists of a set of layers which make up the final image.

Layers

Let us consider the following Dockerfile to build a simple Ubuntu image with an Apache installation:

If we build the image by calling docker build -t test/a . we get an image called a, belonging to a repository called test. We can see the history of your image by calling docker history test/a:

The final image a consists of six intermediate images as we can see. The first three layers belongs to the Ubuntu base image and the rest is ours: one layer for every build instruction.

We will see the benefit of this layering if build a slightly different image. Let’s consider this Dockerfile to build nearly the same image (only the text file in the last instruction has a different name):

When we build this file, the first thing we will notice is that the build is much faster. Since we already created intermediate images for the first three instructions (namely FROM..., RUN... and RUN...), Docker will reuse those layers for the new image. Only the last layer will be created from scratch. The history of this image will look like this:

As we see, all layers are the same as for image a, except of the first one where we touch a different file!

Benefits

Those layers (or intermediate images or whatever you call them) have some benefits. Once we build them, Docker will reuse them for new builds. This makes the builds much faster. This is great for contentious integration, where we want to build an image at the end of each successful build (e.g. in Jenkins). But the build is not only faster, the images are also smaller, since intermediate images are shared between images.

But maybe the best things are rollbacks: since every image contains all of its building steps, we can easily go back to a previous step if we want so. This can be done tagging a certain layer. Let’s take a look at image b again:

If we want to make a rollback and remove the last layer (maybe the file should be called c.txt instead of b.txt) we can do so by tagging the layer 9977b78fbad7:

Let’s take a look at the new history:

Our last layer is gone and with the layer the text file b.txt!

Best regards,
Thomas

Docker vs. Heroku

Untitled drawing

Since a couple of weeks I am working with Docker as an application container for Amazon’s EC2. Despite my eternal fight with the Docker registry, I am absolutely amazed about Docker and enjoyed my experience.

But sometimes it is hard to explain what Docker is and what is has to do with all this cloud and PaaS and scalability topic. So I thought a little bit about some similar concepts between Docker and Heroku -maybe the most popular PaaS provider. But let’s start with a small…

Disclaimer

Docker and Heroku maybe have similar concepts (as you will see below), but they are two completely different things: while Docker is an open source software project, Heroku is a commercial service provider. You can download, build and install Docker on your own laptop or participate on its online community. On Heroku, you can create yourself an user account, pay some money (maybe) and get a really great service and hosting experience for your applications and code. So obviously, Docker and Heroku are very different things. But some of their core concepts have at least some similarities.

Docker vs. Heroku

Docker Heroku
Dockerfile BuildPack
Image Slug
Container Dyno
Index Add-Ons
CLI CLI

Docker and Heroku have a lot of similarities, especially in their core concepts. This makes Docker an interesting alternative for people who are looking for an alternative for Heroku – maybe on their own infrastructure.

Dockerfile vs. BuildPack

Docker images can be build with a Dockerfile. A Dockerfile is a set of commands, e.g. to add files and folders or to install packages. It defines how the final image should look like. Here is an example of a Dockerfile which installs memcache from the official website:

Heroku’s pendant are so called BuildPacks. BuildPacks are also a set of scripts which are used to setup the final state of an image. Heroku comes with a couple of default BuildPacks such as for Java, Python or the Play! framework. But you can also write your own. Here’s a snippet of the Heroku BuildPack for Java apps:

BTW, there are even projects to enable the usage of Heroku’s BuildPacks for Docker images (like this).

Image vs. Slug

When you run a Dockerfile, it creates a Docker image. Such an image contains all data, files, dependencies and settings you need for your application. You can exchange those images and start them right away on any machine with Docker installed.

When you run a build on Heroku, the BuildPack creates a so called Slug. Those slugs are “are compressed and pre-packaged copies of your application” as Heroku says. Similar to Docker’s images, they contain all dependencies and can be deployed and started in a very short time.

Container vs. Dyno

After starting a Docker image, you have a running container of this image. You can start an image multiple times, to get multiple isolated container of the same application. This enables you to build an image once and start easily multiple instances of it.

Heroku does the very same. After you build your app with your BuildPack, you get a slug which you can run on a Dyno. Such a dyno is “a lightweight container running a single user-specified command” as Heroku describes it.

Heroku even uses LXC for virtualization of their containers (dynos), which is the same technology Docker uses at its core.

Index vs. Add Ons

Docker images can be shared with the community. This is possible by uploading them to the official Docker index. All images on this index can be download and used by everyone. Most of them are documented very well and can be started with a single command. This makes it possible to run a lot of applications as building blocks. Here’s an example how to run elasticsearch:

A similar concept applies to Heroku’s add-on market. You can use (or buy) different pre-configured add-ons for your application (e.g. for elasticsearch). This makes it possible to build a complex app with common building blocks – such as Docker is doing it!

So both, Docker’s index and Heroku’s add-ons, underline a service oriented way of developing applications and reusing components.

CLI

2014-05-05 17_28_04-C__Users_tuhrig_Desktop_AWSRepo_formations_RELEASE_7.0.0.5_PIM.json (static) - S

Although the four points mentioned before are the most important concepts of both, Docker and Heroku have one more thing in common: both have a powerful command line interface which allows to manage containers. E.g. you can run heroku ps to see all your running slugs or docker ps to see all your running containers or you can request the log of a certain container.

Resources

Best regards,
Thomas

Development speed, the Docker remote API and a pattern of frustration

One of the challenges Docker is facing right now, is its own development speed. Since its initial release in January 2013, there have been over 7.000 commits (in one year!) by more than 400 contributors. There are more than 1.800 forks on GitHub and Dockers brings up approximately one new release per month. Docker is in a super fast development right now and this is really great to see!

However, this very high development speed leaves a lot of third-party tools behind. If you develop a tool for Docker, you have to keep a very high pace. If not, your tool is outdated within a month.

Docker remote API client libraries

A good example how this development speed affects projects, are the remote API libraries for Docker. Docker offers a JSON API to access Docker in a programmatic way. It enables you for example to list all running containers and stop a specific one. All via JSON and HTTP requests.

To use this JSON API in a convenient way, people created bindings for their favorite programming language. As you can see below, there exist bindings for JavaScript, Ruby, Java and many more. I used some of them on my own and I am really thankful for the great work their developers have done!

But many of those libraries are outdated at the time I am writing this. To be exact: all of them are outdated! The current remote API version of Docker is v1.11 (see here for more) which none of the remote API libraries supports right now. Many of them don’t even support v1.10 or v1.9.

Here is the list of remote API tools as you find it at http://docs.docker.io/reference/api/remote_api_client_libraries/.

Language Name Remote API
Python docker-py v1.9
Ruby docker-api v1.10
JavaScript (NodeJS) dockerode v1.10
JavaScript (NodeJS) ocker.io v1.7
JavaScript (Angular) WebUI dockerui v1.8
Java docker-java v1.8
Erlang erldocker v1.4
Go dockerclient v1.10
PHP Docker-PHP v1.9
Scala reactive-docker v1.10

How to deal with rapidly evolving APIs

How to deal with rapidly evolving APIs is a difficult question and IMHO Docker made the right decision. By solely providing a JSON API Docker chose a modern and universal technique. A JSON API can be used in any language or even in a web browser. JSON (together with a RESTful API) is the state-of-the-art technique to interact with services. Docker even leaves the possibility to fall back to an old API version by adding an version identifier to the request. Well done.

But the decision to stay “universal” (by solely providing a JSON API) also means to don’t get specific. Getting specific (which means to use Docker in a certain programming language) is left to the developers of third party tools. These tools are also evolving rapidly right now, no matter if those are remote API bindings, deployment tools (like Deis.io), or hosting solutions (like CoreOS). This enriches the Docker ecosystem and makes the project even more interesting.

Bad third party tools will fall back on you

The problem is, even if Docker made a good job (which they did!), outdated or poorly implemented third party tools will fall back on Docker, too. If you use a third party library (which you maybe found via the official website) and it works fine, you will be happy with Docker and the third party library. But if the library doesn’t work next month because you updated Docker and the library doesn’t take care of the API version, you will be frustrated about the tool and about Docker.

Pattern of frustration

This pattern of frustration occurs a lot in software development. Bad libraries cause frustrations about the tool itself. Let’s take Java as an example. A lot of people complain about Java that it is verbose, uses class-explosions as a pattern and makes things much more complicated as they should be. The famous AbstractSingletonProxyFactoryBean class of the Spring framework is just such an example (see +Paul Lewis). Another example is reading a file in Java which was an awful pain:

And even the new NIO API which came with Java 7 is not as easy as it could be:

You need to put a String into a Path to pass it into static method which output you need to put into a String again. Great idea! But what about something like this:

However, it is not the fault of Java, but of a poorly implemented third party tool. If you need to put a File into a FileReader which you need to put into a BufferedReader to be able to read a file line by line into a StringBuilder you use a terrible I/O library! But anyway, you will be frustrated about Java and how verbose it is (and maybe also about the API itself).

This pattern applies to many other things: You are angry about your smartphone, because of a poorly coded app. You are angry about Eclipse because it crashes with a newly installed plugin. And so on…

I hope this pattern of frustration will not apply to Docker and the community will develop a stable ecosystem of tools to provide a solid basis for development and deployment with Docker. A tool like Dockers lives trough its ecosystem. If the tools are buggy or outdated, people will be frustrated about Docker – and that would be a shame, because Docker is really great!

Best regards,
Thomas