程式扎記: Docker URun - Ch4 - Working with Docker Images

Thursday, September 24, 2015

Docker URun - Ch4 - Working with Docker Images

Preface

Every Docker container is based on an image, which provides the basis for everything that you will ever deploy and run with Docker. To launch a container, you must either download a public image or create your own. Every Docker image consists of one or more filesystem layers that generally have a direct one-to-one mapping to each individual build step used to create that image.

For image management, Docker relies heavily on its storage backend, which communicates with the underlying Linux filesystem to build and manage the multiple layers that combine into a single usable image. The primary storage backends that are supported include: AUFS, BTRFS, Device-mapper, and overlayfs. Each storage backend provides a fast copy-on-write (CoW) system for image management.

Anatomy of a Dockerfile#

To create a custom Docker image with the default tools, you will need to become familiar with the Dockerfile. This file describes all the steps that are required to create one image and would usually be contained within the root directory of the source code repository for your application.

A typical Dockerfile might look something like the one shown here, which will create a container for a Node.js-based application:
FROM node:0.10
MAINTAINER Anna Doe <anna@example.com>
LABEL "rating"="Five Stars" "class"="First Class"
USER root
ENV AP /data/app
ENV SCPATH /etc/supervisor/conf.d
RUN apt-get -y update
# The daemons
RUN apt-get -y install supervisor
RUN mkdir -p /var/log/supervisor
# Supervisor Configuration
ADD ./supervisord/conf.d/* $SCPATH/
# Application Code
ADD *.js* $AP/
WORKDIR $AP
RUN npm install
CMD ["supervisord", "-n"]

Dissecting this Dockerfile will provide some initial exposure to a number of the possible instructions that you can use to control how an image is assembled. Each line in a Dockerfile creates a new image layer that is stored by Docker. This means that when you build new images, Docker will only need to build layers that deviate from previous builds.

Although you could build a Node instance from a plain, base Linux image, you can also explore the Docker Registry for official images for Node. Node.js maintains a series of Docker images and tags, which let you quickly determine that your image should inherit from node:0.10, which will pull the most recent Node.js version 0.10 image. If you want to lock the image to a specific version of Node, you could instead point it at node:0.10.33. The base image that follows will provide you with a Debian-based Linux image running Node 0.10.x:
FROM node:0.10

The MAINTAINER field provides contact information for the Dockerfile’s author, which populates the Author field in all resulting images’ metadata:
MAINTAINER Anna Doe <anna@example.com>

The ability to apply labels to images and containers was added to Docker in version 1.6. This means that you can now add metadata via key-value pairs that can later be used to search for and identify Docker images and containers. You can see the labels applied to any image using the docker inspect command:
LABEL "rating"="Five Stars" "class"="First Class"

By default, Docker runs all processes as root within the container, but you can use the USER instruction to change this:
USER root

The ENV instruction allows you to set shell variables that can be used during the build process to simplify the Dockerfile:
ENV AP /data/app
ENV SCPATH /etc/supervisor/conf.d

In the following code, you’ll use a collection of RUN instructions to create the file structure that you need and install some required software dependencies. You’ll also start to use the build variables you defined in the previous section to save you a bit of work and help protect you from typos:
RUN apt-get -y update
# The daemons
RUN apt-get -y install supervisor
RUN mkdir -p /var/log/supervisor

Note.
It is generally considered a bad idea to run commands like apt-get -y update or yum -y update in your application Dockerfiles because it can significantly increase the time it takes for all of your builds to finish. Instead, consider basing your application image on another image that already has these updates applied to it.

The ADD instruction is used to copy files from the local filesystem into your image. Most often this will include your application code and any required support files:
# Supervisor Configuration
ADD ./supervisord/conf.d/* $SCPATH/
# Application Code
ADD *.js* $AP/

With the WORKDIR instruction, you change the working directory in the image for the remaining build instructions:
WORKDIR $AP
RUN npm install

And finally you end with the CMD instruction, which defines the command that launches the process that you want to run within the container:
CMD ["supervisord", "-n"]

Building an Image#

To build your first image, let’s go ahead and clone a git repo that contains an example application called docker-node-hello, as shown here:
$ git clone https://github.com/spkane/docker-node-hello.git
Cloning into 'docker-node-hello'... 
... 
Checking connectivity... done. 

$ cd docker-node-hello

This will download a working Dockerfile and related source code files into a directory called docker-node-hello. If you look at the contents while ignoring the git repo directory, you should find, among other files, the Dockerfile, a .dockerignore file, the application's index.js, and a supervisord directory.

Let’s review the most relevant files in the repo.

The Dockerfile should be exactly the same as the one you just reviewed.

The .dockerignore file allows you to define files and directories that you do not want uploaded to the Docker host when you are building the image. In this instance, the .dockerignore file contains the following line:
.git

This instructs docker build to exclude the .git directory, which contains the whole source code repository. You do not need this directory to build the Docker image, and since it can grow quite large over time, you don’t want to waste time copying it every time you do a build.
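
As a rough sketch of how a project might extend this file, the following shell snippet writes a .dockerignore into a scratch directory. Only the .git entry comes from the example repository; node_modules and *.log are assumed entries chosen for illustration, and the scratch directory stands in for a real project root.

```shell
# Scratch directory standing in for a project root (hypothetical).
workdir=$(mktemp -d)
cd "$workdir"

# Write one pattern per line. Only ".git" is taken from the example repo;
# "node_modules" and "*.log" are assumed additions for illustration.
cat > .dockerignore <<'EOF'
.git
node_modules
*.log
EOF

# Show what docker build would be told to leave out of the build context.
cat .dockerignore
```

Patterns are matched relative to the root of the build context, so the file normally sits next to the Dockerfile.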

The supervisord directory contains the configuration files for supervisord that you will need to start and monitor the application. As we discussed in Chapter 3, you will need to have your Docker server running and your client properly set up to communicate with it before you can build a Docker image. Assuming that this is all working, you should be able to initiate a new build by running the command below, which will build and tag an image based on the files in the current directory.

Each step identified in the following output maps directly to a line in the Dockerfile, and each step creates a new image layer based on the previous step:
// -t, --tag=: Repository name (and optionally a tag) for the image
# docker build -t example/docker-node-hello:latest .
...
Step 12 : CMD supervisord -n 
---> Running in e454bc764ad8 
---> 808ad0c6921f 
Removing intermediate container e454bc764ad8 
Successfully built 808ad0c6921f 

# docker images | grep example/docker-node-hello
example/docker-node-hello latest 808ad0c6921f About a minute ago 649.3 MB

Note.
To improve the speed of builds, Docker will use a local cache when it thinks it is safe. This can sometimes lead to unexpected issues. In the output above you will notice lines like ---> Running in e454bc764ad8. If instead you see ---> Using cache, you know that Docker decided to use the cache. You can disable the cache for a build by using the --no-cache argument to the docker build command.
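
One way to check how much of a build was served from the cache is to count the marker lines in saved build output. The sketch below assumes the older output format quoted in this chapter and uses a small made-up log; the step numbers and their contents are illustrative, not captured from a real build.

```shell
# Made-up build output for illustration. In practice you might capture it with:
#   docker build -t example/docker-node-hello:latest . | tee build.log
log=$(mktemp)
cat > "$log" <<'EOF'
Step 11 : RUN npm install
 ---> Using cache
Step 12 : CMD supervisord -n
 ---> Running in e454bc764ad8
EOF

# -e keeps grep from treating the leading dashes as an option.
cached=$(grep -c -e '---> Using cache' "$log")
executed=$(grep -c -e '---> Running in' "$log")
echo "cached=$cached executed=$executed"

rm -f "$log"
```

If the cached count is higher than expected after a change that should have invalidated a step, rebuilding with --no-cache is the blunt but reliable fix.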

Running Your Image#

Once you have successfully built the image, you can run it on your Docker host with the following command:
# docker run -d -p 8080:8080 example/docker-node-hello:latest

The above command tells Docker to create a running container in the background from the image with the example/docker-node-hello:latest tag, and then map port 8080 in the container to port 8080 on the Docker host.

If everything goes as expected, the Node.js application should be running in a container on the host. You can verify this by running docker ps. To see the running application in action, you will need to open up a web browser and point it at port 8080 on the Docker host.
# ps aux | grep docker-proxy | grep -v grep
root 22458 0.0 0.9 153024 19972 pts/0 Sl+ 01:25 0:00 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8080 -container-ip 172.17.0.11 -container-port 8080
# curl 172.17.0.11:8080
Hello World. Wish you were here.

Environment Variables#

If you read the index.js file, you will notice that part of the file refers to the variable $WHO, which the application uses to determine who it is going to say Hello to:
var DEFAULT_WHO = "World";
var WHO = process.env.WHO || DEFAULT_WHO;
app.get('/', function (req, res) {
    res.send('Hello ' + WHO + '. Wish you were here.\n');
});

Let’s quickly learn how you can configure this application by passing in environment variables when you start it. First you need to stop the existing container using two commands.
# docker ps
...
12fce8815778 example/docker-node-hello:latest "supervisord -n" 11 minutes ago Up 11 minutes 0.0.0.0:8080->8080/tcp stupefied_colden
# docker stop 12fce8815778

You can then restart the container by adding one argument to the previous docker run command:
// -e, --env=: Set environment variables
# docker run -d -p 8080:8080 -e WHO="Sean and Karl" example/docker-node-hello:latest
e1860d7986dd5116fc18bc71e424b566a64b727d340ece316e128ce59e44d103

If you reload your web browser, you should see that the text on the web page now reads:
# ps aux | grep docker-proxy | grep -v grep
root 22832 0.0 0.8 144828 16860 pts/0 Sl+ 01:38 0:00 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8080 -container-ip 172.17.0.12 -container-port 8080
# curl 172.17.0.12:8080
Hello Sean and Karl. Wish you were here.
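
The fallback in index.js (use WHO if it is set, otherwise a default) can be mimicked in plain shell, which may help when reasoning about what -e changes. This is only an analogy under that assumption: the greet function below is hypothetical and is not part of the example application.

```shell
# Shell analogue of `process.env.WHO || DEFAULT_WHO` from index.js.
# ${WHO:-World} falls back to "World" when WHO is unset or empty.
greet() {
    echo "Hello ${WHO:-World}. Wish you were here."
}

unset WHO
greet                        # no variable set: the default is used
WHO="Sean and Karl" greet    # variable supplied for this one call
```

Passing -e WHO="Sean and Karl" to docker run plays the same role as the per-call assignment on the last line: it places the variable in the environment of the process started inside the container.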

Custom Base Images#

Base images are the lowest-level images that other Docker images will build upon. Most often, these are based on minimal installs of Linux distributions like Ubuntu, Fedora, or CentOS, but they can also be much smaller, containing a single statically compiled binary. For most people, using the official base images for their favorite distribution or tool is a great option.

However, there are times when it is preferable to build your own base images that are not based on an image created by someone else. One reason to do this would be to maintain a consistent OS image across all your deployment methods for hardware, VMs, and containers. Another would be to get the image size down substantially. There is no need to ship around an entire Ubuntu distribution, for example, if your application is a statically built C or Go application. You might find that you only need the tools you regularly use for debugging and some other shell commands and binaries. Making the effort to build such an image could pay off in better deployment times and easier application distribution.

In the official Docker documentation, there is some good information about how you can build base images on the various Linux distributions.

Storing Images#

Now that you have created a Docker image that you’re happy with, you’ll want to store it somewhere so that it can be easily accessed by any Docker host that you want to deploy it to. This is also the clear hand-off point between building images and putting them somewhere to run. You don’t normally build the images on the server and then run them. Ordinarily, deployment is the process of pulling an image from a repository and running it on one or more Docker servers. There are a few ways you can go about storing your images into a central repository for easy retrieval.

Public Registries#

Docker provides an image registry for public images that the community wants to share. These include official images for Linux distributions, ready-to-go WordPress containers, and much more.

If you have images that can be published to the Internet, the best place for them is a public registry, like Docker Hub. However, there are other options. When the core Docker tools were first gaining popularity, Docker Hub did not exist. To fill this obvious void in the community, Quay.io was created. Since then, Quay.io has been purchased by CoreOS and has been used to create the CoreOS Enterprise Registry product, which we will discuss in a moment.

Both Docker Hub and Quay.io provide centralized Docker image registries that can be accessed from anywhere on the Internet, and provide a method to store private images in addition to public ones. Both have nice user interfaces and the ability to separate team access permissions and manage users. Both also offer reasonable commercial options for private SaaS hosting of your images, much in the same way that GitHub sells private repositories on its systems. This is probably the right first step if you’re getting serious about Docker but are not yet shipping enough code to need an internally hosted solution.

For companies that use Docker heavily, the biggest downside to these registries is that they are not local to the network on which the application is being deployed. This means that every layer of every deployment might need to be dragged across the Internet in order to deploy an application. Internet latencies have a very real impact on software deployments, and outages that affect these registries could have a very detrimental impact on a company’s ability to deploy smoothly and on schedule. This is mitigated by good image design where you make thin layers that are easy to move around the Internet.

Private Registries#

The other option that many companies consider is to host some type of Docker image registry internally. Before the public registry existed for Docker, the Docker developers released the docker-registry project on GitHub. The docker-registry is a GUI-less Python daemon that can interact with the Docker client to support pushing, pulling, and searching images. Originally it did not support any form of authentication, but this has been fixed, and in addition to local file storage, the open source docker-registry now supports S3, Azure, and a few other storage backends.

Another strong contender in the private registry space is the CoreOS Enterprise Registry. When CoreOS bought Quay.io, it quickly took the codebase and made it available as an easily deployable Docker container. This product basically offers all the same features as Quay.io, but can be deployed internally. It ships as a virtual machine that you run as an appliance, and supports the same UI and interfaces as the public Quay.io.

In December of 2014, Docker announced that it was working to develop Docker Hub Enterprise (DHE), which will allow organizations to have a Docker-supported on-premise image registry in their data center or cloud environment.

Authenticating to a Registry#

Communicating with a registry that stores container images is part of daily life with Docker. For many registries, this means you’ll need to authenticate to gain access to images. But Docker also tries to make it easy to automate things so it can store your login information and use it on your behalf when you request things like pulling down a private image. By default, Docker assumes the registry will be Docker Hub, the public repository hosted by Docker, Inc.

Creating a Docker Hub account
For these examples, we will create an account on Docker Hub. You don’t need an account to use publicly shared images, but you will need one to upload your own public or private containers.

To create your account, use your web browser of choice to navigate to Docker Hub. From there, you can either log in via an existing GitHub account or create a new login based on your email address. When you first log in to your new account, you will land on the Docker welcome page, which is where you can configure details about your account.

When you create your account, Docker Hub sends a verification email to the address that you provided during signup. You should immediately log in to your email account and click the verification link inside the email to finish the validation process.

At this point, you have created a public registry to which you can upload new images. The “Global settings” option in your account sidebar will allow you to change your registry into a private one if that is what you need.

Logging in to a registry
Now let’s log in to the Docker Hub registry using our account:
# docker login
Username: johnklee
Password: <Enter password>
Email: puremonkey2001@yahoo.com.tw
WARNING: login credentials saved in /root/.docker/config.json 
Login Succeeded

When we get “Login Succeeded” back from the server, we know we’re ready to pull images from the registry. But what happened under the covers? It turns out that Docker has written a dotfile for us in our home directory to cache this information. The permissions are set to 0600 as a security precaution against other users reading your credentials. You can inspect the file with something like:
# ls -al ~/.docker
... 
-rw------- 1 root root 233 Sep 20 04:27 config.json 

# cat ~/.docker/config.json
{
        "auths": {
                "docker:8080": {
                        "auth": "xxxxxxx",
                        "email": "jkclee@tw.ibm.com"
                },
                "https://index.docker.io/v1/": {
                        "auth": "xxxx",
                        "email": "puremonkey2001@yahoo.com.tw"
                }
        }
}

Here we can see the config.json file, owned by root, and the stored credentials in JSON format. From now on, when the registry needs authentication, Docker will look in config.json to see if we have credentials stored for this hostname. If so, it will supply them. You will notice that one value is completely lacking here: a timestamp. These credentials are cached forever or until we tell Docker to remove them, whichever comes first.
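
It is worth knowing why the 0600 permissions matter: the auth value in config.json is the username and password joined by a colon and base64-encoded, which is an encoding rather than encryption. The sketch below demonstrates the round trip with made-up credentials (anna and s3cret are hypothetical and appear nowhere in the original file).

```shell
# Hypothetical credentials, for illustration only.
user='anna'
pass='s3cret'

# Encode "user:pass" the way the auth field stores it.
auth=$(printf '%s' "$user:$pass" | base64)
echo "auth field would hold: $auth"

# Anyone who can read the file can reverse it just as easily.
decoded=$(printf '%s' "$auth" | base64 --decode)
echo "decoded: $decoded"
```

Because the value is trivially reversible, treat config.json like a password file and log out of registries on shared machines.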

Just like logging in, we can also log out of a registry if we no longer want to cache the credentials:
# docker logout
# cat ~/.docker/config.json
{
        "auths": {
                "docker:8080": {
                        "auth": "xxx",
                        "email": "jkclee@tw.ibm.com"
                }
        }
}

Here we removed our cached credentials and they are no longer stored. If we were trying to log in to something other than the Docker Hub registry, we could supply the hostname on the command line:
# docker login someregistry.example.com

Mirroring a Registry#

It is possible to set up a local registry in your network that will mirror images from the upstream public registry so that you don’t need to pull commonly used images all the way across the Internet every time you need them on a new host. This can even be useful on your development workstation so that you can keep a local stash of frequently used images that you might need to access offline.

Configuring the Docker daemon
To do this, the first thing that you need to do is relaunch your Docker daemon with the --registry-mirror command-line argument, replacing ${YOUR_REGISTRY-MIRROR-HOST} with your Docker server’s IP address and port number (e.g., 172.17.42.10:5000).
Note.
If you plan to run the docker-registry container on your only Docker server, you can set ${YOUR_REGISTRY-MIRROR-HOST} to localhost:5000.

If you already have Docker running, you need to stop it first. This is distribution-specific. You should use the commands you normally use on your distribution, like initctl, service, or systemctl, to stop the daemon. Then we can invoke it manually with this registry mirroring option:
$ docker -d --registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}

If you would like to ensure that your Docker daemon always starts with this setup, you will need to edit the appropriate configuration file for your Linux distribution.
  • Boot2Docker. Create /var/lib/boot2docker/profile if it doesn’t already exist:
$ sudo touch /var/lib/boot2docker/profile

Then edit /var/lib/boot2docker/profile and append the argument to your EXTRA_ARGS:
EXTRA_ARGS="--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}"

And then restart the docker daemon:
$ sudo /etc/init.d/docker restart

  • Ubuntu. Edit /etc/default/docker and append the argument to your DOCKER_OPTS:
DOCKER_OPTS="--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}"

And then restart the docker daemon:
$ sudo service docker.io restart

  • Fedora. Edit /etc/sysconfig/docker and append the argument to your OPTIONS:
OPTIONS="--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}"

And then restart the docker daemon:
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker

  • CoreOS. First copy the systemd unit file for Docker to a writeable filesystem:
$ sudo cp /usr/lib/systemd/system/docker.service /etc/systemd/system/

Then, as root, edit /etc/systemd/system/docker.service and append the argument to the end of the ExecStart line:
ExecStart=/usr/lib/coreos/dockerd --daemon --host=fd:// \
$DOCKER_OPTS $DOCKER_OPT_BIP $DOCKER_OPT_MTU $DOCKER_OPT_IPMASQ \
--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}

And then restart the docker daemon:
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker

Launching the local registry mirror service
You will now need to launch a container on your Docker host that will run the registry mirror service and provide you with a local cache of Docker images. You can accomplish this by running the registry image as a container with a few important environment variables defined and a storage volume mounted.

On your Docker server, ensure that you have a directory for storing the images:
$ mkdir -p /var/lib/registry

Then you can launch the container, with the following options defined:
// Readme.md
$ docker run -d -p 5000:5000 \
-v /var/lib/registry:/tmp/registry \
-e SETTINGS_FLAVOR=dev \
-e STANDALONE=false \
-e MIRROR_SOURCE=https://registry-1.docker.io \
-e MIRROR_SOURCE_INDEX=https://index.docker.io \
registry

Testing the local registry mirror service
Now that the registry is running as a mirror, we can test it. On a Unix-based system, you can time how long it takes to download the newest CentOS image, using the following command:
$ time docker pull centos:latest
Pulling repository centos 
88f9454e60dd: Download complete 
511136ea3c5a: Download complete 
5b12ef8fd570: Download complete 
Status: Downloaded newer image for centos:latest 
real 1m25.406s 
user 0m0.019s 
sys 0m0.014s

In this case, it took 1 minute and 25 seconds to pull the whole image. If we then delete the image from the Docker host and time fetching it again, we will see a significant difference:
$ docker rmi centos:latest
Untagged: centos:latest
$ time docker pull centos:latest
Pulling repository centos 
88f9454e60dd: Download complete 
511136ea3c5a: Download complete 
5b12ef8fd570: Download complete 
Status: Image is up to date for centos:latest 
real 0m2.042s 
user 0m0.004s 
sys 0m0.005s

Both times that you pulled the centos:latest image, the Docker server connected to the local registry mirror service and asked for the image. In the first case, the mirror service did not have the image so it had to pull it from the official docker-registry first, add it to its own storage, and then deliver it to the Docker server. After you delete the image from the Docker server and then request it again, you’ll see that the time to pull the image will drop to be very low. In the previous code, it took only two seconds for the Docker server to receive the image. This is because the local registry mirror service had a copy of the image and could provide it directly to the server without pulling anything from the upstream public docker-registry.

Other Approaches to Image Delivery#

Over the last two years, the community has explored many other approaches to managing Docker images and providing simple but reliable access to images when needed. Some of these projects, like dogestry, leverage the docker save and docker load commands to create and load images from cloud storage like Amazon S3. Other people are exploring the possibilities of using torrents to distribute Docker images, with projects like torrent-docker. Torrents seem like a natural fit because deployment is usually done to a group of servers on the same network all at the same time. Solomon Hykes recently stated that the Docker Distribution project will soon ship a command-line tool for importing and exporting image layers even without a Docker daemon. This will facilitate even more diverse methods of image distribution. As more and more companies and projects begin to use Docker seriously, even more robust solutions are likely to appear to meet the needs of anyone’s unique workflow and requirements.

If you have a scenario in which you can’t use the off-the-shelf mechanisms, such as an isolated network for security concerns, you can leverage Docker’s built-in importing and exporting features to dump and load new images. Unless you have a specific reason to do otherwise, you should use one of the off-the-shelf solutions and only consider changing your approach when needed. The available options will work for almost everyone.
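
For the isolated-network case, the dump-and-load workflow can be wrapped in two small shell functions around docker save and docker load. This is a minimal sketch: the function names, the tarball path, and the DOCKER override variable are conventions of this example rather than Docker features. Setting DOCKER=echo gives a dry run that prints the arguments instead of executing them.

```shell
# DOCKER may be overridden (e.g. DOCKER=echo) for a dry run; this override
# is a convention of this sketch, not something Docker itself provides.

# Write an image and all of its layers to a tarball.
# usage: export_image IMAGE TARBALL
export_image() {
    "${DOCKER:-docker}" save -o "$2" "$1"
}

# Load a tarball produced by export_image into the local Docker host.
# usage: import_image TARBALL
import_image() {
    "${DOCKER:-docker}" load -i "$1"
}

# Example, on a host that can reach the registry:
#   export_image example/docker-node-hello:latest /tmp/docker-node-hello.tar
# Copy the tarball across by whatever means the network allows, then:
#   import_image /tmp/docker-node-hello.tar
```

The tarball carries the image's tags along with its layers, so the loaded image should appear under the same name on the destination host.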
