Preface
Understand disk space usage and reclaim the unused part
In this piece, we’ll go back to basics. We will look at how Docker uses the disk space of the host machine and how to reclaim that space when it is no longer in use.
Overall Consumption
Docker is great, there’s no doubt about that. A couple of years ago, it provided a new way to build, ship, and run any workload by democratizing the use of containers and hugely simplifying the management of their life cycle. It also gave developers the ability to run any application without polluting the local machine. But when we run containers, pull images, deploy complex application stacks, and build our own images, the footprint on the host filesystem can grow significantly.
If we have not cleaned up our local machine for a while we might be surprised by the result of this command:
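The command in question is docker system df, which the rest of this piece calls "the df command". A minimal sketch follows; the wrapper function name is hypothetical and only makes the command easy to reuse:

```shell
# Hypothetical helper (not part of Docker): summarize Docker's disk usage.
docker_disk_usage() {
  # "docker system df" prints one row each for images, containers,
  # local volumes, and build cache. Extra flags are passed through.
  docker system df "$@"
}

# Usage:
#   docker_disk_usage      # summary table
#   docker_disk_usage -v   # verbose, per-object breakdown
```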
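This command shows Docker’s disk usage in several categories:
* Images: the size of the images pulled from a registry or built locally
* Containers: the disk space used by the containers on the system, that is, the size of each container’s read-write layer
* Local Volumes: data persisted on the host, outside of the containers’ filesystems
* Build Cache: the cache generated by the image build process (with BuildKit, available from Docker 18.09)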
The RECLAIMABLE column of the output often shows that quite a lot of disk space can be reclaimed. In other words, as it’s not in use by Docker, it can be given back to the host machine.
Containers Disk Usage
Each time a container is created, several folders and files are created under /var/lib/docker on the host machine. Among them:
* the /var/lib/docker/containers/ID folder (ID being the container’s unique identifier)
* a folder within /var/lib/docker/overlay2
Let’s imagine we have a brand new system where Docker has just been installed, so that no images, containers, or volumes exist yet.
First, we start an NGINX container:
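A minimal sketch of that step. The container name www is the one used in the rest of this section; the function name is hypothetical, and the image tag is left to the default as an assumption:

```shell
# Hypothetical helper: start a detached NGINX container named "www".
start_www() {
  # -d runs the container in the background; --name gives it a stable
  # name so that later commands can refer to it.
  docker container run --name www -d nginx
}

# Usage: start_www && docker system df
```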
Running the df command again now shows one image and one active container.
There is no reclaimable space yet as the container is running and the image is currently in use. As the size of the container (2B) is negligible and thus not easy to track on the filesystem, let’s create an empty 100MB file in the container’s filesystem. For this purpose, we use the handy dd command from within the www container.
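One way to create such a file is sketched below. The path /test.img and the function name are assumptions chosen for illustration, and the bs=1M syntax assumes the GNU dd found in Debian-based images:

```shell
# Hypothetical helper: write a 100MB file of zeros inside the "www" container.
create_test_file() {
  # 100 blocks of 1MiB each, read from /dev/zero, written to /test.img
  # in the container's read-write layer.
  docker exec www dd if=/dev/zero of=/test.img bs=1M count=100
}

# Usage: create_test_file && docker system df
```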
This file is created in the read-write layer associated with this container. If we check the output of the df command again, we can see that the container now takes up about 100MB of additional disk space.
Where is this file located on the host? Let’s take a look:
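A simple way to look is to search Docker’s data directory for the file. The sketch below assumes a native Linux host, root privileges, the default data root /var/lib/docker, and a file named test.img (a hypothetical name); the function name is hypothetical too:

```shell
# Hypothetical helper: search Docker's data directory for the test file.
# The optional first argument overrides the default data root.
find_test_file() {
  data_root="${1:-/var/lib/docker}"
  find "$data_root" -type f -name test.img
}

# Usage (as root): find_test_file
```

With the overlay2 driver, any match should sit under a subdirectory of /var/lib/docker/overlay2.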
Without going too deep into the details, this file was created in the container’s read-write layer which is managed by the overlay2 driver. If we stop the container, the disk space used by the container becomes reclaimable. Let’s take a look:
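A sketch of that check, assuming the container is still named www (the function name is hypothetical):

```shell
# Hypothetical helper: stop the "www" container, then report disk usage.
stop_and_report() {
  # The df report only runs if the stop succeeded.
  docker container stop www && docker system df
}
```

After this, the RECLAIMABLE column of the Containers row should account for the space used by the stopped container.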
How can this space be reclaimed? By deleting the container, which also deletes its associated read-write layer. The following command allows us to delete all stopped containers at once and to reclaim the disk space they’re using:
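The prune subcommand for containers can be sketched as follows (the wrapper name is hypothetical):

```shell
# Hypothetical helper: remove all stopped containers.
remove_stopped_containers() {
  # Without flags, docker asks for confirmation; -f skips the prompt.
  docker container prune "$@"
}

# Usage:
#   remove_stopped_containers      # interactive
#   remove_stopped_containers -f   # no confirmation prompt
```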
Once the stopped containers are removed, the df command shows that containers no longer use any space and, as the image is not used anymore (no container is running), the space it uses on the host filesystem can be reclaimed.
Note: As long as an image is used by at least one container, the disk space it uses cannot be reclaimed.
The prune subcommand we used above removes only the stopped containers. If we need to remove all containers, running and stopped alike, we can use one of the following commands (both are equivalent):
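A sketch of both forms. The function wrapper and its empty-list guard are additions for illustration, not part of Docker:

```shell
# Hypothetical helper: force-remove every container, running or stopped.
remove_all_containers() {
  # -a lists all containers, -q prints only their IDs.
  ids="$(docker container ls -aq)"
  # Nothing to do on a clean system; avoids calling "rm" with no arguments.
  [ -n "$ids" ] || return 0
  # -f also removes running containers. $ids is intentionally unquoted
  # so that each ID becomes a separate argument.
  docker container rm -f $ids
}

# Equivalent short form, using the older top-level commands:
#   docker rm -f $(docker ps -aq)
```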
Note: It’s often useful to use the --rm flag when running a container so that it is automatically removed when its PID 1 process stops, thus releasing the disk space it used immediately.
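For instance, a throwaway container could be started like this (the function name and the alpine image in the usage line are assumptions for illustration):

```shell
# Hypothetical helper: run a container that cleans up after itself.
run_throwaway() {
  # --rm removes the container, and its anonymous volumes, once it exits.
  docker container run --rm "$@"
}

# Usage: run_throwaway alpine echo hello
```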
Images Disk Usage
A couple of years ago, it was common to have several hundred MB per image. Ubuntu was around 600MB, and Microsoft .NET images weighed several GB (true story). At that time, pulling only a couple of images could quickly impact the disk space of the host machine, even though layers are shared between images. This is less true today, as base images are much lighter, but over time, piled-up images will definitely have an impact if we’re not careful.
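There are several kinds of images that are not directly visible to the end user:
* intermediate images: layers that other images are built on; they are hidden from the default image listing
* dangling images: images that are no longer tagged and are not used by any container; they consume disk space for nothing and can be deleted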
The following command lists the dangling images on the system:
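A sketch using the dangling filter of docker image ls (the wrapper name is hypothetical):

```shell
# Hypothetical helper: list dangling images.
list_dangling_images() {
  # -f dangling=true keeps only dangling images; pass -q to get bare IDs.
  docker image ls -f dangling=true "$@"
}

# Usage:
#   list_dangling_images      # table
#   list_dangling_images -q   # IDs only
```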
To remove the dangling images we can go the long way:
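The long way feeds the IDs of the dangling images to docker image rm. A sketch, with a hypothetical function name and an added guard for the empty case:

```shell
# Hypothetical helper: remove dangling images by listing them first.
remove_dangling_images() {
  # Collect the IDs (-q) of all dangling images.
  ids="$(docker image ls -f dangling=true -q)"
  # Skip the removal when there is nothing to remove.
  [ -n "$ids" ] || return 0
  # Unquoted on purpose: one argument per image ID.
  docker image rm $ids
}
```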
Or we can use the prune subcommand:
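For images, the prune subcommand can be sketched as follows (the wrapper name is hypothetical):

```shell
# Hypothetical helper: remove dangling images in one step.
prune_dangling_images() {
  # By default only dangling images are removed; -f skips the prompt.
  docker image prune "$@"
}

# Usage: prune_dangling_images -f
```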
To remove all images at once (not only the dangling ones), we can run the following command, although it will not be able to remove images currently used by a container:
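A sketch of that removal. The function name, the de-duplication of IDs, and the empty-list guard are additions for illustration:

```shell
# Hypothetical helper: try to remove every image on the system.
remove_all_images() {
  # An image with several tags is listed once per tag, so de-duplicate IDs.
  ids="$(docker image ls -q | sort -u)"
  [ -n "$ids" ] || return 0
  # Images still used by a container are reported as errors and kept.
  docker image rm $ids
}
```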
Volumes Disk Usage
Volumes are used to store data outside of a container’s filesystem. For instance, when a container runs a stateful application, we want the data to be persisted outside of the container so that it is decoupled from the container’s life cycle. Volumes are also used because heavy write activity in a container’s read-write layer goes through the storage driver, which is bad for performance.
Say we run a container based on MongoDB and then use it to test a backup we previously did (available locally in the bck.json file):
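A sketch of such a test, under several assumptions: the container name db, the /backup mount point, and the database and collection names are all hypothetical, mongoimport is assumed to be available in the mongo image, and bck.json is assumed to hold a single JSON array:

```shell
# Hypothetical helper: start MongoDB with the current directory mounted
# read-only at /backup, so the container can read bck.json.
start_db() {
  docker container run --name db -d -v "$PWD":/backup:ro mongo
}

# Hypothetical helper: import the backup once MongoDB accepts connections.
import_backup() {
  docker exec db mongoimport --db test --collection todos \
    --file /backup/bck.json --jsonArray
}

# Usage: start_db, wait a few seconds for MongoDB to start, then import_backup
```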
The data within the backup file will be stored on the host in the /var/lib/docker/volumes folder. Why is this data not saved within the container’s layer? Because in the mongo image’s Dockerfile, the location /data/db (where mongo stores its data by default) is declared as a volume with the VOLUME instruction.
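Rather than reading the Dockerfile, the volumes an image declares can be checked from its metadata. A sketch, with a hypothetical function name:

```shell
# Hypothetical helper: print the volumes declared by an image, as JSON.
image_volumes() {
  # .Config.Volumes holds the mount points declared with VOLUME.
  docker image inspect --format '{{json .Config.Volumes}}' "$1"
}

# Usage: image_volumes mongo
```

For the mongo image, the result should include /data/db.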
Note: Many images, often related to stateful applications, define volumes to manage data outside of the container’s layer.
When we are done testing the backup, we stop or remove the container. But the volume is not removed: it stays there, consuming disk space, unless we explicitly remove it. To remove the volumes that are no longer used, we can go the long way:
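The long way lists the volumes that no container references and passes them to docker volume rm. A sketch, with a hypothetical function name and an added guard for the empty case:

```shell
# Hypothetical helper: remove volumes not referenced by any container.
remove_dangling_volumes() {
  # -q prints only names, -f dangling=true keeps unreferenced volumes.
  names="$(docker volume ls -qf dangling=true)"
  [ -n "$names" ] || return 0
  # Unquoted on purpose: one argument per volume name.
  docker volume rm $names
}
```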
Or we can use the prune subcommand:
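For volumes, the prune subcommand can be sketched as follows (the wrapper name is hypothetical). Recent Docker releases appear to prune only anonymous volumes unless --all is passed, which is worth checking against the installed version:

```shell
# Hypothetical helper: remove unused local volumes in one step.
prune_volumes() {
  # -f skips the confirmation prompt; other flags are passed through.
  docker volume prune "$@"
}

# Usage: prune_volumes -f
```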
Build Cache Disk Usage
The Docker 18.09 release introduced enhancements to the build process through BuildKit. Using this tool can improve build performance, storage management, functionality, and security. We won’t detail BuildKit in this piece; we will just look at how to enable it and how it affects disk usage.
Let’s consider the following dummy Node.js application and its associated Dockerfile.

The index.js file defines a simple HTTP server which exposes the ‘/’ endpoint and replies with a string for each request received:

    var express = require('express');
    var util = require('util');
    var app = express();

    // Reply to every GET / with the current date and a fixed message.
    app.get('/', function(req, res) {
      res.setHeader('Content-Type', 'text/plain');
      res.end(util.format("%s - %s", new Date(), 'Got Request'));
    });

    // Listen on $PORT if it is set, otherwise on port 80.
    app.listen(process.env.PORT || 80);

The package.json file declares the application’s only dependency, express:

    {
      "name": "testnode",
      "version": "0.0.1",
      "main": "index.js",
      "scripts": {
        "start": "node index.js"
      },
      "dependencies": {
        "express": "^4.14.0"
      }
    }

The Dockerfile describes how the image is built:

    FROM node:13-alpine
    # Copy package.json on its own first so that the npm install layer
    # is rebuilt only when the dependencies change.
    COPY package.json /app/package.json
    RUN cd /app && npm install
    COPY . /app/
    WORKDIR /app
    EXPOSE 80
    CMD ["npm", "start"]
Let’s first build version 1.0 of the image without BuildKit. If we then check the disk usage, we only see the base image (node:13-alpine, pulled at the beginning of the build) and the final image of the build (app:1.0).
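The app:1.0 build mentioned above could be run as sketched below. The function name is hypothetical, and setting DOCKER_BUILDKIT=0 is an assumption: it requests the classic builder explicitly, which matters on releases where BuildKit is already the default (and may not be honored where the classic builder is no longer shipped):

```shell
# Hypothetical helper: build the image in the current directory with the
# classic builder, tagged app:<version>.
build_classic() {
  DOCKER_BUILDKIT=0 docker build -t "app:$1" .
}

# Usage: build_classic 1.0 && docker system df
```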
Let’s now build version 2.0 of the image using BuildKit. We just need to set the DOCKER_BUILDKIT environment variable to 1:
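A sketch of the BuildKit build, run from the directory that contains the Dockerfile (the function name is hypothetical):

```shell
# Hypothetical helper: build the image in the current directory with
# BuildKit enabled, tagged app:<version>.
build_with_buildkit() {
  # The variable only applies to this one docker invocation.
  DOCKER_BUILDKIT=1 docker build -t "app:$1" .
}

# Usage: build_with_buildkit 2.0 && docker system df
```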
If we check the disk usage once more, we can see build-cache was created:
    $ docker system df
    TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
    Images          2         0         109.3MB   109.3MB (100%)
    Containers      0         0         0B        0B
    Local Volumes   0         0         0B        0B
    Build Cache     11        0         8.949kB   8.949kB
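Like the other categories, the build cache has its own prune subcommand. A sketch, with a hypothetical wrapper name:

```shell
# Hypothetical helper: remove the build cache.
prune_build_cache() {
  # -f skips the confirmation prompt; other flags are passed through.
  docker builder prune "$@"
}

# Usage: prune_build_cache -f
```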
Cleaning Everything at Once
As we saw in the examples above, each of the container, image, and volume commands provides a prune subcommand to reclaim disk space. The prune subcommand is also available at Docker’s system level, where it reclaims unused disk space across these categories at once:
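A sketch of the system-level prune (the wrapper name is hypothetical). By default it leaves volumes alone and removes only dangling images; the --volumes and -a flags widen its scope:

```shell
# Hypothetical helper: reclaim unused disk space across categories.
prune_everything() {
  # Removes stopped containers, unused networks, dangling images,
  # and unused build cache. Extra flags are passed through.
  docker system prune "$@"
}

# Usage:
#   prune_everything                 # default scope, asks for confirmation
#   prune_everything -a --volumes    # also unused images and volumes
```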
Running this command once in a while to clean up the disk is a good habit to have.