3. Best Practices for Optimising Docker Images Flashcards by Alican Demirtas

Why is it important to understand the anatomy of Docker images?

Understanding how Docker images are constructed is key to managing their size.

Where is the configuration data for an image located in the result from the docker inspect command?

Inside the top-level Config object.

{
   ...
   "Config": { ... }
   ...
}

Where is the working directory path for an image located in the result from the docker inspect command?

Inside the WorkingDir field of the top-level Config object.

{
   ...
   "Config": { 
        ...
        "WorkingDir": ".",
        ... 
    },
   ...
}

How can we inspect a Docker image?

By running the docker inspect command with the image’s tag or ID.

docker inspect mini:1.0

Where is the filesystem definition for derived containers located in the result from the docker inspect command?

Inside the top-level RootFS object.

{
   ...
   "RootFS": { ... }
   ...
}

Where is the content layers information for the image located in the result from the docker inspect command?

Inside the Layers field of the top-level RootFS object.

{
   ...
   "RootFS": {
        ...
        "Layers": [ ... ],
        ... 
   }
   ...
}

What are the three types of Dockerfile instructions?

Instructions — Dockerfile instructions define the content and nature of images.
Metadata — instructions that define how derived containers will get executed.
Content — instructions that create files and directories for the image.
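The split between content and metadata instructions can be sketched as follows. This is only an illustration: the base image, the hello.sh script, and the GREETING variable are assumptions, not taken from the cards.

```dockerfile
# Hypothetical base image, chosen only for illustration.
FROM alpine:3.19

# Content instructions: these create files and directories in the image.
COPY hello.sh /usr/local/bin/hello.sh
RUN chmod +x /usr/local/bin/hello.sh

# Metadata instructions: these describe how derived containers are executed.
# Their values show up in the Config object reported by docker inspect.
ENV GREETING=hello
EXPOSE 8080
CMD ["hello.sh"]
```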

What are the three content-creating Dockerfile instructions?

COPY — used to copy the build context into an image.
ADD — similar to COPY instruction, but can retrieve remote content.
RUN — executes commands to generate additional image content.
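A minimal sketch of the three content-creating instructions side by side. The base image, the app/ directory, and the example.com URL are placeholders assumed for illustration.

```dockerfile
# Hypothetical base image, chosen only for illustration.
FROM alpine:3.19

# COPY: copies files from the build context into the image.
COPY app/ /opt/app/

# ADD: like COPY, but the source may also be a remote URL
# (example.com is a placeholder, not a real download location).
ADD https://example.com/archive.tar.gz /opt/downloads/

# RUN: executes a command at build time; the resulting filesystem
# changes become additional image content.
RUN mkdir -p /opt/app/logs
```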

What are the three Dockerfile instructions that add a new content layer to the image?

COPY, ADD, and RUN.

Which is the recommended method of copying content from the build context into the image context, COPY or ADD?

COPY wherever we can.

What does Docker do with the layers created by content-creating Dockerfile instructions?

When a container is created, Docker assembles the content of each of the image layers to present a single, unified set of files and directories. This forms the basis of the container filesystem.

What can be said about the mutability of a Docker image?

The content of an image is read-only; this is how Docker achieves immutability for images.

What can be said about the duplication/sharing of a Docker image’s content when creating other images or building containers from it?

Because image layers are immutable, we can build images from other images, and create containers from them, that all share the same content without having to duplicate it.

How can containers write to their filesystem?

To allow this to happen, Docker adds a final layer on top of the layers from the image that the container is built from. This is a unique, temporary writeable layer for every container instance — it’s not a part of the image and is removed when the container is removed.

What does copy-on-write mean?

If the container needs to alter the content of a file that is built in one of the read-only layers, it first copies the content to the writeable layer where it can be amended without affecting the content in the image.

Where does the container create new files/directories?

In a final, temporary, writeable layer that is separate from all the other layers, which are immutable.

What happens when a container attempts to edit some content that is inside one of its immutable layers?

The content is copied into the writeable layer where it can be amended without affecting the content in the image.

What happens when a container attempts to delete content that is inside one of its immutable layers?

The content remains in the image layer but is obscured in the union of the layers using a technique called whiteout.

Is it a good idea to use a RUN instruction to add a package that we temporarily need for another command, and then remove it with another RUN instruction once we’re done using it? Why?

No, it’s not. When we delete something from an image using RUN, it is only obscured in the layer where the deletion is issued; it still exists in the layer where it was created. This means the image size will stay the same, or might even grow!
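A sketch of this anti-pattern, using separate RUN instructions (the instruction that executes commands at build time). The Debian base image, the curl package, and the example.com URL are assumptions chosen for illustration.

```dockerfile
FROM debian:bookworm-slim

# Layer 1: the package files are written here and stay in this layer.
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates

# Layer 2: the temporary tool is used (placeholder URL).
RUN curl -fsSL -o /opt/data.json https://example.com/data.json

# Layer 3: the removal only hides the files in the union of layers;
# the content in layer 1 still counts towards the image size.
RUN apt-get purge -y curl && rm -rf /var/lib/apt/lists/*
```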

What is a trick we can use to install a package and delete it after its use in the Dockerfile without increasing the image size?

Instead of using a separate RUN instruction for each operation, use a single RUN and chain the commands with the shell’s logical “and” operator (&&), so that they execute in sequence within one layer.
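A minimal sketch of the chained form, assuming a Debian-based image; the curl package and the example.com URL are placeholders. Because installation, use, and removal all happen inside one instruction, the temporary package never ends up in a committed layer.

```dockerfile
FROM debian:bookworm-slim

# Install, use, and remove the temporary tool in a single RUN instruction.
# Each && runs the next command only if the previous one succeeded.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && curl -fsSL -o /opt/data.json https://example.com/data.json \
 && apt-get purge -y curl \
 && apt-get autoremove -y \
 && rm -rf /var/lib/apt/lists/*
```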

How can we take advantage of Docker’s build cache?

Careful placement of Dockerfile instructions can maximize cache hits and significantly reduce the time it takes to build images.

How does the Docker build cache work?

1. Each Dockerfile instruction processed during a build results in the creation of an intermediary image that is part of the build cache.
2. These images are created by committing containers created from the image associated with the preceding Dockerfile instruction.
3. Images reference their parent image and thereby create an implicit chain of images that represents a sequence of instructions.
4. During an image build, if Docker recognises a sequence of steps that already exists in the cache, it skips the Dockerfile instructions that correspond to the sequence. Instead, it reuses the metadata and layer content from the cached intermediary images.

What causes cache invalidation when building Docker images?

1. Instruction change: adding, removing, or altering an instruction invalidates the cache.
2. Checksum check: content change in the build context will invalidate the cache.

What seemingly significant change is not checked by Docker when determining the validity of the cache?

Command output — consequences of command execution are not checked.
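A small illustration of this, assuming a Debian-based image. The instruction text is what Docker compares, not what the command would produce if it were run again.

```dockerfile
FROM debian:bookworm-slim

# The text of this instruction is unchanged between builds, so it is a
# cache hit even if the remote package index has changed since the step
# was first executed.
RUN apt-get update

# This step therefore installs from whatever index the cached layer holds.
RUN apt-get install -y --no-install-recommends curl
```

Building with `docker build --no-cache` forces every instruction to be executed again.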

What is the optimisation issue caused by copying across all files from the build context into the image with a single `COPY` instruction?

Since we're copying across dependency files (e.g. package.json, yarn.lock) alongside the source code, a change to the source code will invalidate the cache right at that instruction and cause dependencies to be installed all over again.
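The problematic layout might look like the sketch below. The node:20 base image and the /app working directory are assumptions chosen for illustration.

```dockerfile
FROM node:20
WORKDIR /app

# Source code and dependency files are copied in one step, so any
# source change alters the checksum checked at this instruction...
COPY . .

# ...and the dependency installation below is executed again each time.
RUN yarn install
```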

How can we get around the optimisation issue of installing dependencies every time there's a change to the source code of our project?

By separating the copying of source code from that of the dependency files (e.g. package.json, yarn.lock) and placing it afterward. When the source code changes, the cache then remains valid up to and including the dependency installation.

```
COPY package.json yarn.lock ./
RUN yarn install
# Invalidated here when the source changes; the cache is valid for the previous steps.
COPY spec source ./
```

What are the three guidelines to follow for taking better advantage of the build cache when writing Dockerfile instructions?

1. Analyze the dependencies between Dockerfile instructions to determine ordering constraints.
2. Order Dockerfile instructions according to the frequency of change; less frequent first, more frequent last.
3. Where it is beneficial, split `COPY` Dockerfile instructions that copy content from the build context.

What is the relationship between multistage Dockerfiles and size optimisation for images?

Multi-stage Dockerfiles can be the key to managing the size of Docker images.

What is the drawback of using a single instruction to run multiple commands?

If we change any one of the chained commands, the cache for the whole instruction is invalidated and all of its commands are executed all over again.

What is an alternative to using a single instruction to run multiple commands for optimizing Docker image size?

Use a multi-stage Dockerfile: run the commands in a separate stage, each with its own `RUN` instruction rather than a single chained `RUN`, so that temporary content stays behind in that stage.

What does it practically mean to use multi-stage Dockerfiles?

1. Return to a RUN instruction for each command.
2. Maximize the use of the build cache.
3. Temporary content resides in a previous stage.
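A minimal multi-stage sketch of these points. The Debian base image, the stage name build, and the hello.c source file are assumptions chosen for illustration.

```dockerfile
# Build stage: the compiler and intermediate files stay in this stage.
FROM debian:bookworm-slim AS build
RUN apt-get update && apt-get install -y --no-install-recommends gcc libc6-dev
COPY hello.c /src/hello.c
# A separate RUN per step keeps each step individually cacheable.
RUN gcc -o /src/hello /src/hello.c

# Final stage: only the built artifact is copied across, so the compiler
# and package lists never become part of the final image.
FROM debian:bookworm-slim
COPY --from=build /src/hello /usr/local/bin/hello
CMD ["hello"]
```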

(37 cards)