Intro Guide to Dockerfile Best Practices

By Tibor Vass July 02 2019 
 
https://blog.docker.com/2019/07/intro-guide-to-dockerfile-best-practices/

There are over one million Dockerfiles on GitHub today, but not all Dockerfiles are created equally. Efficiency is critical, and this blog series will cover five areas for Dockerfile best practices to help you write better Dockerfiles: incremental build time, image size, maintainability, security and repeatability. If you’re just beginning with Docker, this first blog post is for you! The next posts in the series will be more advanced.

Important note: the tips below follow the journey of ever-improving Dockerfiles for an example Java project based on Maven. Thelast Dockerfile is thus the recommended Dockerfile, while all intermediate ones are there only to illustrate specific best practices.

Incremental build time

In a development cycle, when building a Docker image, making code changes, then rebuilding, it is important to leverage caching. Caching helps to avoid running build steps again when they don’t need to.

Tip #1: Order matters for caching

However, the order of the build steps (Dockerfile instructions) matters, because when a step’s cache is invalidated by changing files or modifying lines in the Dockerfile, subsequent steps of their cache will break. Order your steps from least to most frequently changing steps to optimize caching.

Tip #2: More specific COPY to limit cache busts

Only copy what’s needed. If possible, avoid “COPY  .” When copying files into your image, make sure you are very specific about what you want to copy. Any changes to the files being copied will break the cache. In the example above, only the pre-built jar application is needed inside the image, so only copy that. That way unrelated file changes will not affect the cache.

Tip #3: Identify cacheable units such as apt-get update & install

Each RUN instruction can be seen as a cacheable unit of execution. Too many of them can be unnecessary, while chaining all commands into one RUN instruction can bust the cache easily, hurting the development cycle. When installing packages from package managers, you always want to update the index and install packages in the same RUN: they form together one cacheable unit. Otherwise you risk installing outdated packages.

Reduce Image size

Image size can be important because smaller images equal faster deployments and a smaller attack surface.

Tip #4: Remove unnecessary dependencies

Remove unnecessary dependencies and do not install debugging tools. If needed debugging tools can always be installed later. Certain package managers such as apt, automatically install packages that are recommended by the user-specified package, unnecessarily increasing the footprint. Apt has the –no-install-recommends flag which ensures that dependencies that were not actually needed are not installed. If they are needed, add them explicitly.

Tip #5: Remove package manager cache

Package managers maintain their own cache which may end up in the image. One way to deal with it is to remove the cache in the same RUN instruction that installed packages. Removing it in another RUN instruction would not reduce the image size.

There are further ways to reduce image size such as multi-stage builds which will be covered at the end of this blog post. The next set of best practices will look at how we can optimize for maintainability, security, and repeatability of the Dockerfile.

Maintainability

Tip #6: Use official images when possible

Official images can save a lot of time spent on maintenance because all the installation steps are done and best practices are applied. If you have multiple projects, they can share those layers because they use exactly the same base image.

Tip #7: Use more specific tags

Do not use the latest tag. It has the convenience of always being available for official images on Docker Hub but there can be breaking changes over time. Depending on how far apart in time you rebuild the Dockerfile without cache, you may have failing builds.

Instead, use more specific tags for your base images. In this case, we’re using openjdk. There are a lot more tags available so check out the Docker Hub documentation for that image which lists all the existing variants.

Tip #8: Look for minimal flavors

Some of those tags have minimal flavors which means they are even smaller images. The slim variant is based on a stripped down Debian, while the alpine variant is based on the even smaller Alpine Linux distribution image. A notable difference is that debian still uses GNU libc while alpine uses musl libc which, although much smaller, may in some cases cause compatibility issues. In the case of openjdk, the jre flavor only contains the java runtime, not the sdk; this also drastically reduces the image size.

Reproducibility

So far the Dockerfiles above have assumed that your jar artifact was built on the host. This is not ideal because you lose the benefits of the consistent environment provided by containers. For instance if your Java application depends on specific libraries it may introduce unwelcome inconsistencies depending on which computer the application is built.

Tip #9: Build from source in a consistent environment

The source code is the source of truth from which you want to build a Docker image. The Dockerfile is simply the blueprint.

You should start by identifying all that’s needed to build your application. Our simple Java application requires Maven and the JDK, so let’s base our Dockerfile off of a specific minimal official maven image from Docker Hub, that includes the JDK. If you needed to install more dependencies, you could do so in a RUN step.

The pom.xml and src folders are copied in as they are needed for the final RUN step that produces the app.jar application with mvn package. (The -e flag is to show errors and -B to run in non-interactive aka “batch” mode).

We solved the inconsistent environment problem, but introduced another one: every time the code is changed, all the dependencies described in pom.xml are fetched. Hence the next tip.

Tip #10: Fetch dependencies in a separate step

By again thinking in terms of cacheable units of execution, we can decide that fetching dependencies is a separate cacheable unit that only needs to depend on changes to pom.xml and not the source code. The RUN step between the two COPY steps tells Maven to only fetch the dependencies.

There is one more problem that got introduced by building in consistent environments: our image is way bigger than before because it includes all the build-time dependencies that are not needed at runtime.

Tip #11: Use multi-stage builds to remove build dependencies (recommended Dockerfile)

Multi-stage builds are recognizable by the multiple FROM statements. Each FROM starts a new stage. They can be named with the AS keyword which we use to name our first stage “builder” to be referenced later. It will include all our build dependencies in a consistent environment.

The second stage is our final stage which will result in the final image. It will include the strict necessary for the runtime, in this case a minimal JRE (Java Runtime) based on Alpine. The intermediary builder stage will be cached but not present in the final image. In order to get build artifacts into our final image, use COPY --from=STAGE_NAME. In this case, STAGE_NAME is builder.

Multi-stage builds is the go-to solution to remove build-time dependencies.

We went from building bloated images inconsistently to building minimal images in a consistent environment while being cache-friendly. In the next blog post, we will dive more into other uses of multi-stage builds.

 

[转帖]Intro Guide to Dockerfile Best Practices的更多相关文章

  1. [转帖]Docker学习之Dockerfile命令详解

    Docker学习之Dockerfile命令详解 https://it.baiked.com/system/docker/2436.html 图挺好的 前言 之前,制作镜像的伪姿势搭建已经见过了,今天介 ...

  2. Cheatsheet: 2019 07.01 ~ 09.30

    Other Intro Guide to Dockerfile Best Practices QuickJS Javascript Engine Questions for a new technol ...

  3. Images之Dockerfile中的命令1

    Dockerfile reference Docker can build images automatically by reading the instructions from a Docker ...

  4. FW: Dockerfile RUN, CMD & ENTRYPOINT

    Dockerfile RUN, CMD & ENTRYPOINT     在使用Dockerfile创建image时, 有几条指令比较容易混淆, RUN, CMD, ENTRYPOINT. R ...

  5. 【翻译】Dockerfile参考

    Dockerfile参考 来自docker官方网址:https://docs.docker.com/engine/reference/builder/ docker能够从Dockerfile中读取指令 ...

  6. Dockerfile注意事项

    准则 尽量将Dockerfile放在空目录中,如果目录中必须有其他文件,则使用.dockerignore文件. 避免安装不必须的包. 每个容器应该只关注一个功能点. 最小化镜像的层数. 多行参数时应该 ...

  7. Dockerfile 中的 COPY 与 ADD 命令

    Dockerfile 中提供了两个非常相似的命令 COPY 和 ADD,本文尝试解释这两个命令的基本功能,以及其异同点,然后总结其各自适合的应用场景. Build 上下文的概念 在使用 docker ...

  8. 转:Oculus Unity Development Guide开发指南(2015-7-21更新)

    http://forum.exceedu.com/forum/forum.php?mod=viewthread&tid=34175 Oculus Unity Development Guide ...

  9. 【Docker】涨姿势,深入了解Dockerfile 中的 COPY 与 ADD 命令

    参考资料:https://www.cnblogs.com/sparkdev/p/9573248.html Dockerfile 中提供了两个非常相似的命令 COPY 和 ADD,本文尝试解释这两个命令 ...

随机推荐

  1. 如何制作纯净的U盘启动盘

    1.去下载**WinPE工具箱**U盘启动盘制作工具 下载地址:http://www.wepe.com.cn/download.html

  2. 「SNOI2017」礼物

    题目链接:Click here Solution: 设\(f(x)\)代表第\(x\)个人送的礼物的数量,\(s(x)\)代表\(f(x)\)的前缀和,即: \[ f(x)=s(x-1)+x^k\\ ...

  3. SDOI2015 寻宝游戏 | noi.ac#460 tree

    题目链接:戳我 可以知道,我们相当于是把有宝藏在的地方围了一个圈,求这个圈最小是多大. 显然按照dfs序来遍历是最小的. 那么我们就先来一遍dfs序列,并且预处理出来每个点到根的距离(这样我们就可用\ ...

  4. 第一章 大体知道java语法1----------能写java小算法

    很多人开始学习java时,都是抱着诸如<Thinking in java>.<疯狂java>等书籍,从前到后慢慢翻看,不管其内容重要与否,也不关心自己以后能否使用到.我的建议是 ...

  5. Oracle实现分页,每页有多少条记录数

    分页一直都是关系数据库的热门,在数据量非常多的情况下,需要根据分页展示,每页展示多少条记录,以此减轻数据的压力; 1实现原理,根据rownum取记录数,根据公式(页数-1)*每页想要展示的记录数 AN ...

  6. Linux 服务器安装jdk,mysql,tomcat简要教程

    linux服务器是阿里云上买的,学生价9.9/月,拿来学习下. 需要准备软件工具: 1.editplus (编辑服务器上的文件) 2.PuTTY (Linux命令连接器) 3.FlashFXP(上传文 ...

  7. 06.旋转数组的最小数字 Java

    题目描述 把一个数组最开始的若干个元素搬到数组的末尾,我们称之为数组的旋转. 输入一个非减排序的数组的一个旋转,输出旋转数组的最小元素. 例如数组{3,4,5,1,2}为{1,2,3,4,5}的一个旋 ...

  8. git commit 合并到指定分支

    1. 将指定的commit合并到当前分支 git cherry-pick  commit_id 2. 合并多个连续 commit 到指定分支 假设需要合并 devlop 上从 fb407a3f 到 9 ...

  9. Solr 5.2.1 部署并索引Mysql数据库

    1.Solr简介 Solr是一个高性能,采用Java5开发,SolrSolr基于Lucene的全文搜索服务器.同时对其进行了扩展,提供了比Lucene更为丰富的查询语言,同时实现了可配置.可扩展并对查 ...

  10. Python 图形界面元素

    from tkinter import * import os def button_click1(): try: filePath = r'D:\CloudMusic' os.system(&quo ...