Theory: Dockerfile

A Dockerfile is a text file with instructions. Docker reads it from top to bottom and produces an image. Each instruction that changes the filesystem usually creates a new layer in that image.

Think of it as a recipe that answers: which starting system, what to copy in, what to install, and what command runs when someone starts a container.

Base image and FROM

The base image is whatever you name in the first FROM line. It is your starting filesystem and often your main supply-chain choice: who publishes it, how often it is patched, and what packages it already contains. Official language or distro images are common starting points. Pinning a specific tag (for example python:3.12-slim-bookworm) is better than bare latest for repeatable builds, but a publisher can still move a tag to a new build. Pinning a digest fixes the exact bits for production.

Every Dockerfile must begin with FROM. Only comments, parser directives, and ARG instructions may come before it, and you rarely need those in introductory labs.
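
As a minimal sketch of what may precede FROM, the fragment below declares an ARG first so the base reference can be swapped at build time without editing the file. The argument name BASE_IMAGE is an assumption chosen for illustration; the default reuses the tag from this page.

```dockerfile
# BASE_IMAGE is a hypothetical argument name; ARG is the one instruction allowed before FROM
ARG BASE_IMAGE=python:3.12-slim-bookworm

# Resolves to the default tag unless the build overrides the argument
FROM ${BASE_IMAGE}

WORKDIR /app
```

Building with --build-arg BASE_IMAGE=<name>@sha256:<digest> (placeholders, not real values) would pin an exact image instead of a tag. An ARG declared before FROM is visible only to FROM lines unless it is declared again after them.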

Example Dockerfile (small Python service)

Below is a reference layout you can compare with the weaker files in Lab: Dockerfile static analysis. It is not the only valid style, but it shows the usual building blocks in order.

# Base: slim OS + Python runtime (tag pinned here; pin a digest in real pipelines)
FROM python:3.12-slim-bookworm

# Working directory inside the image for later commands
WORKDIR /app

# Install OS packages only if needed, in one layer, then clean apt cache
RUN apt-get update \
  && apt-get install -y --no-install-recommends ca-certificates \
  && rm -rf /var/lib/apt/lists/*

# Copy dependency list first so Docker can cache the install layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application source (changes often, so it comes after slow steps)
COPY app.py .

# Document the port (publish with docker run -p or Compose)
EXPOSE 8080

# Non-root user when the runtime allows it
RUN useradd --create-home --uid 10001 appuser
USER appuser

# Default process: one main process per container is the usual goal
CMD ["python", "app.py"]

Line map: FROM is the base. WORKDIR sets the directory that later relative paths resolve against. RUN with apt or pip mutates the image during build. COPY brings files from the build context (the directory you pass to docker build). EXPOSE documents intent only. USER lowers privilege for any later RUN and for CMD and ENTRYPOINT. CMD is the default command at docker run time.

If requirements.txt or app.py is not in your lab folder yet, treat the block as a reference. Lab: Slim Python images uses a similar pattern with a single script and no requirements.txt.

Common instructions (beginner set)

  • FROM chooses the starting image (for example a minimal Linux with a language runtime). Every Dockerfile begins with a base.
  • WORKDIR sets the current directory for later commands inside the image.
  • COPY copies files from your build folder (the context) into the image. Prefer COPY for plain files. Use ADD only when you need its extra behaviors (for example, automatically extracting a local tar archive).
  • RUN executes a command while building the image (install packages, compile code).
  • CMD sets the default command when a container starts. It can be overridden at docker run.
  • ENTRYPOINT also defines what runs at start. It pairs with CMD in more advanced patterns. For now, remember one primary process per container is the usual goal.
  • USER switches which Linux user runs later RUN, CMD, and ENTRYPOINT steps. Running as non-root inside the image is a common hardening step.
  • EXPOSE documents which ports the app uses. It does not publish ports by itself. Publishing happens at docker run or in Compose with ports:.
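
To illustrate how ENTRYPOINT and CMD can pair up, here is a small sketch. It assumes app.py accepts a --port option, which is a hypothetical flag used only for this example.

```dockerfile
FROM python:3.12-slim-bookworm
WORKDIR /app
COPY app.py .

# ENTRYPOINT fixes the executable that always runs
ENTRYPOINT ["python", "app.py"]

# CMD supplies default arguments (--port is a hypothetical flag of app.py)
CMD ["--port", "8080"]
```

With this layout, arguments given after the image name at docker run replace the CMD part only, so the container still starts python app.py with the new arguments.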

Build context

When you run docker build, Docker sends a context (often the current directory) to the daemon. COPY and ADD can only read files inside that context, so the files the image needs must be there and nothing else has to be. A large or sloppy context slows builds and can accidentally include secrets. A .dockerignore file keeps unwanted paths out of the context.
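
One way to limit what reaches the image is to name files explicitly instead of copying the whole context. The sketch below reuses the file names from the example on this page and assumes both sit at the root of the context.

```dockerfile
FROM python:3.12-slim-bookworm
WORKDIR /app

# Broad copy, left commented out: it would take everything in the context,
# including any stray .env or key file
# COPY . .

# Narrow copies: only the files the image actually needs
COPY requirements.txt .
COPY app.py .
```

Narrow COPY lines control what lands in the image, but they do not shrink what is sent as context.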

Layers and cache

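Docker can reuse a cached layer from a previous build when the instruction and every step before it are unchanged. For COPY and ADD, the copied files must be unchanged too. Once one step misses the cache, every later step is rebuilt, so order matters. Put lines that change often (copying app source) after lines that change rarely (installing dependencies) so installs stay cached.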
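
For contrast with the example earlier on this page, here is a sketch of a cache-unfriendly ordering. It assumes the same requirements.txt and app.py files.

```dockerfile
FROM python:3.12-slim-bookworm
WORKDIR /app

# Source and dependency list are copied together, so any edit to app.py
# invalidates this layer
COPY requirements.txt app.py ./

# This step comes after the invalidated layer, so the install reruns
# on every source change even when requirements.txt is untouched
RUN pip install --no-cache-dir -r requirements.txt

CMD ["python", "app.py"]
```

Copying requirements.txt alone, installing, and only then copying app.py keeps the install layer cached across source edits.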

Why Dockerfile review is security work

Every line is a policy decision.

  • FROM picks your patch level and supply chain (who built the base, how often it updates).
  • RUN apt install … decides which packages and versions ship in the image.
  • COPY can pull in config or keys if someone drops them in the wrong folder.
  • USER controls whether the default process runs as root.

Static analysis tools (covered later) treat the Dockerfile as code and flag risky patterns.
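
As an illustration of the kind of file a reviewer might question, the sketch below is a deliberately weak variant of the example on this page. The comments mark each policy decision from the list above. It is not taken from the lab, and curl is an arbitrary package chosen for the example.

```dockerfile
# Unpinned base: latest can resolve to a different image on each build
FROM python:latest
WORKDIR /app

# Whole context copied: config or keys left in the folder ship in the image
COPY . .

# Package versions are whatever the repositories serve on build day,
# and the apt lists are left behind in the layer
RUN apt-get update && apt-get install -y curl
RUN pip install -r requirements.txt

# No USER instruction: the default process runs as root
CMD ["python", "app.py"]
```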

Next: Theory: Engine architecture.