Docker Best Practices

Proven patterns and techniques for building efficient, secure, and maintainable Docker images and containers.

Table of Contents

  1. Dockerfile Best Practices
  2. Multi-Stage Builds
  3. Health Checks
  4. .dockerignore
  5. Security
  6. Image Size Optimization
  7. Networking
  8. Logging & Monitoring

Dockerfile Best Practices

1. Use Specific Base Image Tags

BAD: Generic, unpredictable versions

FROM python
FROM node
FROM ubuntu

GOOD: Pinned, reproducible versions

FROM python:3.11-slim
FROM node:18-alpine
FROM ubuntu:22.04

Why? An untagged image resolves to latest, which moves to a new version whenever the maintainers publish one, so the same Dockerfile can produce different builds over time.

2. Minimize Layers

BAD: Multiple RUN commands

FROM ubuntu:22.04
RUN apt-get update
RUN apt-get install -y python3
RUN apt-get install -y python3-pip
RUN pip install flask
RUN pip install redis
RUN apt-get clean

GOOD: Combine RUN commands

FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y python3 python3-pip && \
    pip install flask redis && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

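Why? Each RUN creates a layer, and files deleted in a later layer still take up space in the earlier one. Cleaning up in the same RUN that created the files keeps them out of the image entirely, and fewer layers also means less build overhead.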

3. Order Instructions for Cache Efficiency

BAD: Unstable instructions first

FROM python:3.11-slim
# Source code changes frequently
COPY . /app
WORKDIR /app
# Re-runs on every code change, because the COPY above invalidated the cache
RUN pip install -r requirements.txt

GOOD: Stable instructions first

FROM python:3.11-slim
WORKDIR /app
# The dependency list changes less frequently
COPY requirements.txt .
RUN pip install -r requirements.txt
# Source code changes frequently; copy it after the cached pip layer
COPY . .

Why? Docker caches layers. Put stable, expensive operations first. Code changes invalidate cache below it.
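The invalidation rule can be illustrated with a small model. The sketch below is not Docker's real cache implementation; it assumes each layer's cache key is a hash of the parent key, the instruction text, and the contents of any copied files. The function names and the sample file contents are made up for illustration.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction, chained on the parent key.

    `instructions` is a list of (text, copied_files) pairs; `copied_files`
    names the entries of `file_contents` that a COPY step reads.
    Simplified model only, not Docker's actual algorithm.
    """
    keys, parent = [], ""
    for text, copied_files in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(text.encode())
        for name in sorted(copied_files):
            h.update(name.encode())
            h.update(file_contents[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(instructions, before, after):
    """List the instructions whose cache key differs between two builds."""
    old = layer_keys(instructions, before)
    new = layer_keys(instructions, after)
    return [text for (text, _), a, b in zip(instructions, old, new) if a != b]


ALL_FILES = ["requirements.txt", "app.py"]

# Unstable instruction first: COPY . reads every file.
BAD = [
    ("COPY . /app", ALL_FILES),
    ("RUN pip install -r requirements.txt", []),
]

# Stable instructions first: only requirements.txt feeds the pip layer.
GOOD = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY . .", ALL_FILES),
]

# Two builds that differ only in application code.
before = {"requirements.txt": "flask", "app.py": "print('v1')"}
after = {"requirements.txt": "flask", "app.py": "print('v2')"}
```

In this model, rebuilt_steps(BAD, before, after) includes the pip install step, while rebuilt_steps(GOOD, before, after) contains only the final COPY.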

4. Use .dockerignore

BAD: Copy everything

COPY . /app
# Includes: .git, node_modules, venv, .env, etc.

GOOD: Exclude unnecessary files

# .dockerignore
.git
.gitignore
node_modules
venv
.env
.env.local
.venv
# Patterns are anchored at the context root; prefix with **/ to match at any depth
**/__pycache__
.pytest_cache
**/.DS_Store
**/*.pyc
.idea
.vscode

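Why? Everything in the build context is sent to the Docker daemon and can end up in the image. Excluding unneeded files means faster builds, faster deploys, less storage, and no .env files leaking into the image.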

5. Non-Root User

BAD: Running as root

FROM python:3.11-slim
COPY app /app
WORKDIR /app
CMD ["python", "app.py"]
# Container runs as root - security risk!

GOOD: Create dedicated user

FROM python:3.11-slim
RUN groupadd -r appuser && useradd -r -g appuser appuser
WORKDIR /app
# Set ownership at copy time instead of adding a separate chown layer
COPY --chown=appuser:appuser app /app
USER appuser
CMD ["python", "app.py"]

Why? If the container is compromised, the attacker gets root inside it, which makes escaping to the host easier. A dedicated user limits the damage.

6. Use ENTRYPOINT for Commands

BAD: Using CMD for executable

FROM python:3.11
CMD ["python", "app.py"]
# docker run myapp arg1 arg2 -> Args replace CMD, so Docker tries to run "arg1" as the command!

GOOD: Use ENTRYPOINT

FROM python:3.11
ENTRYPOINT ["python", "app.py"]
# docker run myapp arg1 arg2 -> Args passed to app!

Or combined:

FROM python:3.11
ENTRYPOINT ["python"]
CMD ["app.py"]
# Can override: docker run myapp manage.py migrate
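The way exec-form ENTRYPOINT, CMD, and docker run arguments combine can be summed up in a few lines. The function below is a hypothetical model written for this guide, not part of any Docker API, and it does not cover shell-form instructions or the --entrypoint flag.

```python
def effective_command(entrypoint, cmd, run_args):
    """Model the argv a container starts with (exec form only).

    Arguments given to `docker run` after the image name replace CMD;
    ENTRYPOINT is always kept and placed in front.
    """
    tail = list(run_args) if run_args else list(cmd)
    return list(entrypoint) + tail


# CMD only: run arguments replace the whole command.
cmd_only = effective_command([], ["python", "app.py"], ["arg1", "arg2"])

# ENTRYPOINT only: run arguments are appended.
entrypoint_only = effective_command(["python", "app.py"], [], ["arg1", "arg2"])

# Combined: CMD supplies default arguments that a run can override.
default_run = effective_command(["python"], ["app.py"], [])
override_run = effective_command(["python"], ["app.py"], ["manage.py", "migrate"])
```

Here cmd_only is ["arg1", "arg2"], which is why the CMD-only image fails to start the app, while override_run is ["python", "manage.py", "migrate"].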

7. Explicit Port Exposure

BAD: Implicit, unclear ports

FROM nginx:1.25
# What port does nginx use?

GOOD: Document with EXPOSE

FROM nginx:1.25
EXPOSE 80 443
# Clear: nginx uses ports 80 and 443

Note: EXPOSE doesn't actually publish ports. Use -p flag when running:

docker run -p 8080:80 nginx

8. Environment Variables

BAD: Hardcoded values

FROM myapp
ENV DATABASE_URL=postgresql://prod.db.com:5432/mydb
ENV API_KEY=secret123

GOOD: No defaults in image

FROM myapp
# Set at runtime: docker run -e DATABASE_URL=... myapp
# Or in docker-compose.yml

Why? Images are portable. Don't bake secrets or environment-specific settings.
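On the application side, this means reading settings from the environment at startup and failing fast when a required one is missing. A minimal sketch follows: DATABASE_URL and API_KEY come from the example above, while LOG_LEVEL, Config, and load_config are hypothetical names chosen for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    database_url: str
    api_key: str
    log_level: str = "INFO"


def load_config(environ=os.environ):
    """Build a Config from environment variables set at `docker run` time."""
    required = ("DATABASE_URL", "API_KEY")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        # Fail at startup rather than on the first database call.
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Config(
        database_url=environ["DATABASE_URL"],
        api_key=environ["API_KEY"],
        # Optional setting with a safe default (hypothetical variable).
        log_level=environ.get("LOG_LEVEL", "INFO"),
    )
```

Passing the mapping as a parameter keeps the function easy to test without touching the real process environment.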

Complete Example

# Multi-stage build (see below for details)
FROM python:3.11-slim AS builder

WORKDIR /build
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential && \
    rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

# Production stage
FROM python:3.11-slim

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser

WORKDIR /app

# Copy Python dependencies from builder
COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
ENV PATH=/home/appuser/.local/bin:$PATH

# Copy application code
COPY --chown=appuser:appuser app /app

USER appuser

EXPOSE 5000

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/health')"

CMD ["python", "app.py"]

Multi-Stage Builds

Multi-stage builds reduce final image size by using intermediate "builder" stages.

The Problem

FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]
# Final image: ~900MB (includes build tools, node_modules, source)

The Solution

# Stage 1: Builder
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# builder stage: 900MB

# Stage 2: Runtime
FROM node:18-alpine AS runtime
WORKDIR /app
COPY package*.json ./
# Install production dependencies only; the builder's node_modules
# also contains devDependencies and was built against a different libc
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["npm", "start"]
# Final image: ~200MB (only production dependencies)

Real-World Examples

Python Example:

# Builder stage
FROM python:3.11 AS builder
WORKDIR /build
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Runtime stage
FROM python:3.11-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY app .
CMD ["python", "main.py"]

Go Example:

# Builder
FROM golang:1.21 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app

# Runtime
FROM scratch
COPY --from=builder /app /app
EXPOSE 8080
ENTRYPOINT ["/app"]
# Minimal image with just the binary!


Health Checks

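Health checks tell Docker (including Docker Compose and Swarm) whether your application is healthy. Kubernetes ignores the Dockerfile HEALTHCHECK instruction and relies on its own liveness and readiness probes.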

Basic Health Check

FROM nginx
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD curl -f http://localhost/ || exit 1

Parameter        Default   Meaning
--interval       30s       Time between checks
--timeout        30s       Maximum time a single check may take
--start-period   0s        Startup grace period; failures during it do not count toward retries
--retries        3         Consecutive failures before the container is marked unhealthy

Health Check Methods

HTTP (REST API):

FROM python:3.11
HEALTHCHECK CMD curl -f http://localhost:5000/health || exit 1

TCP (Database):

FROM postgres:15
HEALTHCHECK CMD pg_isready -U postgres || exit 1

Script:

FROM myapp
COPY healthcheck.sh /
HEALTHCHECK CMD /healthcheck.sh

Shell Command:

FROM node:18-alpine
HEALTHCHECK CMD npm run health-check || exit 1

Real-World Example

FROM flask-app:1.0
EXPOSE 5000

# Wait 5 seconds for startup, check every 10 seconds
# If 3 consecutive checks fail, mark unhealthy
HEALTHCHECK \
    --interval=10s \
    --timeout=5s \
    --start-period=5s \
    --retries=3 \
    CMD python -c "import urllib.request; \
        urllib.request.urlopen('http://localhost:5000/health')" \
    || exit 1
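The check above assumes the application serves a /health route. As a rough illustration of what such an endpoint could look like, here is a standard-library stand-in (not the Flask app from the example); HealthHandler and make_server are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answer GET /health with 200 and a small JSON body."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging so frequent probes do not flood the logs.
        pass


def make_server(port=5000, host="0.0.0.0"):
    """Create the server; call serve_forever() on the result to start it."""
    return HTTPServer((host, port), HealthHandler)
```

Because urllib.request.urlopen raises an exception for 4xx and 5xx responses, the python -c probe exits non-zero whenever the endpoint stops returning a success status. A real endpoint would usually also verify its dependencies, such as the database connection, before answering.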

View Health Status

# Check health (status, failing streak, and recent probe output)
docker inspect --format '{{json .State.Health}}' myapp

# Watch logs
docker logs -f myapp

# Health changes
docker events --filter 'type=container' --filter 'event=health_status'

.dockerignore

Controls what files Docker includes in the build context (files sent to Docker daemon).

Structure

.dockerignore
├── Version control
├── Dependencies (usually)
├── Build artifacts
├── Environment files
├── IDE/Editor files
├── OS files
└── CI/CD files

Complete .dockerignore

# Git
.git
.gitignore
.gitattributes
.github

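# Node
node_modules
npm-debug.log
yarn-error.log
# Keep package-lock.json in the build context: npm ci requires it

# Python (**/ matches at any depth; bare patterns match only at the context root)
**/__pycache__
.pytest_cache
.venv
venv
**/*.pyc
**/*.pyo
*.egg-info
dist
build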

# Environment
.env
.env.local
.env.*.local
.envrc

# IDE
.vscode
.idea
.sublime-project
.sublime-workspace
**/*.swp
**/*.swo
**/*~

# OS
**/.DS_Store
**/.DS_Store?
**/._*
.Spotlight-V100
.Trashes
**/ehthumbs.db
**/Thumbs.db

# CI/CD
.gitlab-ci.yml
.travis.yml
.circleci
jenkins

# Docs
README.md
CHANGELOG.md
docs

# Testing
.coverage
coverage/
htmlcov/
test-results/

Impact

# Without .dockerignore
COPY . /app
# Copies: 500MB (includes node_modules, .git, build artifacts)
# Build time: 10 seconds

# With .dockerignore
COPY . /app
# Copies: 50MB (only source)
# Build time: 1 second

Security

1. Scan Images

# Using Trivy (free, open-source)
trivy image myapp:1.0

# Using Docker Scout (built-in)
docker scout cves myapp:1.0

# Using Snyk
snyk container test myapp:1.0

2. No Secrets in Images

BAD: Secrets in Dockerfile

FROM myapp
ENV DB_PASSWORD=super_secret_123

GOOD: Pass at runtime

docker run -e DB_PASSWORD=secret myapp

Or use secrets in compose:

services:
  app:
    image: myapp
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password

secrets:
  db_password:
    file: ./secrets/db_password
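With this setup the container receives DB_PASSWORD_FILE, a path to the mounted secret, rather than the password itself, so the application has to read the file. A minimal sketch of that lookup follows; read_secret is a hypothetical helper, and the fallback to a plain DB_PASSWORD variable is an assumption for local development.

```python
import os
from pathlib import Path


def read_secret(name, environ=os.environ):
    """Return a secret from <NAME>_FILE if set, else from <NAME>.

    Compose mounts each secret as a file under /run/secrets/, so the
    file-based variable takes priority over the plain one.
    """
    file_path = environ.get(name + "_FILE")
    if file_path:
        # Secret files often end with a newline; strip it.
        return Path(file_path).read_text().strip()
    value = environ.get(name)
    if value is None:
        raise RuntimeError(f"Neither {name}_FILE nor {name} is set")
    return value
```

In the compose example above, read_secret("DB_PASSWORD") would read /run/secrets/db_password.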

3. Use Minimal Base Images

BAD: Full OS

FROM ubuntu:22.04          # 77 MB
FROM debian:12-slim        # 70 MB
FROM python:3.11           # 883 MB

GOOD: Minimal variants

FROM alpine:3.19           # 7 MB
FROM python:3.11-alpine    # 50 MB
FROM node:18-alpine        # 180 MB
FROM scratch               # 0 MB (just binary)

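Trade-off: Alpine images are smaller, but they use musl instead of glibc and ship fewer tools. Prebuilt binaries and some Python wheels that assume glibc may fail or need to be compiled from source.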

4. Read-Only Filesystem

# Run with read-only root filesystem
docker run --read-only \
    --tmpfs /tmp \
    --tmpfs /var/tmp \
    myapp

In compose:

services:
  app:
    image: myapp
    read_only: true
    tmpfs:
      - /tmp
      - /var/tmp

5. Drop Capabilities

# Drop all capabilities, add only needed ones
docker run --cap-drop=ALL \
    --cap-add=NET_BIND_SERVICE \
    myapp

6. Resource Limits

# Limit CPU and memory
docker run \
    --memory 512m \
    --cpus 1.5 \
    --pids-limit 100 \
    myapp

7. No Privileged Containers

BAD:

docker run --privileged myapp  # Full host access!

GOOD:

docker run myapp  # No special privileges

8. Image Signing

# Enable Docker Content Trust (DCT)
export DOCKER_CONTENT_TRUST=1
docker push myapp:1.0  # Creates signature
docker pull myapp:1.0  # Verifies signature

Image Size Optimization

Layer Inspection

# See layer sizes
docker history myapp:1.0

Specific Optimizations

Alpine Linux:

FROM python:3.11-alpine
RUN apk add --no-cache gcc musl-dev  # Only in build if needed

Remove Package Manager Cache:

FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*  # Clear cache

Compress Files:

FROM ubuntu:22.04
RUN ... && \
    gzip -9 /app/data/* && \
    strip /usr/local/bin/*

Comparison:

Base Image           Size     Pros                 Cons
ubuntu:22.04         77 MB    Full tooling         Large
debian:12-slim       70 MB    Smaller              Still large
python:3.11          883 MB   Ready to use         Huge!
python:3.11-slim     150 MB   Lean, Python ready   Smaller toolset
python:3.11-alpine   50 MB    Very small           Different libc
scratch              0 MB     Minimal              Just binary

Networking

1. Explicit Networks

BAD: Default bridge

docker run myapp  # Uses default bridge
docker run mydb   # Can't resolve by name

GOOD: Custom network

docker network create mynet
docker run --network mynet --name app myapp
docker run --network mynet --name db mydb
# app can resolve 'db' hostname

2. Docker Compose Networking

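services:
  web:
    build: .
    ports:
      - "8080:8000"
    environment:
      DB_HOST: postgres  # Can use service name
      DB_PORT: 5432
    depends_on:
      postgres:
        condition: service_healthy  # Wait for the health check, not just container start

  postgres:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: secret  # Example only; use a Compose secret in production
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data: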

3. Exposing Ports Safely

# Only expose what's needed
EXPOSE 8000
# Don't expose SSH, debugging ports, etc.

Logging & Monitoring

1. Container Logs

BAD: Writing to files

RUN mkdir -p /var/log/app

GOOD: Write to stdout/stderr

# Python (flush so buffered output reaches docker logs immediately)
import sys
print("Log message", file=sys.stdout, flush=True)

// Node.js
console.log("Log message");  // Goes to stdout

Docker captures stdout/stderr automatically:

docker logs myapp

2. Log Drivers

# JSON file (default)
docker run --log-driver json-file myapp

# Send to syslog
docker run --log-driver syslog myapp

# Send to CloudWatch
docker run --log-driver awslogs \
    --log-opt awslogs-group=/ecs/myapp \
    myapp

3. Structured Logging

BAD: Unstructured text

Starting server on port 8000
User 123 logged in
Error: Database connection failed

GOOD: JSON/structured

{"timestamp": "2024-01-15T10:30:00Z", "level": "INFO", "message": "Starting server", "port": 8000}
{"timestamp": "2024-01-15T10:30:05Z", "level": "INFO", "message": "User logged in", "user_id": 123}
{"timestamp": "2024-01-15T10:30:10Z", "level": "ERROR", "message": "Database connection failed", "error": "connection timeout"}
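One way to produce lines in this shape is a custom formatter on top of Python's standard logging module. The sketch below is one possible approach, not a prescribed one; JsonFormatter, make_logger, and the "fields" key used to carry extra data are names invented for this example.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        created = datetime.fromtimestamp(record.created, timezone.utc)
        entry = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Merge structured fields passed via extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        if record.exc_info:
            entry["error"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def make_logger(name="app", stream=sys.stdout):
    """Return a logger that writes JSON lines to stdout by default."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # Avoid duplicate lines from the root logger
    return logger
```

A call such as make_logger().info("User logged in", extra={"fields": {"user_id": 123}}) then emits one JSON line with timestamp, level, message, and user_id keys, which docker logs and any log driver can pick up from stdout.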

4. Application Monitoring

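FROM myapp:1.0

# Include monitoring tools (assumes a Debian/Ubuntu-based image)
RUN apt-get update && \
    apt-get install -y --no-install-recommends prometheus-node-exporter && \
    rm -rf /var/lib/apt/lists/*

# App (8000) + metrics (9100)
EXPOSE 8000 9100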

Quick Checklist

  • Using specific base image tags (not latest)
  • Combining RUN commands to reduce layers
  • Ordering Dockerfile instructions for caching
  • Have a .dockerignore that excludes unnecessary files
  • Running as non-root user
  • Using ENTRYPOINT for executables
  • Including HEALTHCHECK
  • No hardcoded secrets in image
  • Using minimal base images (alpine if possible)
  • Scanned image for vulnerabilities
  • Setting memory/CPU limits
  • Logging to stdout/stderr
  • Using custom networks (not default bridge)
  • Multi-stage builds when applicable
  • Removed build dependencies in final stage

For Docker commands, see Docker CLI Reference

For overview, see Docker Overview

For hands-on tutorials, see Docker Journey

For questions, see Contributing Guide