Docker for Beginners: Containers Made Simple



What Is Docker?

Docker is a platform that lets you package applications and their dependencies into lightweight, portable containers. Think of a container as a tiny, self-contained box that holds everything your app needs to run: code, runtime, libraries, and system tools. No more "it works on my machine" problems.

Before Docker, deploying software meant manually installing dependencies, configuring servers, and hoping nothing conflicted. You'd spend hours setting up Python versions, Node.js packages, database drivers, and system libraries. Then you'd do it all over again on staging and production servers, praying the configurations matched.

Docker eliminates this chaos by ensuring your app runs identically everywhere β€” your laptop, a teammate's machine, staging servers, or production. The container becomes the unit of deployment, not the application code alone.

Docker has become the standard tool for modern software development. Whether you're building microservices, setting up CI/CD pipelines, or just want a consistent development environment, Docker makes it possible with minimal overhead. Companies like Netflix, Spotify, and PayPal run millions of containers in production every day.

Quick tip: Docker isn't just for production deployments. Many developers use it to avoid cluttering their local machine with different language versions, databases, and tools. Need PostgreSQL for one project and MySQL for another? Run them both in containers without conflicts.
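A minimal sketch of that setup (the container names, passwords, and image tags here are just placeholders) might look like this:

# PostgreSQL for project A, on its usual port
docker run -d --name project-a-db -p 5432:5432 -e POSTGRES_PASSWORD=devpass postgres:15

# MySQL for project B, on its usual port, with no conflicts on the host
docker run -d --name project-b-db -p 3306:3306 -e MYSQL_ROOT_PASSWORD=devpass mysql:8

# Tear either one down without leaving anything behind on your machine
docker rm -f project-a-db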

Containers vs Virtual Machines

Containers and virtual machines both provide isolation, but they work very differently under the hood. Understanding this difference is crucial to appreciating why containers have become so popular.

Virtual Machines run a full operating system with its own kernel on top of a hypervisor. Each VM needs its own OS installation, consuming gigabytes of disk space and significant memory. Boot times are measured in minutes. If you run three VMs, you're running three complete operating systems simultaneously.

Containers share the host OS kernel and only package the application layer. They're megabytes in size (not gigabytes), start in seconds (not minutes), and you can run dozens on a single machine without breaking a sweat.

# VM approach: Each app gets a full OS
App A β†’ Guest OS β†’ Hypervisor β†’ Host OS β†’ Hardware
App B β†’ Guest OS β†’ Hypervisor β†’ Host OS β†’ Hardware

# Container approach: Apps share the kernel
App A β†’ Container Runtime β†’ Host OS β†’ Hardware
App B β†’ Container Runtime β†’ Host OS β†’ Hardware

This lightweight architecture makes containers ideal for microservices, where you might run hundreds of small services instead of one monolithic application. The resource efficiency is staggering β€” a server that runs 10 VMs might comfortably run 100 containers.

| Feature        | Virtual Machines               | Containers                       |
| -------------- | ------------------------------ | -------------------------------- |
| Startup Time   | Minutes                        | Seconds                          |
| Disk Space     | Gigabytes (full OS)            | Megabytes (app layer only)       |
| Performance    | Near-native                    | Native (no hypervisor overhead)  |
| Isolation      | Complete (separate kernel)     | Process-level (shared kernel)    |
| Portability    | Limited (hypervisor-dependent) | High (runs anywhere Docker runs) |
| Resource Usage | Heavy                          | Lightweight                      |

That said, VMs aren't obsolete. They provide stronger isolation since each VM has its own kernel. For security-critical workloads or when you need to run different operating systems, VMs remain the better choice. Many organizations use both: VMs for infrastructure isolation and containers for application deployment.

Core Docker Concepts

Docker has a few key concepts you need to understand before diving into commands and Dockerfiles. These building blocks form the foundation of how Docker works.

Images

A Docker image is a read-only template containing your application code, runtime, libraries, and dependencies. Think of it as a snapshot or blueprint. Images are built from instructions in a Dockerfile and stored in registries like Docker Hub.

Images are composed of layers. Each instruction in a Dockerfile creates a new layer. Docker caches these layers on rebuild: every layer before the first change is reused from cache, and only the layers from that point onward are rebuilt. This makes builds incredibly fast.
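A quick way to see the cache in action (the image name is just an example):

# First build: every instruction runs and produces a layer
docker build -t myapp:dev .

# Edit only your application code, then rebuild:
# layers created before the changed files (base image, dependency install)
# come straight from cache, and only the later layers are rebuilt
docker build -t myapp:dev .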

Containers

A container is a running instance of an image. You can create multiple containers from the same image, and each runs in isolation. When you remove a container, any changes made to its filesystem are lost unless you explicitly save them or use volumes.

Containers are ephemeral by design. That disposability is a feature, not a bug: it ensures consistency and makes scaling trivial.
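You can see this for yourself with a throwaway container (the name below is arbitrary):

# Write a file inside a container, then remove the container
docker run --name scratch ubuntu:22.04 bash -c "echo hello > /data.txt"
docker rm scratch

# A new container from the same image starts from the clean image state,
# so the file is gone
docker run --rm ubuntu:22.04 cat /data.txt   # error: No such file or directory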

Dockerfile

A Dockerfile is a text file containing instructions to build a Docker image. It specifies the base image, copies your code, installs dependencies, and defines how to run your application. We'll dive deeper into Dockerfiles in a later section.

Docker Registry

A registry is a storage and distribution system for Docker images. Docker Hub is the default public registry, hosting millions of images. You can also run private registries for proprietary applications. When you run docker pull nginx, Docker downloads the nginx image from Docker Hub.
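Pushing to a private registry works the same way as Docker Hub; you just include the registry host in the tag (registry.example.com below is a placeholder):

# Log in to the private registry
docker login registry.example.com

# Tag the image with the registry host, then push
docker tag myapp:1.0 registry.example.com/team/myapp:1.0
docker push registry.example.com/team/myapp:1.0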

Volumes

Volumes are Docker's mechanism for persisting data. Since containers are ephemeral, any data written inside a container disappears when the container is removed. Volumes let you store data outside the container filesystem, so it survives container restarts and deletions.

Pro tip: Use our Docker Command Generator to quickly create complex Docker commands without memorizing all the flags and options.

Essential Docker Commands

Let's walk through the Docker commands you'll use daily. These cover the entire container lifecycle from pulling images to cleaning up resources.

Working with Images

# Pull an image from Docker Hub
docker pull ubuntu:22.04

# List all local images
docker images

# Remove an image
docker rmi ubuntu:22.04

# Build an image from a Dockerfile
docker build -t myapp:1.0 .

# Tag an image for pushing to a registry
docker tag myapp:1.0 username/myapp:1.0

# Push an image to a registry
docker push username/myapp:1.0

Running Containers

# Run a container in the foreground
docker run ubuntu:22.04 echo "Hello Docker"

# Run a container in detached mode (background)
docker run -d nginx

# Run with a custom name
docker run -d --name my-nginx nginx

# Run with port mapping (host:container)
docker run -d -p 8080:80 nginx

# Run with environment variables
docker run -d -e POSTGRES_PASSWORD=secret postgres

# Run with a volume mount
docker run -d -v /host/path:/container/path nginx

# Run interactively with a shell
docker run -it ubuntu:22.04 /bin/bash

Managing Containers

# List running containers
docker ps

# List all containers (including stopped)
docker ps -a

# Stop a container
docker stop my-nginx

# Start a stopped container
docker start my-nginx

# Restart a container
docker restart my-nginx

# Remove a container
docker rm my-nginx

# Remove a running container (force)
docker rm -f my-nginx

# View container logs
docker logs my-nginx

# Follow logs in real-time
docker logs -f my-nginx

# Execute a command in a running container
docker exec my-nginx ls /usr/share/nginx/html

# Open a shell in a running container
docker exec -it my-nginx /bin/bash

# View container resource usage
docker stats

# Inspect container details
docker inspect my-nginx

Cleanup Commands

# Remove all stopped containers
docker container prune

# Remove unused images
docker image prune

# Remove unused volumes
docker volume prune

# Remove everything unused: stopped containers, unused images, and networks
# (add --volumes to also remove unused volumes)
docker system prune -a

The -it flags deserve special mention. -i keeps STDIN open (interactive), and -t allocates a pseudo-TTY (terminal). Together, they let you interact with the container like a normal shell session.

Quick tip: Use docker ps -q to get just container IDs, which is useful for scripting. For example, docker stop $(docker ps -q) stops all running containers.

Writing Dockerfiles

A Dockerfile is where the magic happens. It's a recipe for building your Docker image, specifying exactly what goes into your container. Let's break down the most important instructions and best practices.

Basic Dockerfile Structure

# Start from a base image
FROM node:18-alpine

# Set the working directory
WORKDIR /app

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm install

# Copy application code
COPY . .

# Expose the port your app runs on
EXPOSE 3000

# Define the command to run your app
CMD ["node", "server.js"]

Key Dockerfile Instructions

FROM specifies the base image. Always use specific version tags (like node:18-alpine) instead of latest to ensure reproducible builds. Alpine variants are much smaller and have a reduced attack surface.

WORKDIR sets the working directory for subsequent instructions. It creates the directory if it doesn't exist. This is cleaner than using RUN cd /app.

COPY copies files from your host machine into the image. The syntax is COPY source destination. Use COPY . . to copy everything, but be mindful of what you include (use .dockerignore).

RUN executes commands during the build process. Each RUN instruction creates a new layer. Chain commands with && to reduce layers:

# Bad: Creates 3 layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean

# Good: Creates 1 layer
RUN apt-get update && \
    apt-get install -y curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

EXPOSE documents which port your application listens on. It doesn't actually publish the port β€” that happens with docker run -p.

CMD specifies the default command to run when the container starts. Use the JSON array format (["executable", "param1"]) to avoid shell processing issues.

ENTRYPOINT is similar to CMD but harder to override. Use it when you want your container to behave like an executable. You can combine ENTRYPOINT and CMD for flexible defaults.
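For example, an image can behave like a node executable with a default script while still letting callers pass a different one at run time (file names here are illustrative):

FROM node:18-alpine
WORKDIR /app
COPY . .

# ENTRYPOINT is fixed; CMD supplies a default argument that callers can override
ENTRYPOINT ["node"]
CMD ["server.js"]

# docker run myimage            -> runs: node server.js
# docker run myimage worker.js  -> runs: node worker.js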

Multi-Stage Builds

Multi-stage builds let you use multiple FROM statements in one Dockerfile. This is incredibly powerful for creating small production images while keeping build tools separate.

# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
EXPOSE 3000
CMD ["node", "dist/server.js"]

The production image only contains the compiled code and runtime dependencies, not the build tools. This can reduce image size by 70% or more.

.dockerignore File

Create a .dockerignore file to exclude files from the build context. This speeds up builds and reduces image size.

node_modules
npm-debug.log
.git
.env
*.md
.DS_Store
coverage
.vscode

Pro tip: Order your Dockerfile instructions from least to most frequently changing. Put COPY package*.json before COPY . . so Docker can cache the dependency installation layer even when your code changes.

Docker Compose for Multi-Container Apps

Docker Compose is a tool for defining and running multi-container applications. Instead of running multiple docker run commands with dozens of flags, you define everything in a docker-compose.yml file.

Compose is essential for local development when your app needs multiple services: a web server, database, cache, and message queue. It handles networking, volumes, and dependencies automatically.

Basic docker-compose.yml

version: '3.8'

services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://db:5432/myapp
    depends_on:
      - db
      - redis
    volumes:
      - .:/app
      - /app/node_modules

  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=secret
      - POSTGRES_DB=myapp
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  postgres_data:

Essential Compose Commands

# Start all services
docker-compose up

# Start in detached mode
docker-compose up -d

# Rebuild images before starting
docker-compose up --build

# Stop all services
docker-compose down

# Stop and remove volumes
docker-compose down -v

# View logs
docker-compose logs

# Follow logs for a specific service
docker-compose logs -f web

# Execute a command in a service
docker-compose exec web npm test

# List running services
docker-compose ps

# Scale a service
docker-compose up -d --scale web=3

Advanced Compose Features

Health checks tell Compose when a service is actually ready rather than merely started; paired with a depends_on condition (shown after this example), dependent services wait for it:

services:
  db:
    image: postgres:15-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
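For the web service to actually wait until the database reports healthy, pair the health check with the long-form depends_on syntax from the Compose specification:

services:
  web:
    build: .
    depends_on:
      db:
        condition: service_healthy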

Environment files keep secrets out of your compose file:

services:
  web:
    env_file:
      - .env
      - .env.local

Networks let you control how services communicate:

services:
  web:
    networks:
      - frontend
      - backend
  
  db:
    networks:
      - backend

networks:
  frontend:
  backend:

This setup keeps the database reachable only from services on the backend network (such as web); anything attached only to the frontend network can't reach it directly, adding a layer of security.

Quick tip: Use docker-compose config to validate your compose file and see the final configuration after variable substitution. It's a lifesaver for debugging complex setups.

Docker Networking and Volumes

Understanding Docker networking and volumes is crucial for building real applications. These features let containers communicate and persist data beyond their ephemeral lifecycle.

Docker Networks

Docker creates isolated networks for containers. On a user-defined network, containers can reach each other using their container names (or Compose service names) as hostnames.

# Create a custom network
docker network create myapp-network

# Run containers on the network
docker run -d --name db --network myapp-network postgres
docker run -d --name web --network myapp-network myapp

# Inside the web container, you can connect to postgres://db:5432

Docker supports several network drivers: bridge (the default for standalone containers), host (the container shares the host's network stack, with no isolation), overlay (multi-host networking, typically used with Swarm), macvlan (gives a container its own MAC address on the physical network), and none (networking disabled).
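Creating a network with a specific driver is a one-liner; for example (network names are arbitrary, and overlay networks require Swarm mode):

# Bridge is the default driver for user-defined networks
docker network create --driver bridge app-net

# Overlay networks span multiple hosts (requires: docker swarm init)
docker network create --driver overlay services-net

# List networks and their drivers
docker network ls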

Docker Volumes

Volumes are the preferred way to persist data. They're managed by Docker and stored outside the container filesystem.

# Create a named volume
docker volume create mydata

# Use the volume in a container
docker run -d -v mydata:/data postgres

# List volumes
docker volume ls

# Inspect a volume
docker volume inspect mydata

# Remove a volume
docker volume rm mydata

You can also use bind mounts to mount a host directory directly:

# Bind mount (absolute path required)
docker run -d -v /host/path:/container/path nginx

# Modern syntax (more explicit)
docker run -d --mount type=bind,source=/host/path,target=/container/path nginx

Bind mounts are great for development (live code reloading), while named volumes are better for production (Docker manages them, they're portable, and they work on all platforms).

| Feature     | Named Volumes                | Bind Mounts                |
| ----------- | ---------------------------- | -------------------------- |
| Management  | Docker manages location      | You specify exact path     |
| Portability | Works everywhere             | Host-dependent             |
| Performance | Optimized by Docker          | Direct filesystem access   |
| Use Case    | Production data persistence  | Development, config files  |
| Backup      | Use docker commands          | Standard filesystem tools  |

Docker Best Practices

Following best practices makes your Docker images smaller, more secure, and easier to maintain. These guidelines come from years of production experience.

Image Size Optimization

Start from slim base images (the Alpine variants used throughout this guide), use multi-stage builds so compilers and build tools never reach the final image, and maintain a .dockerignore file so the build context contains only what the image actually needs.

Security Best Practices

Don't run application processes as root. Create a dedicated user in your Dockerfile and switch to it, and keep secrets out of the image by supplying them as environment variables at runtime:

FROM node:18-alpine
RUN addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001
USER nodejs

Build Optimization

Order instructions from least to most frequently changing so the layer cache does as much work as possible, chain related RUN commands to avoid unnecessary layers, and pin base image versions for reproducible builds.

Runtime Best Practices

Don't let a single container monopolize the host. Set memory and CPU limits when you start containers and monitor usage with docker stats:

docker run -d --memory="512m" --cpus="1.0" myapp

Pro tip: Use docker history myapp:1.0 to see all layers in your image and their sizes. This helps identify which instructions are bloating your image.

Common Pitfalls and How to Avoid Them

Even experienced developers make these mistakes. Learning to recognize and avoid them will save you hours of debugging.

Pitfall 1: Using :latest in Production

The latest tag is a moving target. Your builds become non-reproducible, and you might pull a breaking change without realizing it.

Solution: Always use specific version tags: postgres:15.2-alpine instead of postgres:latest.

Pitfall 2: Storing Data in Containers

Containers are ephemeral. Any data written to the container filesystem disappears when the container is removed.

Solution: Use volumes for all persistent data. Database files, uploaded content, logs β€” anything you need to keep.
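For example, keeping a database's data in a named volume (the volume and container names are arbitrary):

# Data lives in the pgdata volume, not in the container's filesystem
docker run -d --name db -v pgdata:/var/lib/postgresql/data -e POSTGRES_PASSWORD=secret postgres:15-alpine

# The container can be removed and recreated without losing the data
docker rm -f db
docker run -d --name db -v pgdata:/var/lib/postgresql/data -e POSTGRES_PASSWORD=secret postgres:15-alpine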

Pitfall 3: Running as Root

By default, processes in containers run as root. This is a security risk if the container is compromised.

Solution: Create and switch to a non-root user in your Dockerfile. Most official images provide a non-root user you can switch to.
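For instance, the official Node.js images already ship with a node user you can switch to instead of creating one yourself (a minimal sketch):

FROM node:18-alpine
WORKDIR /app
COPY --chown=node:node . .

# Use the non-root user the official image provides
USER node
CMD ["node", "server.js"]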

Pitfall 4: Ignoring Layer Caching

Poor Dockerfile instruction ordering means Docker rebuilds everything on every code change, even if dependencies haven't changed.

Solution: Copy dependency files first, install them, then copy application code. This way, the dependency layer is cached.

Pitfall 5: Exposing Unnecessary Ports

Publishing all ports with -P or exposing internal services to the host network creates security vulnerabilities.

Solution: Only publish ports that need external access. Use Docker networks for inter-container communication.

Pitfall 6: Not Setting Resource Limits

A runaway container can consume all host resources, crashing other containers and the host itself.

Solution: Always set memory and CPU limits in production. Use docker stats to monitor resource usage.

Pitfall 7: Hardcoding Configuration

Baking configuration into images means rebuilding for every environment (dev, staging, production).

Solution: Use environment variables for configuration. Pass them with -e or --env-file.
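For example, the same image can be pointed at different environments purely through variables (file and variable names below are illustrative):

# Pass individual variables
docker run -d -e DATABASE_URL=postgresql://db:5432/myapp -e LOG_LEVEL=debug myapp:1.0

# Or load a whole file of KEY=value pairs per environment
docker run -d --env-file .env.production myapp:1.0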

Quick tip: Use our Dockerfile Generator to create optimized Dockerfiles that follow best practices automatically.

Real-World Use Cases

Let's look at practical examples of how teams use Docker to solve real problems.

Microservices Architecture

A typical e-commerce platform might have separate services for users, products, orders, payments, and notifications. Each service runs in its own container with its own database.

Docker makes this manageable. Each team can develop, test, and deploy their service independently. Services communicate over Docker networks using service discovery.

services:
  user-service:
    build: ./services/users
    environment:
      - DB_URL=postgresql://user-db:5432/users
  
  product-service:
    build: ./services/products
    environment:
      - DB_URL=postgresql://product-db:5432/products
  
  order-service:
    build: ./services/orders
    environment:
      - USER_SERVICE_URL=http://user-service:3000
      - PRODUCT_SERVICE_URL=http://product-service:3000

CI/CD Pipelines

Docker ensures your tests run in the same environment as production. Your CI pipeline builds a Docker image, runs tests inside a container, and pushes the image to a registry if tests pass.

# .github/workflows/ci.yml
- name: Build image
  run: docker build -t myapp:${{ github.sha }} .

- name: Run tests
  run: docker run