Docker for Beginners: Containers Made Simple
12 min read
Table of Contents
- What Is Docker?
- Containers vs Virtual Machines
- Core Docker Concepts
- Essential Docker Commands
- Writing Dockerfiles
- Docker Compose for Multi-Container Apps
- Docker Networking and Volumes
- Docker Best Practices
- Common Pitfalls and How to Avoid Them
- Real-World Use Cases
- Frequently Asked Questions
- Related Articles
What Is Docker?
Docker is a platform that lets you package applications and their dependencies into lightweight, portable containers. Think of a container as a tiny, self-contained box that holds everything your app needs to run: code, runtime, libraries, and system tools. No more "it works on my machine" problems.
Before Docker, deploying software meant manually installing dependencies, configuring servers, and hoping nothing conflicted. You'd spend hours setting up Python versions, Node.js packages, database drivers, and system libraries. Then you'd do it all over again on staging and production servers, praying the configurations matched.
Docker eliminates this chaos by ensuring your app runs identically everywhere: your laptop, a teammate's machine, staging servers, or production. The container becomes the unit of deployment, not the application code alone.
Docker has become the standard tool for modern software development. Whether you're building microservices, setting up CI/CD pipelines, or just want a consistent development environment, Docker makes it possible with minimal overhead. Companies like Netflix, Spotify, and PayPal run millions of containers in production every day.
Quick tip: Docker isn't just for production deployments. Many developers use it to avoid cluttering their local machine with different language versions, databases, and tools. Need PostgreSQL for one project and MySQL for another? Run them both in containers without conflicts.
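As a concrete sketch of that tip, both databases can run side by side, each published on its own host port. Container names, ports, and passwords here are arbitrary placeholders; this assumes Docker is installed and running.

```shell
# PostgreSQL for project A, published on host port 5432
docker run -d --name projA-postgres -e POSTGRES_PASSWORD=devpass -p 5432:5432 postgres:15-alpine

# MySQL for project B, published on host port 3306
docker run -d --name projB-mysql -e MYSQL_ROOT_PASSWORD=devpass -p 3306:3306 mysql:8

# Tear both down when you're done; nothing was installed on the host
docker rm -f projA-postgres projB-mysql
```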
Containers vs Virtual Machines
Containers and virtual machines both provide isolation, but they work very differently under the hood. Understanding this difference is crucial to appreciating why containers have become so popular.
Virtual Machines run a full operating system with its own kernel on top of a hypervisor. Each VM needs its own OS installation, consuming gigabytes of disk space and significant memory. Boot times are measured in minutes. If you run three VMs, you're running three complete operating systems simultaneously.
Containers share the host OS kernel and only package the application layer. They're megabytes in size (not gigabytes), start in seconds (not minutes), and you can run dozens on a single machine without breaking a sweat.
# VM approach: Each app gets a full OS
App A → Guest OS → Hypervisor → Host OS → Hardware
App B → Guest OS → Hypervisor → Host OS → Hardware
# Container approach: Apps share the kernel
App A → Container Runtime → Host OS → Hardware
App B → Container Runtime → Host OS → Hardware
This lightweight architecture makes containers ideal for microservices, where you might run hundreds of small services instead of one monolithic application. The resource efficiency is staggering: a server that runs 10 VMs might comfortably run 100 containers.
| Feature | Virtual Machines | Containers |
|---|---|---|
| Startup Time | Minutes | Seconds |
| Disk Space | Gigabytes (full OS) | Megabytes (app layer only) |
| Performance | Near-native | Native (no hypervisor overhead) |
| Isolation | Complete (separate kernel) | Process-level (shared kernel) |
| Portability | Limited (hypervisor-dependent) | High (runs anywhere Docker runs) |
| Resource Usage | Heavy | Lightweight |
That said, VMs aren't obsolete. They provide stronger isolation since each VM has its own kernel. For security-critical workloads or when you need to run different operating systems, VMs remain the better choice. Many organizations use both: VMs for infrastructure isolation and containers for application deployment.
Core Docker Concepts
Docker has a few key concepts you need to understand before diving into commands and Dockerfiles. These building blocks form the foundation of how Docker works.
Images
A Docker image is a read-only template containing your application code, runtime, libraries, and dependencies. Think of it as a snapshot or blueprint. Images are built from instructions in a Dockerfile and stored in registries like Docker Hub.
Images are composed of layers. Each instruction in a Dockerfile creates a new layer. Docker caches these layers, so if you rebuild an image and only the last layer changed, Docker reuses the cached layers. This makes builds incredibly fast.
Containers
A container is a running instance of an image. You can create multiple containers from the same image, and each runs in isolation. When you stop a container, any changes made inside it are lost unless you explicitly save them or use volumes.
Containers are ephemeral by design. This disposability is a feature, not a bug: it ensures consistency and makes scaling trivial.
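A quick way to see this ephemerality for yourself (assumes Docker is installed; the container name and file path are arbitrary):

```shell
# Write a file inside a container, then remove the container
docker run --name scratchpad ubuntu:22.04 bash -c 'echo hello > /tmp/note.txt'
docker rm scratchpad

# A fresh container from the same image shows no note.txt
docker run --rm ubuntu:22.04 ls /tmp
```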
Dockerfile
A Dockerfile is a text file containing instructions to build a Docker image. It specifies the base image, copies your code, installs dependencies, and defines how to run your application. We'll dive deeper into Dockerfiles in a later section.
Docker Registry
A registry is a storage and distribution system for Docker images. Docker Hub is the default public registry, hosting millions of images. You can also run private registries for proprietary applications. When you run docker pull nginx, Docker downloads the nginx image from Docker Hub.
Volumes
Volumes are Docker's mechanism for persisting data. Since containers are ephemeral, any data written inside a container disappears when it stops. Volumes let you store data outside the container filesystem, surviving container restarts and deletions.
Pro tip: Use our Docker Command Generator to quickly create complex Docker commands without memorizing all the flags and options.
Essential Docker Commands
Let's walk through the Docker commands you'll use daily. These cover the entire container lifecycle from pulling images to cleaning up resources.
Working with Images
# Pull an image from Docker Hub
docker pull ubuntu:22.04
# List all local images
docker images
# Remove an image
docker rmi ubuntu:22.04
# Build an image from a Dockerfile
docker build -t myapp:1.0 .
# Tag an image for pushing to a registry
docker tag myapp:1.0 username/myapp:1.0
# Push an image to a registry
docker push username/myapp:1.0
Running Containers
# Run a container in the foreground
docker run ubuntu:22.04 echo "Hello Docker"
# Run a container in detached mode (background)
docker run -d nginx
# Run with a custom name
docker run -d --name my-nginx nginx
# Run with port mapping (host:container)
docker run -d -p 8080:80 nginx
# Run with environment variables
docker run -d -e POSTGRES_PASSWORD=secret postgres
# Run with a volume mount
docker run -d -v /host/path:/container/path nginx
# Run interactively with a shell
docker run -it ubuntu:22.04 /bin/bash
Managing Containers
# List running containers
docker ps
# List all containers (including stopped)
docker ps -a
# Stop a container
docker stop my-nginx
# Start a stopped container
docker start my-nginx
# Restart a container
docker restart my-nginx
# Remove a container
docker rm my-nginx
# Remove a running container (force)
docker rm -f my-nginx
# View container logs
docker logs my-nginx
# Follow logs in real-time
docker logs -f my-nginx
# Execute a command in a running container
docker exec my-nginx ls /usr/share/nginx/html
# Open a shell in a running container
docker exec -it my-nginx /bin/bash
# View container resource usage
docker stats
# Inspect container details
docker inspect my-nginx
Cleanup Commands
# Remove all stopped containers
docker container prune
# Remove unused images
docker image prune
# Remove unused volumes
docker volume prune
# Remove everything unused (containers, images, networks, volumes)
docker system prune -a
The -it flags deserve special mention. -i keeps STDIN open (interactive), and -t allocates a pseudo-TTY (terminal). Together, they let you interact with the container like a normal shell session.
Quick tip: Use docker ps -q to get just container IDs, which is useful for scripting. For example, docker stop $(docker ps -q) stops all running containers.
Writing Dockerfiles
A Dockerfile is where the magic happens. It's a recipe for building your Docker image, specifying exactly what goes into your container. Let's break down the most important instructions and best practices.
Basic Dockerfile Structure
# Start from a base image
FROM node:18-alpine
# Set the working directory
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install dependencies
RUN npm install
# Copy application code
COPY . .
# Expose the port your app runs on
EXPOSE 3000
# Define the command to run your app
CMD ["node", "server.js"]
Key Dockerfile Instructions
FROM specifies the base image. Always use specific version tags (like node:18-alpine) instead of latest to ensure reproducible builds. Alpine variants are smaller and more secure.
WORKDIR sets the working directory for subsequent instructions. It creates the directory if it doesn't exist. This is cleaner than using RUN cd /app.
COPY copies files from your host machine into the image. The syntax is COPY source destination. Use COPY . . to copy everything, but be mindful of what you include (use .dockerignore).
RUN executes commands during the build process. Each RUN instruction creates a new layer. Chain commands with && to reduce layers:
# Bad: Creates 3 layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
# Good: Creates 1 layer
RUN apt-get update && \
    apt-get install -y curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
EXPOSE documents which port your application listens on. It doesn't actually publish the port; publishing happens with docker run -p.
CMD specifies the default command to run when the container starts. Use the JSON array format (["executable", "param1"]) to avoid shell processing issues.
ENTRYPOINT is similar to CMD but harder to override. Use it when you want your container to behave like an executable. You can combine ENTRYPOINT and CMD for flexible defaults.
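Here's a minimal sketch of that ENTRYPOINT/CMD combination (the image and arguments are illustrative):

```dockerfile
# The container behaves like the `ping` executable;
# CMD supplies a default argument that `docker run` can override.
FROM alpine:3.19
ENTRYPOINT ["ping", "-c", "3"]
CMD ["localhost"]
```

If you build this as a hypothetical image called myping, docker run myping pings localhost three times, while docker run myping example.com overrides only the CMD part and keeps the entrypoint.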
Multi-Stage Builds
Multi-stage builds let you use multiple FROM statements in one Dockerfile. This is incredibly powerful for creating small production images while keeping build tools separate.
# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
# Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
EXPOSE 3000
CMD ["node", "dist/server.js"]
The production image only contains the compiled code and runtime dependencies, not the build tools. This can reduce image size by 70% or more.
.dockerignore File
Create a .dockerignore file to exclude files from the build context. This speeds up builds and reduces image size.
node_modules
npm-debug.log
.git
.env
*.md
.DS_Store
coverage
.vscode
Pro tip: Order your Dockerfile instructions from least to most frequently changing. Put COPY package*.json before COPY . . so Docker can cache the dependency installation layer even when your code changes.
Docker Compose for Multi-Container Apps
Docker Compose is a tool for defining and running multi-container applications. Instead of running multiple docker run commands with dozens of flags, you define everything in a docker-compose.yml file. (Modern Docker installs ship Compose as a CLI plugin invoked as docker compose, with a space; the standalone docker-compose binary used below accepts the same subcommands.)
Compose is essential for local development when your app needs multiple services: a web server, database, cache, and message queue. It handles networking, volumes, and dependencies automatically.
Basic docker-compose.yml
version: '3.8'
services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://db:5432/myapp
    depends_on:
      - db
      - redis
    volumes:
      - .:/app
      - /app/node_modules
  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=secret
      - POSTGRES_DB=myapp
    volumes:
      - postgres_data:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
volumes:
  postgres_data:
Essential Compose Commands
# Start all services
docker-compose up
# Start in detached mode
docker-compose up -d
# Rebuild images before starting
docker-compose up --build
# Stop all services
docker-compose down
# Stop and remove volumes
docker-compose down -v
# View logs
docker-compose logs
# Follow logs for a specific service
docker-compose logs -f web
# Execute a command in a service
docker-compose exec web npm test
# List running services
docker-compose ps
# Scale a service
docker-compose up -d --scale web=3
Advanced Compose Features
Health checks ensure a service is ready before dependent services start:
services:
  db:
    image: postgres:15-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
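With that health check in place, a dependent service can wait for the database to be ready rather than merely started, using the Compose spec's long-form depends_on (service names match the earlier example):

```yaml
services:
  web:
    build: .
    depends_on:
      db:
        condition: service_healthy
```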
Environment files keep secrets out of your compose file:
services:
  web:
    env_file:
      - .env
      - .env.local
Networks let you control how services communicate:
services:
  web:
    networks:
      - frontend
      - backend
  db:
    networks:
      - backend
networks:
  frontend:
  backend:
This setup keeps the database reachable only from the backend network: web can talk to db, but anything confined to the frontend network cannot, adding a layer of security.
Quick tip: Use docker-compose config to validate your compose file and see the final configuration after variable substitution. It's a lifesaver for debugging complex setups.
Docker Networking and Volumes
Understanding Docker networking and volumes is crucial for building real applications. These features let containers communicate and persist data beyond their ephemeral lifecycle.
Docker Networks
Docker creates isolated networks for containers. By default, containers on the same network can communicate using service names as hostnames.
# Create a custom network
docker network create myapp-network
# Run containers on the network
docker run -d --name db --network myapp-network postgres
docker run -d --name web --network myapp-network myapp
# Inside the web container, you can connect to postgres://db:5432
Docker supports several network drivers:
- bridge (default): Isolated network on a single host
- host: Container uses the host's network directly (no isolation)
- overlay: Multi-host networking for Docker Swarm
- none: No networking
Docker Volumes
Volumes are the preferred way to persist data. They're managed by Docker and stored outside the container filesystem.
# Create a named volume
docker volume create mydata
# Use the volume in a container
docker run -d -v mydata:/data postgres
# List volumes
docker volume ls
# Inspect a volume
docker volume inspect mydata
# Remove a volume
docker volume rm mydata
You can also use bind mounts to mount a host directory directly:
# Bind mount (absolute path required)
docker run -d -v /host/path:/container/path nginx
# Modern syntax (more explicit)
docker run -d --mount type=bind,source=/host/path,target=/container/path nginx
Bind mounts are great for development (live code reloading), while named volumes are better for production (Docker manages them, they're portable, and they work on all platforms).
| Feature | Named Volumes | Bind Mounts |
|---|---|---|
| Management | Docker manages location | You specify exact path |
| Portability | Works everywhere | Host-dependent |
| Performance | Optimized by Docker | Direct filesystem access |
| Use Case | Production data persistence | Development, config files |
| Backup | Use docker commands | Standard filesystem tools |
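The "use docker commands" backup pattern is worth spelling out: a throwaway container mounts both the volume and a host directory, then tars the data across. The volume name and filenames below are placeholders, and this mirrors the pattern shown in Docker's own documentation.

```shell
# Back up the "mydata" volume into a tarball in the current directory
docker run --rm -v mydata:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/mydata-backup.tar.gz -C /data .

# Restore the tarball into a volume (creates "mydata" if it doesn't exist)
docker run --rm -v mydata:/data -v "$(pwd)":/backup alpine \
  tar xzf /backup/mydata-backup.tar.gz -C /data
```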
Docker Best Practices
Following best practices makes your Docker images smaller, more secure, and easier to maintain. These guidelines come from years of production experience.
Image Size Optimization
- Use Alpine-based images when possible (node:18-alpine vs node:18)
- Use multi-stage builds to exclude build tools from production images
- Chain RUN commands to reduce layers
- Remove package manager caches: apt-get clean, rm -rf /var/lib/apt/lists/*
- Use .dockerignore to exclude unnecessary files
Security Best Practices
- Never run containers as root. Create a non-privileged user:

FROM node:18-alpine
RUN addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001
USER nodejs

- Use specific image tags, never latest
- Scan images for vulnerabilities: docker scout cves myapp:1.0 (the newer replacement for the deprecated docker scan)
- Don't store secrets in images. Use environment variables or secret management tools
- Keep base images updated to get security patches
- Use read-only filesystems when possible: docker run --read-only
Build Optimization
- Order Dockerfile instructions from least to most frequently changing
- Copy dependency files before application code to leverage layer caching
- Use COPY instead of ADD unless you need ADD's special features
- Minimize the number of layers (combine RUN commands)
- Use BuildKit for faster builds: DOCKER_BUILDKIT=1 docker build (BuildKit is the default builder in current Docker releases)
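BuildKit also enables cache mounts, which take layer caching a step further: the package manager's download cache persists across builds without ending up in any image layer. A sketch, assuming a Node app like the earlier examples:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# npm's download cache lives in a build-time mount, not in the image
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
CMD ["node", "server.js"]
```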
Runtime Best Practices
- Set resource limits to prevent containers from consuming all host resources: docker run -d --memory="512m" --cpus="1.0" myapp
- Use health checks to ensure containers are actually ready
- Implement graceful shutdown by handling SIGTERM signals
- Use restart policies for automatic recovery: --restart=unless-stopped
- Log to STDOUT/STDERR, not files (Docker captures these automatically)
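The SIGTERM point deserves a sketch. docker stop sends SIGTERM and, after a grace period (10 seconds by default), SIGKILL; an entrypoint that ignores TERM always eats the full timeout. This minimal, self-contained script signals itself to stand in for docker stop:

```shell
#!/bin/sh
# Trap SIGTERM so cleanup runs instead of the process dying abruptly.
graceful=no
cleanup() {
  graceful=yes
  echo "SIGTERM received, shutting down"
  kill "$child" 2>/dev/null   # stop the stand-in server process
}
trap cleanup TERM

echo "app started"
sleep 30 &                     # stand-in for the real long-running server
child=$!
kill -TERM $$                  # simulate `docker stop` signalling PID 1
wait "$child" 2>/dev/null
echo "graceful=$graceful"
```

In a real entrypoint the same trap-and-wait shape lets the main process close connections and flush state before exiting.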
Pro tip: Use docker history myapp:1.0 to see all layers in your image and their sizes. This helps identify which instructions are bloating your image.
Common Pitfalls and How to Avoid Them
Even experienced developers make these mistakes. Learning to recognize and avoid them will save you hours of debugging.
Pitfall 1: Using :latest in Production
The latest tag is a moving target. Your builds become non-reproducible, and you might pull a breaking change without realizing it.
Solution: Always use specific version tags: postgres:15.2-alpine instead of postgres:latest.
Pitfall 2: Storing Data in Containers
Containers are ephemeral. Any data written to the container filesystem disappears when the container is removed.
Solution: Use volumes for any data you need to keep: database files, uploaded content, logs.
Pitfall 3: Running as Root
By default, processes in containers run as root. This is a security risk if the container is compromised.
Solution: Create and switch to a non-root user in your Dockerfile. Most official images provide a non-root user you can switch to.
Pitfall 4: Ignoring Layer Caching
Poor Dockerfile instruction ordering means Docker rebuilds everything on every code change, even if dependencies haven't changed.
Solution: Copy dependency files first, install them, then copy application code. This way, the dependency layer is cached.
Pitfall 5: Exposing Unnecessary Ports
Publishing all ports with -P or exposing internal services to the host network creates security vulnerabilities.
Solution: Only publish ports that need external access. Use Docker networks for inter-container communication.
Pitfall 6: Not Setting Resource Limits
A runaway container can consume all host resources, crashing other containers and the host itself.
Solution: Always set memory and CPU limits in production. Use docker stats to monitor resource usage.
Pitfall 7: Hardcoding Configuration
Baking configuration into images means rebuilding for every environment (dev, staging, production).
Solution: Use environment variables for configuration. Pass them with -e or --env-file.
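In practice that means building one image and varying only the environment at run time. The image name, variable names, and prod.env file below are hypothetical:

```shell
# Same image, different configuration per environment
docker run -d --name myapp-staging -e APP_ENV=staging -e LOG_LEVEL=debug myapp:1.0
docker run -d --name myapp-prod --env-file ./prod.env myapp:1.0
```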
Quick tip: Use our Dockerfile Generator to create optimized Dockerfiles that follow best practices automatically.
Real-World Use Cases
Let's look at practical examples of how teams use Docker to solve real problems.
Microservices Architecture
A typical e-commerce platform might have separate services for users, products, orders, payments, and notifications. Each service runs in its own container with its own database.
Docker makes this manageable. Each team can develop, test, and deploy their service independently. Services communicate over Docker networks using service discovery.
services:
  user-service:
    build: ./services/users
    environment:
      - DB_URL=postgresql://user-db:5432/users
  product-service:
    build: ./services/products
    environment:
      - DB_URL=postgresql://product-db:5432/products
  order-service:
    build: ./services/orders
    environment:
      - USER_SERVICE_URL=http://user-service:3000
      - PRODUCT_SERVICE_URL=http://product-service:3000
CI/CD Pipelines
Docker ensures your tests run in the same environment as production. Your CI pipeline builds a Docker image, runs tests inside a container, and pushes the image to a registry if tests pass.
# .github/workflows/ci.yml
- name: Build image
  run: docker build -t myapp:${{ github.sha }} .
- name: Run tests
  run: docker run