Docker Volumes Basics
Docker containers are ephemeral by design: anything written to a container's writable layer is deleted along with the container. Docker volumes let containers store data outside the container filesystem so it survives restarts, updates, and redeploys. This tutorial covers the core concepts, a copy-paste example, common mistakes, and practical best practices for persisting data.
Concept explained
- Anonymous volume: created automatically by Docker with a random ID as its name; hard to find and reattach later.
- Named volume: a Docker-managed volume identified by name; easy to reuse and back up.
- Bind mount: maps a host directory into the container; useful for local development but less portable.
- tmpfs: in-memory mount; its contents are lost as soon as the container stops.
Key behaviors:
- Volumes are managed outside the container filesystem and outlive containers.
- Ownership and permissions matter: files are created with the UID/GID of the process running inside the container.
- Volumes are stored on the host (or the PaaS storage system) and can be backed up.
Use named volumes for production data (databases, uploads) and bind mounts for local dev.
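To make the four storage types concrete, the sketch below maps each one to the matching --mount argument of docker run. The helper name mount_flag and its argument order are hypothetical; only the type=, source=, and target= keys are Docker's own syntax.

```shell
# mount_flag KIND TARGET [SOURCE]
# Prints the value to pass to "docker run --mount" for one storage type.
# The function name and argument order are illustrative, not part of Docker.
mount_flag() {
  kind=$1 target=$2 src=${3:-}
  case "$kind" in
    named|bind)
      # Named volumes need a volume name, bind mounts need a host path.
      [ -n "$src" ] || { echo "$kind mount needs a source" >&2; return 1; }
      ;;
  esac
  case "$kind" in
    named)     echo "type=volume,source=$src,target=$target" ;;
    anonymous) echo "type=volume,target=$target" ;;
    bind)      echo "type=bind,source=$src,target=$target" ;;
    tmpfs)     echo "type=tmpfs,target=$target" ;;
    *)         echo "unknown mount kind: $kind" >&2; return 1 ;;
  esac
}

# Example: put the Postgres data directory on a named volume.
# docker run -d --mount "$(mount_flag named /var/lib/postgresql/data pgdata)" postgres:15
```

The only difference between a named and an anonymous volume in this form is whether a source is given, which is why anonymous volumes are easy to create by accident.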
Step-by-step example
Example: run Postgres with a named volume (minimal commands):
- Create a named volume (optional; Docker creates it automatically on first use):
docker volume create pgdata
- Run Postgres using the named volume:
docker run -d \
--name pg \
-e POSTGRES_PASSWORD=secret \
-v pgdata:/var/lib/postgresql/data \
postgres:15
- Inspect the volume:
docker volume ls
docker volume inspect pgdata
- Stop and remove the container (data persists):
docker stop pg
docker rm pg
# data remains in the volume
- Start a new Postgres container reusing the volume:
docker run -d --name pg2 -e POSTGRES_PASSWORD=secret -v pgdata:/var/lib/postgresql/data postgres:15
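One way to convince yourself that the data really survives is to write a row before removing the first container and read it back from the second. The sketch below assumes the pg container from the walkthrough is running; the function names and the notes table are hypothetical, and polling 127.0.0.1 is an assumption made here to avoid talking to the temporary server the image runs during first-time initialization.

```shell
# wait_for_pg CONTAINER
# Polls until Postgres accepts TCP connections inside the container (about 30 s max).
wait_for_pg() {
  i=0
  while [ "$i" -lt 30 ]; do
    if docker exec "$1" pg_isready -h 127.0.0.1 -U postgres >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Postgres in $1 did not become ready" >&2
  return 1
}

# check_persistence
# Writes a row through container "pg", replaces it with "pg2" on the same
# volume, and prints the row count seen by the new container.
check_persistence() {
  wait_for_pg pg || return 1
  docker exec pg psql -U postgres -c \
    "CREATE TABLE IF NOT EXISTS notes (body text); INSERT INTO notes VALUES ('hello');" || return 1
  docker stop pg >/dev/null && docker rm pg >/dev/null || return 1
  docker run -d --name pg2 -e POSTGRES_PASSWORD=secret \
    -v pgdata:/var/lib/postgresql/data postgres:15 >/dev/null || return 1
  wait_for_pg pg2 || return 1
  docker exec pg2 psql -U postgres -tAc "SELECT count(*) FROM notes;"
}
```

If the volume is doing its job, the last command should report a non-zero count even though the row was written by a container that no longer exists.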
Docker Compose example (recommended for multi-service apps):
docker-compose.yml:
services:
  web:
    build: .  # builds using ./Dockerfile
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://postgres:password@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_DB: app  # create the "app" database that DATABASE_URL points to
    volumes:
      - db-data:/var/lib/postgresql/data
volumes:
  db-data:
Run:
docker compose up -d
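Compose prefixes volume names with the project name, so db-data ends up as something like myapp_db-data rather than plain db-data. A small helper such as the one below can list the volumes that belong to a project; the function name and the project name myapp are hypothetical, while the label is the one Compose attaches to the resources it creates.

```shell
# compose_volumes PROJECT
# Lists the volumes Compose created for one project, using the
# com.docker.compose.project label that Compose sets on them.
compose_volumes() {
  docker volume ls --filter "label=com.docker.compose.project=$1" --format '{{.Name}}'
}

# Example (project name assumed to be "myapp"):
# compose_volumes myapp
```

Note that docker compose down keeps these volumes, while docker compose down -v removes them together with their data.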
Variations & gotchas
- Bind mount (host folder):
docker run -v /home/me/app-data:/data myimage
Good for dev; less portable and may expose host files.
- tmpfs (fast, not persistent):
docker run --tmpfs /cache:rw,size=100m myimage
- Lost data scenarios:
  - Using anonymous volumes (Docker generates a random name) makes it hard to attach a new container later.
  - Using tmpfs loses data when the container stops; using in-container storage (no volume) loses it on container removal.
  - Removing a volume deletes its data: after docker volume rm pgdata, the data is gone unless you have a backup.
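Because removing a volume is irreversible, it can help to wrap the removal in a couple of checks. The following is a minimal sketch, not a standard tool: the function names are hypothetical, and requiring a backup archive before deletion is a policy chosen here for illustration.

```shell
# safe_volume_rm VOLUME BACKUP_ARCHIVE
# Removes a volume only if no container (running or stopped) references it
# and the given backup archive exists and is non-empty.
safe_volume_rm() {
  volume=$1 backup=$2
  users=$(docker ps -a --filter "volume=$volume" --format '{{.Names}}')
  if [ -n "$users" ]; then
    echo "refusing: $volume is used by: $users" >&2
    return 1
  fi
  if [ ! -s "$backup" ]; then
    echo "refusing: no backup found at $backup" >&2
    return 1
  fi
  docker volume rm "$volume"
}

# list_dangling_volumes
# Prints volumes that no container references; forgotten anonymous
# volumes usually show up here.
list_dangling_volumes() {
  docker volume ls --filter dangling=true --format '{{.Name}}'
}
```

Docker itself already refuses to remove a volume that a container still uses; the extra check here mainly gives a clearer message and adds the backup requirement.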
Common mistakes
- Relying on container filesystem for persistent data (data disappears on container removal).
- Using anonymous volumes unintentionally (check docker inspect or docker volume ls).
- Incorrect permissions: files created as root inside the container and unreadable by the app process.
- Mounting host directories with wrong SELinux context on systems with SELinux enabled.
- Assuming bind mounts work the same across different hosts or PaaS environments.
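The SELinux mistake in particular is easy to illustrate. On SELinux hosts a bind mount usually needs the :Z (private to one container) or :z (shared between containers) option so Docker relabels the directory. The helper below is a hypothetical sketch that adds :Z only when getenforce reports SELinux as active; the function name and example paths are assumptions.

```shell
# bind_mount_spec HOST_DIR CONTAINER_DIR
# Prints a value for "docker run -v", appending :Z when SELinux is active.
# Avoid relabeling system directories such as /home or /usr themselves.
bind_mount_spec() {
  host_dir=$1 container_dir=$2
  if command -v getenforce >/dev/null 2>&1 && [ "$(getenforce)" != "Disabled" ]; then
    echo "$host_dir:$container_dir:Z"
  else
    echo "$host_dir:$container_dir"
  fi
}

# Example:
# docker run -v "$(bind_mount_spec /home/me/app-data /data)" myimage
```

Named volumes sidestep this problem because Docker manages their labels itself.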
Best practices
- Use named volumes for production database and persistent app data.
- Define volumes in docker-compose.yml for repeatability and CI/CD.
- Initialize volume ownership in an entrypoint script or Dockerfile (chown if necessary).
- Back up volumes regularly (see example backup commands below).
- Keep data and code separate: avoid writing app logs or data into the container's writable layer.
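For the ownership point above, an entrypoint script might include something like the following sketch. The function name ensure_owner, the /data path, and the 1000:1000 IDs are assumptions for illustration; it relies on stat -c, which GNU coreutils and BusyBox both provide.

```shell
# ensure_owner DIR UID GID
# Recursively changes ownership of DIR only when its top-level owner differs,
# which avoids a slow recursive chown on every container start.
ensure_owner() {
  dir=$1 uid=$2 gid=$3
  current=$(stat -c '%u:%g' "$dir") || return 1
  if [ "$current" != "$uid:$gid" ]; then
    chown -R "$uid:$gid" "$dir"
  fi
}

# In an entrypoint this would typically run as root before starting the app
# as the unprivileged user (for example with gosu or su-exec, if the image
# ships one):
#   ensure_owner /data 1000 1000
```

Checking only the top-level directory is a deliberate shortcut: it is fast, but it will not notice individual files inside the volume that have the wrong owner.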
Backup example (archive a named volume, using a helper container):
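# Stop the database container first so the archived files are consistent.
docker run --rm -v pgdata:/data -v "$(pwd)":/backup alpine \
sh -c "cd /data && tar czf /backup/pgdata-backup.tar.gz ."
Restore example:
# Restore into an empty volume while no container is using it.
docker run --rm -v pgdata:/data -v "$(pwd)":/backup alpine \
sh -c "cd /data && tar xzvf /backup/pgdata-backup.tar.gz"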
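A backup is only useful if it restores, so it is worth rehearsing the restore against a throwaway volume rather than the real one. The sketch below assumes an archive produced by the backup command above; the function name test_restore and the restore-test- volume prefix are hypothetical.

```shell
# test_restore ARCHIVE
# Extracts a backup archive into a scratch volume, prints how many
# entries it contains, then removes the scratch volume again.
test_restore() {
  archive=$1
  [ -s "$archive" ] || { echo "no archive at $archive" >&2; return 1; }
  dir=$(cd "$(dirname "$archive")" && pwd)   # absolute path for the bind mount
  file=$(basename "$archive")
  scratch="restore-test-$$"
  docker volume create "$scratch" >/dev/null || return 1
  docker run --rm -v "$scratch":/data -v "$dir":/backup:ro alpine \
    sh -c "cd /data && tar xzf '/backup/$file' && find . | wc -l"
  status=$?
  docker volume rm "$scratch" >/dev/null
  return "$status"
}

# Example:
# test_restore ./pgdata-backup.tar.gz
```

An entry count is a weak check; for a database, a stronger rehearsal would start a container on the scratch volume and run a query against it.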
When to use / when not to use
When to use volumes:
- Databases (Postgres, MySQL)
- Object uploads, user files
- Shared state that must survive container recreation
When not to use volumes:
- Ephemeral caches that can be regenerated at boot (use tmpfs when speed matters)
- Immutable configuration baked into the image
Key takeaways
- Named volumes persist data across container recreation and are preferred for production.
- Bind mounts are convenient for local dev but are less portable and can introduce permission issues.
- Always plan for backups and test restore procedures.
- Be careful with anonymous volumes and tmpfs: both can silently lose data.