Containers Are Ephemeral: Volumes and Data
The Problem: Data Loss
Run a PostgreSQL container:
docker run -d --name my-db -e POSTGRES_PASSWORD=secret postgres
Connect to it and create some data:
docker exec -it my-db psql -U postgres
Inside psql:
CREATE DATABASE myapp;
\c myapp
CREATE TABLE users (id SERIAL, name TEXT);
INSERT INTO users (name) VALUES ('Alice'), ('Bob');
SELECT * FROM users;
Output:
id | name
----+-------
1 | Alice
2 | Bob
Perfect. Exit psql:
\q
Now remove the container:
docker rm -f my-db
Start PostgreSQL again:
docker run -d --name my-db -e POSTGRES_PASSWORD=secret postgres
Check your data:
docker exec -it my-db psql -U postgres -c "SELECT * FROM users;"
Output:
ERROR: relation "users" does not exist
Your data is gone.
Why this happens: Each container has its own filesystem. When you remove a container, that filesystem disappears.
This is by design. Containers are meant to be disposable.
Mental Note: Container filesystems are temporary. Anything written inside a container dies with the container.
Understanding Container Storage
When you run a container, Docker creates a read-write layer on top of the image.
The container layer is temporary. When the container is removed, this layer is deleted.
This creates a problem: Databases, uploaded files, logs - all need to persist beyond container lifetime.
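You can demonstrate the problem in miniature without a database: write a file into a container's layer, remove the container, and check a fresh one.

```shell
# Write a file into the container's read-write layer
docker run --name scratch-test alpine sh -c "echo hello > /data.txt"

# Remove the container; its layer (and the file) is deleted with it
docker rm scratch-test

# A fresh container from the same image has no trace of the file
docker run --rm alpine ls /data.txt   # fails: No such file or directory
```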
Solution 1: Docker Volumes
Volumes are Docker's answer to persistent storage.
Creating a Volume
docker volume create postgres-data
List volumes:
docker volume ls
Output:
DRIVER VOLUME NAME
local postgres-data
Inspect the volume:
docker volume inspect postgres-data
Output:
[
{
"CreatedAt": "2024-01-15T10:45:23Z",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/postgres-data/_data",
"Name": "postgres-data",
"Options": null,
"Scope": "local"
}
]
The volume's data lives at /var/lib/docker/volumes/postgres-data/_data, and Docker manages it. (On Docker Desktop for Mac and Windows, that path is inside the Docker VM rather than directly on your host filesystem.)
Using a Volume
Run PostgreSQL with the volume:
docker run -d --name my-db \
-e POSTGRES_PASSWORD=secret \
-v postgres-data:/var/lib/postgresql/data \
postgres
What -v postgres-data:/var/lib/postgresql/data means:
Format: -v VOLUME_NAME:CONTAINER_PATH
PostgreSQL stores its data at /var/lib/postgresql/data. By mounting a volume there, data goes to the volume instead of the container's temporary filesystem.
Create some data:
docker exec -it my-db psql -U postgres
CREATE DATABASE myapp;
\c myapp
CREATE TABLE users (id SERIAL, name TEXT);
INSERT INTO users (name) VALUES ('Alice'), ('Bob');
\q
Remove the container:
docker rm -f my-db
Start a new container with the same volume:
docker run -d --name my-db-new \
-e POSTGRES_PASSWORD=secret \
-v postgres-data:/var/lib/postgresql/data \
postgres
Check the data:
docker exec -it my-db-new psql -U postgres -c "\c myapp" -c "SELECT * FROM users;"
Output:
You are now connected to database "myapp" as user "postgres".
id | name
----+-------
1 | Alice
2 | Bob
Your data persisted!
Mental Note: Volumes live outside containers. Multiple containers can use the same volume. Data survives container removal.
Solution 2: Bind Mounts
Bind mounts link a directory on your host directly into a container.
Create a directory:
mkdir -p ~/my-postgres-data
Run PostgreSQL with a bind mount:
docker run -d --name my-db-bind \
-e POSTGRES_PASSWORD=secret \
-v ~/my-postgres-data:/var/lib/postgresql/data \
postgres
Same format, but you're using a host path instead of a volume name.
Check what's in the directory:
ls ~/my-postgres-data
Output:
base global pg_commit_ts pg_dynshmem pg_hba.conf pg_ident.conf
pg_logical pg_multixact pg_notify pg_replslot pg_serial
pg_snapshots pg_stat pg_stat_tmp pg_subtrans pg_tblspc
pg_twophase PG_VERSION pg_wal pg_xact postgresql.auto.conf postgresql.conf
PostgreSQL's data files are directly on your host filesystem.
Mental Note: Bind mounts give you direct access to files. Volumes are managed by Docker. Both persist data, but bind mounts let you see and edit files easily.
Volumes vs Bind Mounts
| Aspect | Volumes | Bind Mounts |
|---|---|---|
| Managed by | Docker | You |
| Location | Docker's storage area | Anywhere on host |
| Portability | Works across systems | Requires same path |
| Performance | Optimized | Depends on host FS |
| Use case | Production databases | Development, configs |
When to use volumes:
- Production databases
- Data that should be managed by Docker
- Sharing data between containers
- Backup and migration
When to use bind mounts:
- Development (live code reloading)
- Configuration files you need to edit
- Logs you want to inspect
- Anything you need direct access to
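Both kinds of mount can also be written with the longer --mount flag, which takes explicit key=value pairs and reads better in scripts (my-app here is a placeholder image). One behavioral difference: with type=bind, --mount errors if the host path doesn't exist, while -v silently creates it.

```shell
# Named volume, equivalent to -v postgres-data:/var/lib/postgresql/data
docker run -d --name db-mount \
  -e POSTGRES_PASSWORD=secret \
  --mount type=volume,source=postgres-data,target=/var/lib/postgresql/data \
  postgres

# Bind mount, equivalent to -v "$(pwd)"/config:/etc/config:ro
docker run -d --name app-mount \
  --mount type=bind,source="$(pwd)"/config,target=/etc/config,readonly \
  my-app
```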
Real Example: Development Workflow
You're building a Node.js app. During development, you want code changes to reflect immediately.
Project structure:
my-app/
├── package.json
├── server.js
└── Dockerfile
server.js:
const express = require('express');
const app = express();
app.get('/', (req, res) => {
res.send('Hello World v1');
});
app.listen(3000, () => console.log('Server running'));
Dockerfile:
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
CMD ["node", "server.js"]
Build the image:
docker build -t my-app .
Run with bind mount:
docker run -d --name dev-server \
-p 3000:3000 \
-v $(pwd):/app \
-v /app/node_modules \
my-app
What's happening here:
- -v $(pwd):/app: Mount current directory to /app in the container
- -v /app/node_modules: Anonymous volume to prevent overwriting node_modules
Visit http://localhost:3000 - you see "Hello World v1"
Edit server.js on your host:
app.get('/', (req, res) => {
res.send('Hello World v2');
});
Restart the container:
docker restart dev-server
Refresh your browser - you see "Hello World v2"
No rebuild needed. Your code changes are live.
Mental Note: Bind mounts during development save time. You edit code on your host, container uses those files directly.
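If even the restart feels like friction, you can have the process watch for file changes itself. A sketch, assuming the image runs Node 18.11 or newer (where the experimental --watch flag is available):

```shell
# Override the default command with Node's built-in file watcher.
# --watch restarts the process whenever watched files change, so
# neither a rebuild nor a container restart is needed.
docker run -d --name dev-watch \
  -p 3000:3000 \
  -v "$(pwd)":/app \
  -v /app/node_modules \
  my-app \
  node --watch server.js
```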
Anonymous Volumes
Sometimes Docker creates volumes automatically.
Look at the PostgreSQL Dockerfile (simplified):
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y postgresql
VOLUME /var/lib/postgresql/data
CMD ["postgres"]
The VOLUME instruction tells Docker: "This path needs persistence."
Run PostgreSQL without specifying a volume:
docker run -d --name auto-vol postgres
Check volumes:
docker volume ls
Output:
DRIVER VOLUME NAME
local 1a2b3c4d5e6f7g8h9i0j1k2l3m4n5o6p7q8r9s0t1u2v3w4x5y6z7a8b9c0d1e2f3
Docker created an anonymous volume automatically.
Inspect the container:
docker inspect auto-vol --format='{{.Mounts}}'
Output:
[{volume 1a2b3c4d5e6f7g8h9i0j1k2l3m4n5o6p7q8r9s0t1u2v3w4x5y6z7a8b9c0d1e2f3 /var/lib/docker/volumes/1a2b3c4d5e6f7g8h9i0j1k2l3m4n5o6p7q8r9s0t1u2v3w4x5y6z7a8b9c0d1e2f3/_data /var/lib/postgresql/data local true }]
Your data is stored, but the volume name is a random hash. Hard to manage.
Remove the container:
docker rm -f auto-vol
The anonymous volume still exists:
docker volume ls
It's orphaned. This accumulates over time.
Mental Note: Always use named volumes in production. Anonymous volumes are hard to track and clean up.
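Two habits keep anonymous volumes from piling up: --rm removes a container's anonymous volumes when the container exits, and docker rm -v removes them along with the container. Named volumes are never touched by either.

```shell
# --rm deletes the container AND its anonymous volumes on exit
docker run --rm -v /tmp/scratch alpine true

# For containers started without --rm, pass -v at removal time
docker run -d --name throwaway -v /tmp/scratch alpine sleep 300
docker rm -f -v throwaway
```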
Volume Management
Listing Volumes
docker volume ls
Inspecting a Volume
docker volume inspect postgres-data
Removing a Volume
docker volume rm postgres-data
You can't remove a volume that's in use:
docker volume rm postgres-data
Output:
Error response from daemon: remove postgres-data: volume is in use - [3b4c5d6e7f8a]
Stop and remove the container first.
Cleaning Up Unused Volumes
docker volume prune
Output:
WARNING! This will remove all local volumes not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Volumes:
1a2b3c4d5e6f7g8h9i0j1k2l3m4n5o6p7q8r9s0t1u2v3w4x5y6z7a8b9c0d1e2f3
8a9b0c1d2e3f4g5h6i7j8k9l0m1n2o3p4q5r6s7t8u9v0w1x2y3z4a5b6c7d8e9f0
Total reclaimed space: 2.3GB
This removes all volumes not attached to any container.
Sharing Volumes Between Containers
Multiple containers can use the same volume.
Create a volume:
docker volume create shared-data
Run a container that writes data:
docker run --rm -v shared-data:/data alpine sh -c "echo 'Hello from container 1' > /data/message.txt"
Run another container that reads it:
docker run --rm -v shared-data:/data alpine cat /data/message.txt
Output:
Hello from container 1
Use case: Multiple services reading the same configuration or sharing files.
Example: Nginx serving files from an app container
# App container writes static files
docker run -d --name app -v shared-data:/app/public my-app
# Nginx serves those files
docker run -d --name web -p 8080:80 -v shared-data:/usr/share/nginx/html nginx
Both containers share the same volume at different mount points.
Mental Note: Volumes are not tied to a single container. They're independent resources that containers can attach to.
Volume Drivers
Volumes can use different storage backends.
Default is local (host filesystem). But you can use:
- NFS (network storage)
- Cloud storage (AWS EBS, Azure Disk)
- Distributed storage (Ceph, GlusterFS)
Create a volume with a specific driver:
docker volume create --driver local \
--opt type=nfs \
--opt o=addr=192.168.1.100,rw \
--opt device=:/path/to/nfs \
my-nfs-volume
This creates a volume backed by NFS. Useful for sharing data across multiple Docker hosts.
Most of the time, you'll use local volumes. Other drivers are for specific infrastructure needs.
Backup and Restore
Backing Up a Volume
Create a backup:
docker run --rm \
-v postgres-data:/source \
-v $(pwd):/backup \
alpine tar czf /backup/postgres-backup.tar.gz -C /source .
What's happening:
- Mount the volume at /source
- Mount current directory at /backup
- Run tar to compress /source contents
- Save to /backup/postgres-backup.tar.gz (your host)
You'll have postgres-backup.tar.gz on your host.
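One caution for database volumes: archiving the files while PostgreSQL is writing can capture an inconsistent state. A safer sketch stops the container around the backup (for zero-downtime backups, pg_dump is the usual tool instead):

```shell
# Stop the database so its files are consistent on disk
docker stop my-db

# Archive the volume (mounted read-only) to a dated file on the host
docker run --rm \
  -v postgres-data:/source:ro \
  -v "$(pwd)":/backup \
  alpine tar czf "/backup/postgres-backup-$(date +%F).tar.gz" -C /source .

docker start my-db
```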
Restoring a Volume
Create a new volume:
docker volume create postgres-restored
Restore the backup:
docker run --rm \
-v postgres-restored:/target \
-v $(pwd):/backup \
alpine sh -c "cd /target && tar xzf /backup/postgres-backup.tar.gz"
What's happening:
- Mount new volume at /target
- Mount backup directory at /backup
- Extract tar file into /target
Use the restored volume:
docker run -d --name restored-db \
-e POSTGRES_PASSWORD=secret \
-v postgres-restored:/var/lib/postgresql/data \
postgres
Your data is restored.
Mental Note: Volumes can be backed up and restored using simple tar commands. Critical for database migrations and disaster recovery.
Read-Only Volumes
Sometimes you want to prevent a container from modifying a volume.
docker run -d --name readonly-app \
-v my-config:/etc/config:ro \
my-app
The :ro flag makes the volume read-only inside the container.
Try to write:
docker exec readonly-app sh -c "echo 'test' > /etc/config/test.txt"
Output:
sh: can't create /etc/config/test.txt: Read-only file system
Use case: Configuration files you don't want the application to accidentally modify.
tmpfs Mounts: Temporary Storage
Sometimes you need temporary storage that's fast and doesn't persist.
docker run -d --name fast-cache \
--tmpfs /cache:rw,size=128m \
my-app
What this does:
- Mounts /cache in memory (RAM)
- Limited to 128MB
- Data is lost when container stops
Use case:
- Caching that doesn't need persistence
- Temporary processing files
- Security-sensitive data you don't want written to disk
Mental Note: tmpfs is in-memory storage. Fast but volatile. Use for temporary data only.
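You can verify a tmpfs mount without building an image; alpine is enough (note that tmpfs mounts only work on Linux hosts):

```shell
# df should report the mount as tmpfs, capped at the requested size
docker run --rm --tmpfs /cache:rw,size=64m alpine df -h /cache
```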
Practical Patterns
Pattern 1: Database Persistence
docker run -d --name db \
-v db-data:/var/lib/postgresql/data \
-e POSTGRES_PASSWORD=secret \
postgres
Always use named volumes for databases.
Pattern 2: Configuration Injection
docker run -d --name app \
-v $(pwd)/config.yml:/app/config.yml:ro \
my-app
Bind mount configuration as read-only.
Pattern 3: Log Collection
docker run -d --name app \
-v $(pwd)/logs:/var/log/app \
my-app
Bind mount logs to host for easy access.
Pattern 4: Development Environment
docker run -d --name dev \
-v $(pwd):/app \
-v /app/node_modules \
-p 3000:3000 \
my-dev-image
Bind mount code, but use anonymous volume for dependencies.
Common Mistakes
Mistake 1: Not Using Volumes for Databases
# Bad: Data will be lost
docker run -d --name db postgres
# Good: Data persists
docker run -d --name db -v db-data:/var/lib/postgresql/data postgres
Mistake 2: Using Absolute Paths in Bind Mounts
# Bad: Only works on your machine
docker run -v /Users/john/code:/app my-app
# Good: Works anywhere
docker run -v $(pwd):/app my-app
Mistake 3: Forgetting to Clean Up Volumes
Orphaned volumes accumulate. Run periodically:
docker volume prune
Mistake 4: Overwriting Dependencies
# Bad: Overwrites node_modules in container
docker run -v $(pwd):/app my-app
# Good: Preserves node_modules
docker run -v $(pwd):/app -v /app/node_modules my-app
Debugging Volume Issues
Check What's Mounted
docker inspect my-container --format='{{json .Mounts}}' | jq
Output:
[
{
"Type": "volume",
"Name": "postgres-data",
"Source": "/var/lib/docker/volumes/postgres-data/_data",
"Destination": "/var/lib/postgresql/data",
"Driver": "local",
"Mode": "z",
"RW": true,
"Propagation": ""
}
]
Shows all mounts, their types, and paths.
Check Volume Contents
docker run --rm -v postgres-data:/data alpine ls -lah /data
Lists files in the volume.
Check Permissions
docker run --rm -v postgres-data:/data alpine ls -ld /data
Shows directory permissions.
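Permission errors usually mean a UID mismatch: the process in the container runs as a user that doesn't own the volume's files. You can inspect and fix ownership from a throwaway container (UID 999 below is the postgres user in the official image; verify the right UID for your image):

```shell
# See which numeric UID the containerized process runs as
docker exec my-db id

# Fix ownership of the volume contents from a helper container
docker run --rm -v postgres-data:/data alpine chown -R 999:999 /data
```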
Volume Performance Considerations
Volumes: Optimized by Docker. Generally fast.
Bind mounts on Mac/Windows: Slower due to filesystem translation between host and Docker VM. Can be significantly slower for many small files.
Bind mounts on Linux: Native performance. No overhead.
Workaround for Mac/Windows: Use volumes for data, bind mounts only for code you're actively editing.
# Bind mount only the source code (smaller, fewer files);
# keep dependencies in a named volume (faster on Mac/Windows)
docker run -d \
  -v $(pwd)/src:/app/src \
  -v node_modules:/app/node_modules \
  my-app
Mental Notes
Containers are disposable: Anything inside the container filesystem is temporary.
Volumes outlive containers: They're separate resources. Create them, name them, manage them.
Bind mounts are for development: Direct access to files. Volumes are for production.
Always name your volumes: Anonymous volumes are hard to track and clean up.
Multiple containers, one volume: Volumes aren't locked to a single container. Share data across services.
Read-only when possible: Prevents accidental modifications. Use :ro flag.
Backup your volumes: They're just directories. Use tar or your backup tool of choice.
Clean up regularly: docker volume prune removes unused volumes. Do it periodically.
The core insight: Containers are ephemeral by design. They should be disposable. Volumes are how you preserve state across container lifetimes. Master volumes, master Docker data management.