Docker Compose File Guide for AI Agents

Updated May 2026
A Docker Compose file is the single configuration document that defines your entire AI agent infrastructure. It specifies which services run, how they connect, where they store data, and what resources they consume. Mastering the Compose file format lets you define complex multi-service agent stacks in one readable YAML document that anyone on your team can understand and modify.

Docker Compose uses a YAML file (compose.yaml or docker-compose.yml) to describe your application stack. Each section of the file controls a different aspect of your deployment, from the container images to the network topology. These steps walk through every major section of a production Compose file for AI agent workloads, building up from basic service definitions to advanced configuration patterns.

Define Services and Their Images

The services section is the heart of your Compose file. Each key under services becomes a named container that Docker creates and manages. For a typical AI agent stack, you define services for the agent runtime, the model server (Ollama, vLLM, or a similar tool), the database (PostgreSQL, Redis, or both), and optionally a vector database like Qdrant or ChromaDB for retrieval-augmented generation.

Each service specifies either an image (for pre-built containers like postgres:16 or ollama/ollama) or a build context (for your custom agent code). The build section points to a directory containing your Dockerfile. You can also name a particular Dockerfile if you keep separate Dockerfiles for development and production builds.

Service names become DNS hostnames on the Compose network. Choose clear, descriptive names like agent, ollama, postgres, and qdrant rather than abbreviations. These names appear in your environment variable URLs (http://ollama:11434), your logs, and your monitoring dashboards, so readability matters.
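As a minimal sketch of the service definitions described above, the fragment below mixes a custom build with pre-built images. The ./agent directory, the Dockerfile.prod file name, and the qdrant/qdrant image are assumptions for illustration, and the postgres service would still need a password setting before it starts.

```yaml
services:
  agent:
    build:
      context: ./agent              # hypothetical directory holding the agent code
      dockerfile: Dockerfile.prod   # hypothetical production Dockerfile name
    environment:
      # The service name "ollama" resolves as a hostname on the Compose network.
      MODEL_ENDPOINT: http://ollama:11434

  ollama:
    image: ollama/ollama

  postgres:
    image: postgres:16

  qdrant:
    image: qdrant/qdrant            # assumed image name for the vector database
```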

Use the depends_on directive to declare startup ordering between services. Your agent service should depend on its database and model server so that Compose starts those services first. Combine depends_on with health check conditions (service_healthy) rather than simple container start conditions (service_started) so that your agent waits until its dependencies are actually ready to accept connections, not just running.
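The difference between the two conditions might look like the sketch below. The ./agent build path is an assumption, and service_healthy only works if the dependency defines a health check, so one is included for postgres.

```yaml
services:
  agent:
    build: ./agent                  # hypothetical build context
    depends_on:
      postgres:
        condition: service_healthy  # wait until the health check passes
      ollama:
        condition: service_started  # wait only until the container is running
    # The short form "depends_on: [postgres, ollama]" is equivalent to
    # service_started for both dependencies.

  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  ollama:
    image: ollama/ollama
```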

Configure Networks and Service Dependencies

Compose automatically creates a default bridge network for your stack, and all services join it. For production deployments, define custom networks to isolate traffic between service tiers. A common pattern uses a frontend network connecting your API gateway to the agent runtime and a backend network connecting the agent to databases and model servers. Services join only the networks they need.
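The frontend and backend split described above could be sketched as follows. The gateway service and its nginx image are assumptions standing in for whatever API gateway you use.

```yaml
services:
  gateway:
    image: nginx:1.27        # hypothetical API gateway
    networks:
      - frontend

  agent:
    build: ./agent           # hypothetical build context
    networks:
      - frontend             # reachable from the gateway
      - backend              # can reach the database and model server

  postgres:
    image: postgres:16
    networks:
      - backend              # not reachable from the gateway

  ollama:
    image: ollama/ollama
    networks:
      - backend

networks:
  frontend:
  backend:
```

In this layout the gateway cannot resolve or reach postgres or ollama directly, because it shares no network with them.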

Network aliases let you give a service multiple DNS names on different networks. This is useful when you want the same database service to be reachable as db on the backend network and as analytics-db on a monitoring network. Each alias resolves to the same container IP on that specific network.
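A sketch of per-network aliases, assuming a second network named monitoring exists in your stack:

```yaml
services:
  postgres:
    image: postgres:16
    networks:
      backend:
        aliases:
          - db               # resolvable as "db" on the backend network
      monitoring:
        aliases:
          - analytics-db     # resolvable as "analytics-db" on the monitoring network

networks:
  backend:
  monitoring:
```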

The depends_on directive with condition: service_healthy ensures your agent does not start until its dependencies pass their health checks. Without this condition, Compose only waits for the container to start, which does not guarantee the service inside is ready. A PostgreSQL container takes several seconds to initialize its database after the container process starts, and an Ollama container may need time to load a model into GPU memory.

For complex stacks with many interdependent services, map out your dependency graph before writing the Compose file. Circular dependencies (where service A depends on B and B depends on A) will prevent Compose from starting your stack. Break circular dependencies by removing one direction and having that service handle reconnection gracefully at startup.

Set Up Volumes for Persistent Data

The volumes section at the top level of your Compose file declares named volumes that Docker manages. Each volume gets a name, and you reference it in service definitions using the volumes key under each service. The syntax maps a named volume to a path inside the container, like postgres_data:/var/lib/postgresql/data for PostgreSQL.

Create separate volumes for each service and data type. A typical AI agent stack uses volumes for the database (postgres_data), model weights (ollama_models), vector database indices (qdrant_data), and agent state or logs (agent_data). Separating volumes by function makes backup, migration, and debugging simpler because you can operate on each data store independently.
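One way to lay out these volumes is sketched below. The container paths for Ollama and Qdrant follow the defaults those images document, while /app/data for the agent is a hypothetical path that depends on your own code.

```yaml
services:
  postgres:
    image: postgres:16
    volumes:
      - postgres_data:/var/lib/postgresql/data

  ollama:
    image: ollama/ollama
    volumes:
      - ollama_models:/root/.ollama      # downloaded model weights

  qdrant:
    image: qdrant/qdrant
    volumes:
      - qdrant_data:/qdrant/storage      # vector indices

  agent:
    build: ./agent                       # hypothetical build context
    volumes:
      - agent_data:/app/data             # hypothetical state and log directory

# Top-level declarations; Docker creates and manages each named volume.
volumes:
  postgres_data:
  ollama_models:
  qdrant_data:
  agent_data:
```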

For development, bind mounts (mapping a host directory into the container) let you edit code on your host machine and see changes reflected in the container without rebuilding. Use the short syntax with a relative path like ./src:/app/src. In production, always use named volumes instead of bind mounts because named volumes are portable and do not depend on specific host directory structures.
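For a development setup, the bind mount might be written as in this sketch. The ./agent build context and the /app/src target are assumptions about how the agent image is laid out.

```yaml
services:
  agent:
    build: ./agent           # hypothetical build context
    volumes:
      # Short syntax: host path (relative to the Compose file) : container path.
      # Edits under ./src on the host appear immediately inside the container.
      - ./src:/app/src
```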

Volume driver options let you configure storage backends beyond the local filesystem. The built-in local driver can mount NFS shares through driver options, while backends such as AWS EBS or GCP Persistent Disk require a third-party volume plugin installed on the host. Depending on the backend, these can provide replicated, snapshot-capable storage that named volumes on local disk cannot offer. Specify the driver and driver options in the volume definition.
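As an illustration, an NFS-backed volume can be declared with the built-in local driver. The server address and export path below are placeholders, and the mount options would need to match your NFS server.

```yaml
volumes:
  ollama_models:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=10.0.0.10,rw,nfsvers=4"     # placeholder NFS server address and options
      device: ":/exports/ollama_models"    # placeholder export path on the server
```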

Add Environment Variables and Resource Limits

The environment key under a service sets variables directly in the Compose file. Use this for non-sensitive configuration like MODEL_ENDPOINT, LOG_LEVEL, and AGENT_MODE. For sensitive values like API keys and database passwords, use the env_file key to reference a .env file that you exclude from version control through .gitignore.
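A sketch of the split between inline configuration and a secrets file follows. The variable values are examples only, and the .env file is assumed to sit next to the Compose file and to be listed in .gitignore.

```yaml
services:
  agent:
    build: ./agent                          # hypothetical build context
    environment:
      # Non-sensitive settings, safe to commit.
      MODEL_ENDPOINT: http://ollama:11434
      LOG_LEVEL: info
      AGENT_MODE: production
    env_file:
      # Sensitive values (API keys, database passwords) as KEY=value lines.
      - .env
```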

Resource limits prevent any single service from consuming all available CPU or memory on the host. The deploy.resources section lets you set memory and CPU limits per service. Docker Compose V2 (the docker compose plugin) applies these limits to ordinary containers; the legacy docker-compose v1 tool treated deploy as Swarm-only for version 3 files and ignored it unless run with --compatibility. For AI agents, the model server typically needs the most resources, so allocate the largest share of memory and all GPU resources to that service while limiting the agent runtime and database to what they actually need.
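The limits might be expressed as in the sketch below. The specific numbers are illustrative assumptions and should come from measuring your own workload.

```yaml
services:
  ollama:
    image: ollama/ollama
    deploy:
      resources:
        limits:
          cpus: "4.0"        # at most four CPU cores
          memory: 8G         # largest share goes to the model server

  agent:
    build: ./agent           # hypothetical build context
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 1G

  postgres:
    image: postgres:16
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 2G
```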

GPU allocation uses the deploy.resources.reservations.devices section. Specify the driver as nvidia, set count to the number of GPUs (or all for every available GPU), and list the capabilities as gpu. This configuration requires the NVIDIA Container Toolkit to be installed on the host.
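A minimal sketch of the GPU reservation for the model server, assuming the NVIDIA Container Toolkit is already installed on the host:

```yaml
services:
  ollama:
    image: ollama/ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all             # or an integer such as 1
              capabilities: [gpu]
```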

When setting memory limits, leave headroom above typical usage. Container memory limits apply to system RAM only; GPU memory (VRAM) is not counted against them. A model server that normally uses 8 GB of VRAM and 4 GB of system RAM should therefore have its limit sized against the 4 GB figure, for example 6 to 8 GB, with more if loading model weights temporarily raises host memory use. When a container exceeds its memory limit, the kernel's OOM (out-of-memory) killer terminates its processes, which causes data loss if the service has unsaved state.

Configure Health Checks and Restart Policies

Health checks in Compose mirror the HEALTHCHECK instruction in Dockerfiles but can be overridden or added at the Compose level. Define a test command, interval, timeout, retries, and start_period for each service. For PostgreSQL, the test command is pg_isready. For an HTTP-based agent, run curl against the agent's health endpoint, which requires curl to be installed in the image. For Ollama, check that the API responds on port 11434.
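The health checks above could be sketched as follows. The agent's port 8000 and /health path are hypothetical, and the Ollama check uses the ollama list CLI command as one assumed way to confirm that the API is answering, since it queries the local server.

```yaml
services:
  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s

  agent:
    build: ./agent           # hypothetical build context; image must include curl
    healthcheck:
      # -f makes curl exit non-zero on HTTP error responses.
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 15s
      timeout: 5s
      retries: 3
      start_period: 20s

  ollama:
    image: ollama/ollama
    healthcheck:
      # Succeeds only if the Ollama server on port 11434 responds.
      test: ["CMD", "ollama", "list"]
      interval: 15s
      timeout: 5s
      retries: 5
      start_period: 30s
```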

The start_period is critical for AI services that have long initialization times. Model servers may need 30 to 120 seconds to load a large model into GPU memory before they can respond to health checks. Failed probes during start_period do not count toward the retry limit, so set it to cover this initialization window. Otherwise Docker marks the service as unhealthy during normal startup, and any service waiting on it with condition: service_healthy fails to start.
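For a slow-starting model server, the timing fields might be set as in this sketch. The 120s value is an assumption taken from the upper end of the range above, and the ollama list test command is one assumed way to probe the server.

```yaml
services:
  ollama:
    image: ollama/ollama
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 15s          # time between probes
      timeout: 5s            # a probe taking longer than this counts as failed
      retries: 5             # consecutive failures before "unhealthy"
      start_period: 120s     # failures in this window do not count toward retries
```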

Restart policies determine what happens when a container exits. For production agent stacks, use restart: unless-stopped for all services. This restarts containers automatically after crashes and brings them back after a host reboot or Docker daemon restart, unless you had stopped them manually with docker compose stop. The alternative, restart: always, behaves the same while the daemon is running but also restarts manually stopped containers the next time the daemon starts, which is usually not what you want during maintenance.
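Applied to a stack, the policy is a single key per service, as in this sketch (the ./agent build context is an assumption):

```yaml
services:
  agent:
    build: ./agent           # hypothetical build context
    restart: unless-stopped

  ollama:
    image: ollama/ollama
    restart: unless-stopped

  postgres:
    image: postgres:16
    restart: unless-stopped
```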

Combine health checks with depends_on conditions to create a robust startup sequence. Your agent service should specify depends_on with condition: service_healthy for its database and model server. Compose then starts the database and the model server (in parallel, unless one depends on the other), waits for both health checks to pass, and only then starts the agent. This ordering avoids startup race conditions where the agent tries to connect to services that are not yet ready. The agent should still handle reconnection in case a dependency fails later.
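Putting the pieces together, a startup sequence along these lines is one possible arrangement. The timing values, the ./agent build context, and the ollama list probe are assumptions.

```yaml
services:
  agent:
    build: ./agent                   # hypothetical build context
    restart: unless-stopped
    depends_on:
      postgres:
        condition: service_healthy   # agent waits for the database...
      ollama:
        condition: service_healthy   # ...and for the model server

  postgres:
    image: postgres:16
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s

  ollama:
    image: ollama/ollama
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 15s
      timeout: 5s
      retries: 5
      start_period: 120s
```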

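Test your health check and restart configuration in two ways. First, stop a service with docker compose stop and verify that it stays stopped under unless-stopped, then bring it back with docker compose start. Second, simulate a crash by killing the service's main process directly (for example, with kill on the host against the container's process ID) and confirm that the Docker Engine restarts the container. Note that on a standalone Docker Engine a failing health check only marks the container as unhealthy; it does not trigger a restart on its own.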

Key Takeaway

A production Compose file defines services with pinned image versions, custom networks for traffic isolation, named volumes for data persistence, environment variables with .env files for secrets, resource limits for stability, health checks for dependency ordering, and restart policies for automatic recovery.