Prompt Composition: Building Prompts from Parts
The Problem with Monolithic Prompts
Most agent systems start with a single prompt string. It begins small: a few sentences describing the agent's role and a paragraph of instructions. Over time, it grows. New instructions are added for edge cases the agent mishandled. Tool descriptions are expanded with usage examples. Output format constraints are tightened after the agent produced malformed responses. Context injection points are added for dynamic data. Special-case handling is added for specific customers or scenarios.
After a few months of iteration, the prompt is thousands of tokens long. No single person understands all of it. Changes in one section have unexpected effects on behavior governed by another section. Testing involves running the entire agent and hoping the change did not break anything unrelated. The prompt becomes the kind of tangled, untestable artifact that software engineering spent decades learning to avoid.
The problems compound in multi-agent systems. If ten agents share some instructions but differ in others, the shared instructions are typically copied into each agent's prompt. When a shared instruction needs updating, the change must be applied to all ten copies. Inevitably, some copies are missed or updated incorrectly, and agents that should behave identically in that respect begin to diverge.
Prompt composition addresses these problems by decomposing the monolithic prompt into discrete components, each with a specific purpose, that are assembled into a complete prompt at runtime. The composition system manages how components combine, what order they appear in, and how conflicts between components are resolved.
Component Types
Agent prompts decompose naturally into several categories of components, each with distinct characteristics.
Identity components define who the agent is: its name, role, personality, and core behavioral guidelines. These components are stable, changing only when the agent's fundamental purpose changes. They form the foundation of the prompt and are typically placed at the beginning of the system message, where they tend to have the most influence on the model's behavior.
Instruction components define how the agent should work: step-by-step procedures for common tasks, decision criteria for ambiguous situations, quality standards for outputs, error handling protocols, and escalation rules. These components change more frequently than identity components as the team discovers new edge cases and refines the agent's procedures.
Tool components describe the tools available to the agent. Each tool gets its own component containing the tool name, description, parameter specifications, usage examples, error handling guidance, and notes about when to use this tool versus alternatives. Tool components are maintained alongside the tool implementation and updated whenever the tool's interface or behavior changes.
Context components provide dynamic information that varies with each task or session: the current user's profile, the conversation history, relevant documents or data, results from previous agent actions, and any other information the agent needs for the specific task at hand. Context components are generated at runtime rather than authored manually.
Constraint components define the boundaries of acceptable output: output format specifications, length limits, tone guidelines, content policies, and compliance requirements. These components act as guardrails that keep the agent from producing outputs that violate organizational standards or regulatory requirements.
Composition Strategies
How components are combined into a final prompt matters as much as how they are decomposed. Several composition strategies address different needs.
Layered composition stacks components in a defined order, with each layer building on the ones before it. The base layer establishes identity. The instruction layer adds procedural guidance. The tool layer describes capabilities. The context layer provides task-specific information. The constraint layer adds guardrails. This approach is simple, predictable, and easy to debug because the component order is fixed and each layer has a clear position in the prompt.
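A minimal sketch of layered composition, assuming each layer is a single string held in a dictionary. The function name compose_layered and the sample component texts are illustrative, not part of any particular library.

```python
# Layer names follow the order described above; the component texts are
# made-up examples, not taken from any real agent.
LAYER_ORDER = ["identity", "instructions", "tools", "context", "constraints"]


def compose_layered(components: dict[str, str]) -> str:
    """Join components in the fixed layer order, skipping absent layers."""
    unknown = set(components) - set(LAYER_ORDER)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    return "\n\n".join(
        components[name] for name in LAYER_ORDER if name in components
    )


# The dictionary order does not matter; LAYER_ORDER decides the result.
prompt = compose_layered({
    "constraints": "Reply in plain text.",
    "identity": "You are a support agent.",
    "tools": "Tool: lookup_order(order_id).",
})
print(prompt)
```

Because the order lives in one list, debugging a layout problem means inspecting a single constant rather than every call site.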
Template composition uses a template with named slots that are filled by components at runtime. The template defines the structure: "You are {identity}. Your task is {task_description}. You have access to the following tools: {tool_list}. Follow these guidelines: {instructions}. Your output must {constraints}." Each slot is populated by the appropriate component. Template composition provides more flexibility in positioning than layered composition and makes the overall prompt structure visible at a glance.
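The template quoted above can be filled with Python's built-in string formatting. This is a sketch under the assumption that every slot must be supplied; the helper names slot_names and fill_template and the sample values are hypothetical.

```python
from string import Formatter

TEMPLATE = (
    "You are {identity}. Your task is {task_description}. "
    "You have access to the following tools: {tool_list}. "
    "Follow these guidelines: {instructions}. Your output must {constraints}."
)


def slot_names(template: str) -> set[str]:
    """Return the named slots that appear in the template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def fill_template(template: str, components: dict[str, str]) -> str:
    """Fill every slot, failing loudly if a component is missing."""
    missing = slot_names(template) - set(components)
    if missing:
        raise KeyError(f"unfilled slots: {sorted(missing)}")
    return template.format(**components)


filled = fill_template(TEMPLATE, {
    "identity": "a billing assistant",
    "task_description": "answer billing questions",
    "tool_list": "lookup_invoice",
    "instructions": "be concise",
    "constraints": "be plain text",
})
print(filled)
```

Checking for missing slots before formatting turns a silent gap in the prompt into an explicit error at composition time.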
Conditional composition includes or excludes components based on runtime conditions. If the current task involves code, include the coding instructions component. If the user is a premium customer, include the premium support component. If the system is in maintenance mode, include the maintenance notice component. Conditional composition keeps prompts lean by only including information relevant to the current situation, which preserves context window space for the actual task.
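One way to sketch conditional composition is to attach a predicate to each component and evaluate it against a runtime context dictionary. The context keys (task_type, customer_tier, maintenance_mode) and the component texts below are assumptions chosen to mirror the three examples above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConditionalComponent:
    name: str
    text: str
    include_if: Callable[[dict], bool]  # predicate over the runtime context


COMPONENTS = [
    ConditionalComponent("base", "Answer the user's question.", lambda ctx: True),
    ConditionalComponent(
        "coding", "Follow the coding instructions.",
        lambda ctx: ctx.get("task_type") == "code",
    ),
    ConditionalComponent(
        "premium", "Offer premium support options.",
        lambda ctx: ctx.get("customer_tier") == "premium",
    ),
    ConditionalComponent(
        "maintenance", "Mention the maintenance window.",
        lambda ctx: bool(ctx.get("maintenance_mode", False)),
    ),
]


def select_components(ctx: dict) -> list[ConditionalComponent]:
    """Keep only the components whose condition holds, in declared order."""
    return [c for c in COMPONENTS if c.include_if(ctx)]


selected = select_components({"task_type": "code", "customer_tier": "free"})
print([c.name for c in selected])
```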
Priority-based composition assigns priorities to components and uses them to resolve conflicts and manage context window limits. When the assembled prompt exceeds the target size, lower-priority components are truncated or excluded first. This ensures that the most important instructions are always present even when context space is limited. Priority-based composition is especially valuable for agents that handle a wide variety of tasks, where the full set of instructions for all possible task types would exceed the context window.
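The following sketch drops whole components, lowest priority first, until the prompt fits a budget. It assumes a crude whitespace word count as a stand-in for a real tokenizer and treats higher numbers as more important; both are illustrative conventions, and truncating a component instead of excluding it is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prioritized:
    name: str
    text: str
    priority: int  # assumed convention: higher means more important


def estimate_size(text: str) -> int:
    """Rough size proxy (word count); a real system would use a tokenizer."""
    return len(text.split())


def fit_to_budget(components: list[Prioritized], budget: int) -> list[Prioritized]:
    """Exclude lowest-priority components until the total fits the budget."""
    kept = list(components)
    for victim in sorted(components, key=lambda c: c.priority):
        if sum(estimate_size(c.text) for c in kept) <= budget:
            break
        kept.remove(victim)
    return kept  # survivors stay in their original order


parts = [
    Prioritized("identity", "You are a support agent.", 100),
    Prioritized("style", "Prefer short friendly sentences where possible.", 10),
    Prioritized("refunds", "Refunds over fifty dollars need manager approval.", 50),
]
print([c.name for c in fit_to_budget(parts, budget=12)])
```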
Managing Component Interactions
Components do not exist in isolation. They interact, and those interactions can produce unexpected behavior if not managed carefully.
Instruction conflicts occur when two components give contradictory guidance. One component says "always include a confidence score with your answer" and another says "respond with just the answer, nothing else." The model resolves the conflict unpredictably, sometimes following one instruction and sometimes the other. Preventing instruction conflicts requires either careful component design that avoids overlapping concerns or an explicit conflict resolution strategy (later components override earlier ones, or specific component priorities determine which instruction wins).
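An explicit resolution strategy can be sketched by tagging each directive with the concern it governs and keeping one winner per concern. The concern labels and the Directive type are hypothetical; the rule implemented here is that higher priority wins, and on a tie the later directive overrides the earlier one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Directive:
    concern: str       # hypothetical label for what the directive governs
    text: str
    priority: int = 0  # higher wins; ties go to the later directive


def resolve(directives: list[Directive]) -> list[Directive]:
    """Keep exactly one directive per concern."""
    winners: dict[str, Directive] = {}
    for directive in directives:
        current = winners.get(directive.concern)
        if current is None or directive.priority >= current.priority:
            winners[directive.concern] = directive
    return list(winners.values())


resolved = resolve([
    Directive("answer_extras", "Always include a confidence score.", priority=1),
    Directive("tone", "Be concise."),
    Directive("answer_extras", "Respond with just the answer.", priority=5),
])
print([d.text for d in resolved])
```

Resolving the conflict before the prompt is assembled means the model only ever sees one instruction per concern, instead of being left to choose between two.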
Context dilution occurs when adding more components causes the model to pay less attention to each individual component. A prompt with three clear instructions produces more consistent behavior than a prompt with thirty instructions, even if all thirty are individually clear. Composition systems should monitor total prompt size and component count, trimming or consolidating components when the prompt becomes too large for reliable instruction following.
Ordering effects arise because models pay different amounts of attention to different positions in the prompt. Instructions at the beginning and end of a prompt generally receive more attention than instructions in the middle. Critical components should be placed in high-attention positions. Less critical components can occupy middle positions where some attention loss is acceptable. The composition system should control component ordering rather than leaving it to chance.
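A simple way for the composition system to own ordering is to let each component declare a coarse position and sort on it. The three position names below are an assumed convention for this sketch, not an established standard.

```python
from dataclasses import dataclass

# Assumed convention: critical material goes to "start" or "end".
POSITION_RANK = {"start": 0, "middle": 1, "end": 2}


@dataclass(frozen=True)
class Placed:
    name: str
    text: str
    position: str = "middle"


def order_components(components: list[Placed]) -> list[Placed]:
    """Sort by declared position.

    sorted() is stable, so components that share a position keep their
    declared relative order.
    """
    return sorted(components, key=lambda c: POSITION_RANK[c.position])


ordered = order_components([
    Placed("style_notes", "Prefer short sentences."),
    Placed("output_format", "Reply in JSON.", position="end"),
    Placed("identity", "You are a support agent.", position="start"),
    Placed("faq", "Common questions and answers."),
])
print([c.name for c in ordered])
```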
Testing component interactions requires integration testing that evaluates the assembled prompt as a whole, not just the individual components. A component that works perfectly in isolation might cause problems when combined with other components. The testing strategy should include a standard set of test cases that exercise the full assembled prompt, with assertions on both the content of the agent's responses and the behavioral properties (like which tool it chooses to call first).
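The sketch below shows the shape such an integration suite might take. The agent is a stand-in stub so the example runs on its own; in practice it would be replaced by a call into the real agent with the assembled prompt. All type and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AgentResult:
    text: str
    tool_calls: list[str]


@dataclass(frozen=True)
class Case:
    name: str
    user_input: str
    must_contain: str          # assertion on response content
    first_tool: Optional[str]  # assertion on behavior: first tool called


def run_cases(
    agent: Callable[[str, str], AgentResult], prompt: str, cases: list[Case]
) -> list[str]:
    """Run every case against the assembled prompt and collect failures."""
    failures = []
    for case in cases:
        result = agent(prompt, case.user_input)
        if case.must_contain not in result.text:
            failures.append(f"{case.name}: response lacks {case.must_contain!r}")
        first = result.tool_calls[0] if result.tool_calls else None
        if first != case.first_tool:
            failures.append(f"{case.name}: first tool was {first!r}")
    return failures


def stub_agent(prompt: str, user_input: str) -> AgentResult:
    """Stand-in for a real agent call, used only to make the sketch runnable."""
    if "order" in user_input.lower():
        return AgentResult("Order 42 has shipped.", ["lookup_order"])
    return AgentResult("Hello!", [])


cases = [
    Case("order status", "Where is my order?", "shipped", "lookup_order"),
    Case("greeting", "Hi there", "Hello", None),
]
failures = run_cases(stub_agent, "assembled prompt goes here", cases)
print(failures)
```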
Versioning and Deployment
Prompt components need the same version management discipline as software code. Each component should have a version identifier that changes when the component is modified. The assembled prompt should record which version of each component it includes, creating a complete manifest that can be reconstructed at any point in the future.
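One possible scheme, shown here as an assumption rather than a requirement, derives each version identifier from a hash of the component's content, so any modification changes the identifier automatically. The manifest is then a mapping from component name to version.

```python
import hashlib


def content_version(text: str) -> str:
    """Derive a short version identifier from the component's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


def build_manifest(components: dict[str, str]) -> dict[str, str]:
    """Record which version of each component went into the prompt."""
    return {name: content_version(text) for name, text in sorted(components.items())}


components = {
    "identity": "You are a support agent.",
    "constraints": "Reply in plain text.",
}
manifest = build_manifest(components)
print(manifest)
```

Storing the manifest alongside each agent run, together with an archive of component texts keyed by version, is what makes later reconstruction possible. Teams that prefer human-assigned version numbers can keep the same manifest shape and swap out content_version.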
Version control serves several purposes. It enables rollback to a known-good configuration when a change causes problems. It supports A/B testing by deploying different component versions to different agent pools and comparing performance. It enables audit by showing exactly which instructions were in effect when a specific agent decision was made. And it supports debugging by allowing you to reproduce the exact prompt that an agent used for a specific task.
Deployment of prompt changes can follow the same patterns used for code deployment: a staging environment where changes are tested before reaching production, a canary deployment where changes are applied to a small percentage of agents first, and a gradual rollout that increases the percentage over time as confidence in the change grows. These deployment patterns, combined with automated quality monitoring, provide confidence that prompt changes improve agent behavior rather than degrading it.
The version management system should also track component dependencies. If component A references concepts defined in component B, updating component B might require a corresponding update to component A. Dependency tracking ensures that related components are updated together and that incompatible version combinations are detected before they reach production.
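A dependency check can be sketched as a comparison between the versions each component was written against and the versions actually deployed. The integer major versions and the component names below are made up for illustration.

```python
# Component name -> deployed major version (hypothetical values).
deployed = {"identity": 2, "refund_policy": 3, "tool_lookup_order": 1}

# Component name -> {dependency: major version it was written against}.
requires = {"refund_policy": {"identity": 2, "tool_lookup_order": 2}}


def find_incompatibilities(
    deployed: dict[str, int], requires: dict[str, dict[str, int]]
) -> list[str]:
    """Report every dependency that is missing or at an unexpected version."""
    problems = []
    for component, deps in sorted(requires.items()):
        for dep, wanted in sorted(deps.items()):
            actual = deployed.get(dep)
            if actual is None:
                problems.append(f"{component} requires {dep}, which is not deployed")
            elif actual != wanted:
                problems.append(f"{component} expects {dep} v{wanted}, found v{actual}")
    return problems


problems = find_incompatibilities(deployed, requires)
print(problems)
```

Running a check like this in the deployment pipeline would block a release in which refund_policy still describes an interface that tool_lookup_order no longer has.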
Practical Implementation
The simplest viable implementation of prompt composition uses a directory of text files, one per component, with a configuration file that defines the composition order and any conditional inclusion rules. The composition engine reads the configuration, loads the specified components, applies any conditional logic, and concatenates the results into the final prompt. This can be implemented in a few dozen lines of code in any language.
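The following is one possible shape for that engine, not a reference implementation. It assumes a JSON configuration file named compose.json with a "components" list, where each entry names a file and may carry an optional "when" mapping of parameters that must all match. Those names are invented for this sketch. The demo writes a throwaway directory so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path


def compose_from_directory(root: Path, params: dict) -> str:
    """Load compose.json from root and concatenate the selected components."""
    config = json.loads((root / "compose.json").read_text(encoding="utf-8"))
    parts = []
    for entry in config["components"]:
        # "when" is an assumed config key: every listed parameter must match.
        conditions = entry.get("when", {})
        if all(params.get(key) == value for key, value in conditions.items()):
            text = (root / entry["file"]).read_text(encoding="utf-8")
            parts.append(text.strip())
    return "\n\n".join(parts)


# Demo: build a temporary component directory and compose from it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "identity.txt").write_text("You are a support agent.\n", encoding="utf-8")
    (root / "coding.txt").write_text("Format code in fenced blocks.\n", encoding="utf-8")
    config = {"components": [
        {"file": "identity.txt"},
        {"file": "coding.txt", "when": {"task_type": "code"}},
    ]}
    (root / "compose.json").write_text(json.dumps(config), encoding="utf-8")
    code_prompt = compose_from_directory(root, {"task_type": "code"})
    chat_prompt = compose_from_directory(root, {"task_type": "chat"})

print(code_prompt)
```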
More sophisticated implementations add features incrementally. Variable substitution lets components reference dynamic values. Include directives let components reference other components. Conditional blocks enable inline conditional logic within components. Validation rules check the assembled prompt against size limits, required sections, and format constraints. Each feature adds complexity but also adds capability that becomes important as the system grows.
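Two of these features, variable substitution and validation, can be sketched with the standard library alone. The example assumes components use the $name placeholder syntax of string.Template and that required sections can be detected by a marker string; the helper names and sample values are hypothetical.

```python
from string import Template


def render_component(text: str, variables: dict[str, str]) -> str:
    """Substitute $name placeholders; a missing variable raises KeyError."""
    return Template(text).substitute(variables)


def validate_prompt(
    prompt: str, max_chars: int, required_markers: list[str]
) -> list[str]:
    """Check the assembled prompt against a size limit and required sections."""
    problems = []
    if len(prompt) > max_chars:
        problems.append(f"prompt is {len(prompt)} chars, limit is {max_chars}")
    for marker in required_markers:
        if marker not in prompt:
            problems.append(f"missing required section: {marker!r}")
    return problems


rendered = render_component(
    "You are assisting $user_name on the $plan plan.",
    {"user_name": "Dana", "plan": "premium"},
)
problems = validate_prompt(rendered, max_chars=40, required_markers=["Output format:"])
print(rendered)
print(problems)
```

Using substitute rather than safe_substitute is a deliberate choice in this sketch: an unfilled placeholder stops composition instead of leaking a literal $name into the prompt.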
The key implementation principle is that the composition should be deterministic. Given the same set of component versions and the same runtime parameters, the composition engine should always produce the same prompt. This determinism is essential for reproducibility: if you need to investigate why an agent behaved a certain way, you need to be able to reconstruct the exact prompt it used.
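One way to make this property checkable, offered as a sketch, is to compute a fingerprint over the composition inputs and log it with every agent run. The function below assumes the runtime parameters are JSON-serializable and relies on a canonical JSON encoding so that dictionary ordering cannot change the result.

```python
import hashlib
import json


def composition_fingerprint(component_versions: dict[str, str], params: dict) -> str:
    """Hash the inputs that determine the prompt, independent of key order."""
    canonical = json.dumps(
        {"components": component_versions, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = composition_fingerprint(
    {"identity": "v3", "constraints": "v1"}, {"task_type": "code"}
)
second = composition_fingerprint(
    {"constraints": "v1", "identity": "v3"}, {"task_type": "code"}
)
print(first == second)
```

If two runs share a fingerprint but the engine produced different prompts, the engine depends on something outside its declared inputs, such as the current time or unordered iteration, and that hidden input should be made an explicit parameter.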
Prompt composition applies software engineering principles to prompt management. Decompose prompts into focused, reusable components. Assemble them at runtime using a deterministic composition strategy. Version each component independently and test the assembled whole. The investment in composition infrastructure pays back quickly as agent systems grow beyond a handful of simple prompts.