Stop Writing Monolithic System Prompts: The Skills Architecture
System prompts don't scale. Here's how I structure agent instructions as modular, versioned graph nodes in a database.

Every AI project starts the same way: a system_prompt.txt file.
"You are a helpful assistant. You help users with their tasks. You write in a professional tone. You always use markdown. You never apologize."
Then you add instructions for a new feature. Then guidelines for handling errors. Then a specific rule about pricing constraints.
Three months later, your system prompt is 4,000 words long. Half the instructions contradict the other half. Research shows that LLMs struggle most with instructions buried in the middle of long contexts—and contradictory rules only make it worse. When you want to update a rule, you have to untangle it from a massive text block.
Monolithic system prompts become increasingly difficult to maintain and degrade model performance as they grow.
When you treat instructions as a single file, you hit three walls:
1. Context dilution. If an agent is handling a billing query, it doesn't need to read the 500-word section on how to format code snippets. Stuffing everything into the context window makes the model pay less attention to the things that actually matter for the current task.
2. Version control nightmare. "Which version of the prompt generated this bad response?" Good luck finding out if your prompt is just a string in a codebase.
3. No reusability. If you have a Support Agent and a Sales Agent, they probably share 60% of their core rules (tone, escalation paths, brand guidelines). In a file-based system, you copy-paste. When rules change, they drift.
In my system, I stopped writing system prompts. I build Skills.
A skill is a workspace-scoped content block stored in the database. It is a single, focused instruction set.
```mermaid
flowchart TD
    A[Agent: Support] -->|has_skill| B[Skill: Brand Tone]
    A -->|has_skill| C[Skill: Billing Procedures]
    A -->|has_skill| D[Skill: Escalation Rules]
    E[Agent: Sales] -->|has_skill| B
    E -->|has_skill| F[Skill: Lead Qualification]
    class A,E decision
    class B,C,D,F worker
```
Instead of a 4,000-word prompt, the Support Agent has three skills attached via graph edges in SurrealDB. The Sales Agent shares the Brand Tone skill but has its own specialized skills.
Skills are not files. They are structured data.
```rust
pub struct Skill {
    pub id: String,
    pub name: String,
    pub description: String,
    pub content: String,     // the instruction text injected into the prompt
    pub version: i32,        // incremented on every update
    pub triggers: Option<Vec<SkillTrigger>>, // optional activation conditions
}
```
When an agent spins up, the system queries the graph: "Get all active skills attached to this agent." It pulls the content blocks, orders them, and assembles the final system prompt dynamically at runtime.
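The assembly step above can be sketched in a few lines. This is a minimal, hedged illustration: it assumes the SurrealDB `has_skill` edges have already been loaded into an in-memory map of agent id to skill ids, and it uses a simplified stand-in for the `Skill` struct with an assumed `active` flag gating inclusion.

```rust
use std::collections::HashMap;

// Simplified stand-in for the Skill struct; `active` (an assumption here)
// gates whether a skill participates in assembly.
struct Skill {
    content: String,
    active: bool,
}

/// Sketch of runtime prompt assembly: resolve the agent's skills,
/// keep the active ones, and join them into one system prompt.
fn assemble_prompt(
    agent_id: &str,
    has_skill: &HashMap<String, Vec<String>>, // agent id -> attached skill ids
    skills: &HashMap<String, Skill>,          // skill id -> skill record
    core: &str,
) -> String {
    let mut sections = vec![core.to_string()];
    for skill_id in has_skill.get(agent_id).into_iter().flatten() {
        if let Some(skill) = skills.get(skill_id) {
            if skill.active {
                sections.push(skill.content.clone());
            }
        }
    }
    sections.join("\n\n")
}
```

In production the two maps would be a single graph query; the point is that the final prompt is a pure function of the agent's edges at request time.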
Every time a skill is updated, a snapshot is saved in a skill_version table.
When evaluating a failed execution (see Eval Harnesses for AI Agents), I can trace exactly which version of the "Billing Procedures" skill was active at that moment.
If a new rule breaks agent behavior, I don't roll back a massive prompt file. I revert one specific skill to version 4.
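The versioning mechanics can be sketched with an in-memory stand-in for the `skill_version` table. This is an illustration under stated assumptions, not the actual schema: every update appends a snapshot, and "revert" appends a copy of an old snapshot as the newest version so the audit trail stays intact.

```rust
#[derive(Clone, Debug, PartialEq)]
struct SkillVersion {
    version: i32,
    content: String,
}

// In-memory stand-in for the skill_version table described in the article.
struct VersionedSkill {
    history: Vec<SkillVersion>,
}

impl VersionedSkill {
    fn new(initial: &str) -> Self {
        Self { history: vec![SkillVersion { version: 1, content: initial.to_string() }] }
    }

    /// Updating never mutates history; it appends a new snapshot.
    fn update(&mut self, content: &str) {
        let next = self.current().version + 1;
        self.history.push(SkillVersion { version: next, content: content.to_string() });
    }

    fn current(&self) -> &SkillVersion {
        self.history.last().expect("at least one version always exists")
    }

    /// Reverting appends a copy of the old content as a new version,
    /// so the bad version remains visible for later evals.
    fn revert_to(&mut self, version: i32) -> bool {
        match self.history.iter().find(|v| v.version == version).cloned() {
            Some(old) => {
                let next = self.current().version + 1;
                self.history.push(SkillVersion { version: next, content: old.content });
                true
            }
            None => false,
        }
    }
}
```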
Not all skills need to be active all the time.
```json
{
  "name": "Refund Processing",
  "triggers": [
    { "type": "intent", "description": "User is requesting a refund" }
  ]
}
```
The agent's baseline context is small and cheap. When the router detects a refund intent, the "Refund Processing" skill is dynamically injected into the context. This solves context dilution and lowers token costs. (Cost management is critical; I cover this in The Cost Problem in AI.)
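The injection step can be sketched as follows. The intent label is assumed to come from an upstream router, and matching is reduced to a string comparison; a real router would classify intent with a model.

```rust
// Assumed shapes for illustration; matching on an exact intent label is a
// simplification of the trigger objects shown above.
struct SkillTrigger {
    intent: String,
}

struct TriggeredSkill {
    content: String,
    triggers: Vec<SkillTrigger>,
}

/// Start from the small baseline prompt and inject only the skills whose
/// trigger matches the detected intent, keeping the default context cheap.
fn inject_skills(baseline: &str, detected_intent: &str, skills: &[TriggeredSkill]) -> String {
    let mut prompt = baseline.to_string();
    for skill in skills {
        if skill.triggers.iter().any(|t| t.intent == detected_intent) {
            prompt.push_str("\n\n");
            prompt.push_str(&skill.content);
        }
    }
    prompt
}
```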
When product managers want to update how the agent handles angry customers, they don't need to dig through a JSON payload or a git repo.
They go to the dashboard, click on the "De-escalation" skill, and edit it. The architecture maps to human mental models. It treats agent instructions like a CMS, not like source code.
At runtime, the intelligence layer assembles the prompt deterministically:
```text
[System Core]
You are an AI assistant representing Acme Corp.

[Skill: Brand Tone]
Maintain a professional, concise tone. Never use emojis.

[Skill: Lead Qualification]
Ask qualifying questions one at a time. Prioritize company size and timeline.
```
The assembly order is deliberate. Core instructions go first (benefiting from primacy bias), shared skills in the middle, and the most critical specialized skills last (benefiting from recency bias). This aligns with research showing LLMs attend most strongly to the beginning and end of context.
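One way to make that ordering deterministic is to tag each section with a tier and stable-sort on it. The `Tier` enum is an assumption for illustration; the stable sort guarantees that skills within a tier keep their authored order.

```rust
// Assumed tiers: core first (primacy), shared in the middle,
// specialized last (recency).
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    Core,
    Shared,
    Specialized,
}

/// Sort prompt sections by tier; Rust's sort is stable, so ordering
/// within a tier is preserved.
fn order_sections(mut sections: Vec<(Tier, String)>) -> Vec<String> {
    sections.sort_by_key(|(tier, _)| *tier);
    sections.into_iter().map(|(_, text)| text).collect()
}
```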
The instinct is to keep prompts in git because they feel like code. In my experience, they behave more like business rules.
Business rules change constantly. They are tweaked by non-engineers. They vary by customer and workspace. Storing them in git means every wording change requires a deploy—unless you invest in a separate configuration layer.
By moving agent instructions into a modular, graph-backed skills system, you decouple the engine from the behavior. The engine executes the graph. The skills define the personality and rules.
Monolithic system prompts are the spaghetti code of the AI era. Break them down. Make them modular. Give your agents skills, not scripts.