Mar 10, 2026 · 5 min read

Stop Writing Monolithic System Prompts: The Skills Architecture

System prompts don't scale. Here's how I structure agent instructions as modular, versioned graph nodes in a database.

Modular Skills Architecture

Every AI project starts the same way. A system_prompt.txt file.

"You are a helpful assistant. You help users with their tasks. You write in a professional tone. You always use markdown. You never apologize."

Then you add instructions for a new feature. Then guidelines for handling errors. Then a specific rule about pricing constraints.

Three months later, your system prompt is 4,000 words long. Half the instructions contradict the other half. Research shows that LLMs struggle most with instructions buried in the middle of long contexts—and contradictory rules only make it worse. When you want to update a rule, you have to untangle it from a massive text block.

Monolithic system prompts become increasingly difficult to maintain and degrade model performance as they grow.

The Problem with Prompt Files

When you treat instructions as a single file, you hit three walls:

1. Context dilution. If an agent is handling a billing query, it doesn't need to read the 500-word section on how to format code snippets. Stuffing everything into the context window makes the model pay less attention to the things that actually matter for the current task.

2. Version control nightmare. "Which version of the prompt generated this bad response?" Good luck finding out if your prompt is just a string in a codebase.

3. No reusability. If you have a Support Agent and a Sales Agent, they probably share 60% of their core rules (tone, escalation paths, brand guidelines). In a file-based system, you copy-paste. When rules change, they drift.

The Skills Architecture

In my system, I stopped writing system prompts. I build Skills.

A skill is a workspace-scoped content block stored in the database. It is a single, focused instruction set.

flowchart TD
    A[Agent: Support] -->|has_skill| B[Skill: Brand Tone]
    A -->|has_skill| C[Skill: Billing Procedures]
    A -->|has_skill| D[Skill: Escalation Rules]
    
    E[Agent: Sales] -->|has_skill| B
    E -->|has_skill| F[Skill: Lead Qualification]
    
    class A,E decision
    class B,C,D,F worker

Instead of a 4,000-word prompt, the Support Agent has three skills attached via graph edges in SurrealDB. The Sales Agent shares the Brand Tone skill but has its own specialized skills.

Skills are Database Records

Skills are not files. They are structured data.

pub struct Skill {
    pub id: String,
    pub name: String,
    pub description: String,
    /// The instruction text that gets injected into the prompt.
    pub content: String,
    /// Incremented on every edit; snapshots live in the skill_version table.
    pub version: i32,
    /// If set, the skill is only activated when a trigger fires.
    pub triggers: Option<Vec<SkillTrigger>>,
}

When an agent spins up, the system queries the graph: "Get all active skills attached to this agent." It pulls the content blocks, orders them, and assembles the final system prompt dynamically at runtime.
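That assembly step can be sketched in a few lines. This is a minimal, in-memory version: the database fetch is stubbed out as a plain `Vec`, and the `position` field and `assemble_prompt` helper are assumptions for illustration, not the real schema.

```rust
// A skill as it might come back from the graph query, with an explicit
// position so assembly is deterministic (field name is an assumption).
struct ActiveSkill {
    name: String,
    content: String,
    position: u32,
}

// Assemble the final system prompt: order the skills, then emit one
// labeled block per skill under a fixed core preamble.
fn assemble_prompt(core: &str, mut skills: Vec<ActiveSkill>) -> String {
    skills.sort_by_key(|s| s.position);
    let mut prompt = format!("[System Core]\n{core}\n");
    for s in &skills {
        prompt.push_str(&format!("\n[Skill: {}]\n{}\n", s.name, s.content));
    }
    prompt
}

fn main() {
    let skills = vec![
        ActiveSkill {
            name: "Billing Procedures".into(),
            content: "Verify the invoice ID before discussing charges.".into(),
            position: 2,
        },
        ActiveSkill {
            name: "Brand Tone".into(),
            content: "Professional, concise. No emojis.".into(),
            position: 1,
        },
    ];
    let prompt = assemble_prompt("You are an AI assistant representing Acme Corp.", skills);
    println!("{prompt}");
}
```

The point is that the prompt is a pure function of database state: same agent, same skill versions, same output.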

The Benefits of Modularity

1. Granular Versioning

Every time a skill is updated, a snapshot is saved in a skill_version table.

When evaluating a failed execution (see Eval Harnesses for AI Agents), I can trace exactly which version of the "Billing Procedures" skill was active at that moment.

If a new rule breaks agent behavior, I don't roll back a massive prompt file. I revert one specific skill to version 4.
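The mechanics are simple enough to sketch in memory (the `SkillHistory` type and its methods are stand-ins for the real `skill_version` table, not its actual schema): every save appends an immutable snapshot, and a revert just copies an old snapshot forward as a new version.

```rust
// Stand-in for one skill's row in the skill_version table.
#[derive(Clone)]
struct SkillSnapshot {
    version: i32,
    content: String,
}

// Append-only history: saves never overwrite, so any past version
// of the skill can be inspected or restored later.
#[derive(Default)]
struct SkillHistory {
    versions: Vec<SkillSnapshot>,
}

impl SkillHistory {
    fn save(&mut self, content: &str) -> i32 {
        let version = self.versions.len() as i32 + 1;
        self.versions.push(SkillSnapshot { version, content: content.to_string() });
        version
    }

    // Reverting is itself a new save, so the audit trail stays intact.
    fn revert_to(&mut self, version: i32) -> Option<i32> {
        let old = self.versions.iter().find(|s| s.version == version)?.content.clone();
        Some(self.save(&old))
    }

    fn current(&self) -> Option<&SkillSnapshot> {
        self.versions.last()
    }
}

fn main() {
    let mut history = SkillHistory::default();
    history.save("Refunds require manager approval.");
    history.save("Refunds under $50 are automatic.");
    let _ = history.revert_to(1);
    println!("current: v{}", history.current().unwrap().version);
}
```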

2. Trigger-Based Activation

Not all skills need to be active all the time.

{
  "name": "Refund Processing",
  "triggers": [
    { "type": "intent", "description": "User is requesting a refund" }
  ]
}

The agent's baseline context is small and cheap. When the router detects a refund intent, the "Refund Processing" skill is dynamically injected into the context. This solves context dilution and lowers token costs. (Cost management is critical; I cover this in The Cost Problem in AI.)
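The router check itself is a filter over conditional skills. Here's a sketch with intent detection reduced to naive keyword matching; in a real system that check would be a classifier or an LLM router, and the type names here are assumptions.

```rust
// An intent trigger as in the JSON above, with detection stubbed out
// as a keyword check standing in for a real intent classifier.
struct Trigger {
    intent_keywords: Vec<&'static str>,
}

struct ConditionalSkill {
    name: &'static str,
    trigger: Trigger,
}

// Return the names of skills whose trigger fires for this message.
// Only these get injected on top of the small baseline context.
fn skills_to_inject<'a>(skills: &'a [ConditionalSkill], message: &str) -> Vec<&'a str> {
    let msg = message.to_lowercase();
    skills
        .iter()
        .filter(|s| s.trigger.intent_keywords.iter().any(|k| msg.contains(k)))
        .map(|s| s.name)
        .collect()
}

fn main() {
    let skills = [ConditionalSkill {
        name: "Refund Processing",
        trigger: Trigger { intent_keywords: vec!["refund", "money back"] },
    }];
    println!("{:?}", skills_to_inject(&skills, "I'd like a refund please"));
}
```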

3. Human-Readable Organization

When product managers want to update how the agent handles angry customers, they don't need to dig through a JSON payload or a git repo.

They go to the dashboard, click on the "De-escalation" skill, and edit it. The architecture maps to human mental models. It treats agent instructions like a CMS, not like source code.

The Assembly Process

At runtime, the intelligence layer assembles the prompt deterministically:

[System Core]
You are an AI assistant representing Acme Corp.

[Skill: Brand Tone]
Maintain a professional, concise tone. Never use emojis.

[Skill: Lead Qualification]
Ask qualifying questions one at a time. Prioritize company size and timeline.

The assembly order is deliberate. Core instructions go first (benefiting from primacy bias), shared skills in the middle, and the most critical specialized skills last (benefiting from recency bias). This aligns with research showing LLMs attend most strongly to the beginning and end of context.
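One way to derive that ordering is from a flag on each section rather than hand-assigned positions. The `shared` field below is an assumption for illustration; the real system may store explicit ordering instead.

```rust
struct PromptSection {
    title: String,
    // Shared across agents (e.g. Brand Tone) vs. agent-specific.
    shared: bool,
}

// Core first (primacy), shared skills in the middle, agent-specific
// skills last so the most specialized rules sit in the recency window.
// sort_by_key is stable, so ties keep their original relative order.
fn order_sections(mut sections: Vec<PromptSection>) -> Vec<String> {
    sections.sort_by_key(|s| if s.shared { 0 } else { 1 });
    let mut out = vec!["System Core".to_string()];
    out.extend(sections.into_iter().map(|s| s.title));
    out
}

fn main() {
    let ordered = order_sections(vec![
        PromptSection { title: "Lead Qualification".into(), shared: false },
        PromptSection { title: "Brand Tone".into(), shared: true },
    ]);
    for title in ordered {
        println!("{title}");
    }
}
```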

Stop Treating Prompts as Code

The instinct is to keep prompts in git because they feel like code. In my experience, they behave more like business rules.

Business rules change constantly. They are tweaked by non-engineers. They vary by customer and workspace. Storing them in git means every wording change requires a deploy—unless you invest in a separate configuration layer.

By moving agent instructions into a modular, graph-backed skills system, you decouple the engine from the behavior. The engine executes the graph. The skills define the personality and rules.


Monolithic system prompts are the spaghetti code of the AI era. Break them down. Make them modular. Give your agents skills, not scripts.
