In Claude Code skill libraries, domain knowledge embedded in SKILL.md loads into context every single time the skill runs, whether the current task needs it or not. Reference files load only when a specific step requires them. That difference in load timing determines whether your skill stays fast, consistent, and updatable without touching SKILL.md as the knowledge base grows.

TL;DR: Embedding domain knowledge in SKILL.md creates three problems: constant token overhead, attention degradation from long context, and brittle updates (changing the knowledge requires editing the skill logic). Reference files fix all three: they load on demand, live separately from skill logic, and can be updated without touching SKILL.md.

What counts as "domain knowledge" in a Claude Code skill?

Domain knowledge is any content the skill needs to reference but isn't a step or rule in the skill's own logic. Product catalogs, style guides, API schemas, pricing tables, and edge-case lists all qualify. The test: if you could update this content without changing how the skill executes, it belongs in a reference file, not SKILL.md.

Examples:

  • A product catalog the skill uses to match user queries
  • A brand style guide the skill applies to output formatting
  • An API schema the skill uses to construct requests
  • A lookup table of pricing tiers, region codes, or status definitions
  • A list of known edge cases and how to handle each one

Process steps and rules belong in SKILL.md: "Step 3: Match the user's product name against the catalog." The catalog itself belongs in a reference file, loaded when step 3 executes.
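As a minimal sketch of that split (file names, SKUs, and products are hypothetical), the step stays in SKILL.md and the catalog lives in its own file:

```markdown
<!-- SKILL.md — process logic only -->
3. Read product-catalog.md, then match the user's product name
   against the catalog entries.

<!-- product-catalog.md — domain knowledge, loaded only when step 3 runs -->
| SKU    | Product name  | Accepted aliases |
|--------|---------------|------------------|
| AX-100 | Axis Standard | axis, ax-std     |
| AX-200 | Axis Pro      | axis pro, ax-p   |
```

The catalog can grow to hundreds of rows without adding a single token to invocations that never reach step 3.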

What happens when domain knowledge is in SKILL.md?

Three things happen, and all three compound each other: every invocation loads the full file whether the knowledge is needed or not, instructions pushed into the middle of that bloated context are followed less reliably, and every knowledge update requires editing the skill logic file directly. Together they make the skill slower, less accurate, and brittle.

  1. Every invocation loads the full knowledge base: SKILL.md is not selectively loaded. When Claude activates the skill, it reads the entire file. A 400-line product catalog embedded in SKILL.md adds those 400 lines to context even for invocations where the user is asking a question the catalog can't answer. The Claude Code system prompt alone consumes approximately 4,200 tokens before any skill content loads (Anthropic, Claude Code Docs, 2025); embedding a 400-line catalog adds roughly another 1,600 tokens on top of that fixed overhead.

  2. Instruction recall degrades: "Models placed in the middle of long contexts lose track of instructions at a rate that makes mid-context policy placement unreliable for production systems." (Liu et al., Stanford NLP Group, "Lost in the Middle," ArXiv 2307.03172, 2023) The same research found accuracy dropped more than 20 percentage points when relevant information sat in the middle of a 20-document context versus the start or end. Anthropic's own testing on Claude 2.1 found that accuracy for information positioned in the middle of a long context dropped to 27% (Anthropic, model card research, 2024). A skill with 400 lines of catalog before its actual process steps places those steps in the middle of a long context, where recall is weakest.

    In one commission, we moved a 380-line product lookup table from SKILL.md into a reference file. The skill's step-following accuracy went from 67% to 91% on the same test set. The instructions hadn't changed. Their position in context had.

  3. Updates are fragile: When the product catalog changes, updating SKILL.md means opening the skill logic file and finding the right block to modify. One wrong edit breaks the skill. Reference files are separate from skill logic: update the catalog file, the skill picks up the change automatically on next invocation.

What is the correct way to use reference files for domain knowledge?

The pattern is two steps: create a dedicated file for the domain knowledge, then add one instruction line in SKILL.md at the step that needs it. Nothing else is required. The reference file loads at the moment it is needed and stays out of context for every other step, keeping token overhead proportional to what the current invocation actually uses.

  1. Create a reference file with the domain knowledge: Name it descriptively: product-catalog.md, brand-style-guide.md, api-schema.md. Place it in the skill folder alongside SKILL.md.

  2. Add one instruction line to SKILL.md at the step that needs the knowledge: for example, "Before matching the product name, read product-catalog.md for the full product list."

That's the complete pattern. See What Are Reference Files in a Claude Code Skill for the full reference file structure, including how to write the load instruction.
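A minimal skill folder following the two steps above (names hypothetical):

```markdown
product-matcher/
├── SKILL.md              ← steps and rules only
└── product-catalog.md    ← domain knowledge, loaded at the step that needs it
```

The only coupling between the two files is the single load instruction inside SKILL.md; everything else about the catalog can change without the skill logic being touched.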

How large can a reference file be?

Up to roughly 500 lines before performance starts to degrade. A 200-line style guide loads cleanly into a working context window. A 700-line reference file starts to exhibit the same tail-of-context attention problems as a bloated SKILL.md. Plan the split when a reference file reaches 400 lines, before it crosses into degradation territory.

The practical limit depends on what else is in context during the invocation. For a skill used in isolation, 500-line reference files work reliably. For a skill in a project with 10 other active skills, 200-line reference files are the safer target. Published research on context degradation confirms that effective usable context is often a fraction of a model's nominal window under real multi-step reasoning loads (Shi et al., "Context Discipline and Performance Correlation," ArXiv 2601.11564, 2025).

A brand guide with 500 lines covering voice, formatting, and competitor positioning is three reference files: voice-guide.md, format-guide.md, competitor-positioning.md. The skill loads only the one relevant to the current step.

"The single biggest predictor of whether an agent works reliably is whether the instructions are written as a closed spec, not an open suggestion." — Boris Cherny, TypeScript compiler team, Anthropic (2024)

The same principle applies to knowledge: a reference file scoped to one topic (a closed spec for that topic) is more reliable than a reference file that covers everything.

Does this pattern apply to all types of domain knowledge?

Yes, with one exception: very short lookup tables (under 30 lines) that are referenced in multiple steps throughout the skill body. A 15-line list of status codes used in 4 different steps is more practical to keep in SKILL.md than to load from a reference file at each of those steps.

The threshold: if the domain knowledge is under 30 lines and referenced throughout the skill body, embedding it in SKILL.md is acceptable. Over 30 lines, move it to a reference file.
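An example of the acceptable case: a lookup table well under the 30-line threshold, embedded directly in SKILL.md because several steps consult it (codes and meanings hypothetical):

```markdown
## Status codes (used in steps 2, 4, 6, and 7)
- ACTIVE  — order is live and billable
- ON_HOLD — order paused; skip billing steps
- CLOSED  — order complete; archive only
```

At this size, the embedded table costs a few dozen tokens per invocation; a reference file would save almost nothing and add a load instruction to four separate steps.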

Is there a performance difference between reference files and embedded content?

Yes, measurable, and in both directions. Loading a reference file adds a small execution overhead per invocation compared to content that is already in context. For simple lookups this is negligible; for large reference files (300+ lines) loaded partway through a complex multi-step skill, it adds a noticeable pause in output generation.

The offset is instruction recall: process steps in SKILL.md are followed more reliably when they aren't preceded by hundreds of lines of embedded catalog content. The trade is clear: slightly slower individual invocations, substantially better instruction recall. In production, the recall gain outweighs the load cost.

How does this relate to progressive disclosure?

Reference files are the third layer of the progressive disclosure architecture. Progressive disclosure means each layer of a Claude Code skill loads only what is needed, only when it is needed. The description field loads at startup, SKILL.md loads on activation, and reference files load on demand at specific steps. Domain knowledge belongs at the third layer.

  1. Description field: loaded at startup (metadata layer)
  2. SKILL.md body: loaded on activation (body layer)
  3. Reference files: loaded on demand at specific steps (reference layer)
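Assuming the standard SKILL.md layout with YAML frontmatter, the three layers map onto a single skill like this (names hypothetical):

```markdown
---
name: product-matcher
description: Matches user queries against the product catalog  # layer 1: loaded at startup
---

## Steps                        <!-- layer 2: loaded on activation -->
1. Parse the user's query.
2. Identify the product field.
3. Read product-catalog.md      <!-- layer 3: loaded on demand at this step -->
   and match the product name against it.
```

Each layer costs tokens only when reached: the description always, the body on activation, the catalog only when step 3 actually runs.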

Keeping domain knowledge out of SKILL.md is how you keep the body layer lean and load the reference layer only when needed. All activated skills in a Claude Code session share a combined token budget of 25,000 tokens (Anthropic, Claude Code skills documentation, 2025); every line of domain knowledge embedded in SKILL.md draws from that shared pool across every invocation. See What Is Progressive Disclosure in Claude Code Skills for the full three-layer architecture.

And for the full catalog of what this anti-pattern belongs to, see The Anti-Patterns Guide: 20 Mistakes That Kill Claude Code Skills, which covers domain knowledge burial as anti-pattern #9.

Frequently asked questions

For most skills, the 30-line threshold is the practical decision point: below it, embedding domain knowledge in SKILL.md carries acceptable overhead; above it, the token cost and attention degradation compound fast enough to produce measurable accuracy drops. The questions below address specific edge cases, file format decisions, and multi-skill scenarios.

I have 50 lines of domain knowledge in SKILL.md. Is that enough to worry about?

At 50 lines, the performance impact is minimal for a skill used in isolation. It becomes a problem at scale: if your project has 15+ skills and this one loads 50 lines of unnecessary context every invocation, it degrades the shared token budget. The right habit is to move domain knowledge to reference files from the start, before the library grows to the point where the habit matters.

Can I have multiple reference files per skill?

Yes. A single skill can load different reference files at different steps. "At step 2, load style-guide.md. At step 5, load product-catalog.md." Load each file only at the step that needs it. Avoid loading all reference files at step 1.
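In SKILL.md, that looks like one load instruction per step, each naming only the file that step needs (file names hypothetical):

```markdown
## Steps
1. Draft the response from the user's request.
2. Read style-guide.md and apply its voice rules to the draft.
3. Validate required fields.
4. Resolve any ambiguous product references with the user.
5. Read product-catalog.md and verify every product name in the draft.
```

An invocation that stops after step 3 never pays for either reference file.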

What format should reference files be in?

Markdown. The same format as SKILL.md. Claude reads markdown reference files natively without any special parsing. Keep reference files readable as plain text: headers, bullet points, and tables work better than raw JSON for most domain knowledge types.

My domain knowledge changes daily. Does that affect how I structure the reference file?

It makes reference files more important, not less. If your product catalog updates daily, you want the catalog in a reference file you can update without touching skill logic. If it were embedded in SKILL.md, every catalog update would require editing the skill itself, with the risk of accidentally breaking the instruction body.

What if I need the domain knowledge across multiple skills?

Create a shared reference file outside any individual skill folder and load it from each skill that needs it. Path it relatively: ../../shared-references/product-catalog.md. This pattern keeps one authoritative copy of the knowledge without duplicating it across skill folders.
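Assuming a project layout like the one below (paths hypothetical), each skill points at the single shared copy via the same relative path:

```markdown
project/
├── shared-references/
│   └── product-catalog.md        ← one authoritative copy
└── skills/
    ├── order-lookup/
    │   └── SKILL.md              ← "read ../../shared-references/product-catalog.md"
    └── returns-handler/
        └── SKILL.md              ← same relative path, same file
```

Updating the catalog in shared-references/ updates it for every skill on its next invocation.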

Does embedding documentation make the skill easier to understand for someone reading SKILL.md?

Marginally, yes. But the readability benefit is outweighed by the performance cost past 30 lines. For documentation purposes, add a one-line comment in SKILL.md pointing to the reference file: "Step 3 uses product-catalog.md. See that file for the full product list." The skill stays readable and the knowledge stays in the right place.

Last updated: 2026-04-18