title: "When Does the SKILL.md Body Get Loaded into Context?" description: "The SKILL.md body loads when Claude matches your prompt to the skill's description. It stays in context for the session and costs 400-1,000 tokens on trigger." pubDate: "2026-04-15" category: skills tags: ["claude-code-skills", "progressive-disclosure", "skill-body", "context-loading"] cluster: 14 cluster_name: "Progressive Disclosure Architecture" difficulty: intermediate source_question: "When does the SKILL.md body get loaded into context?" source_ref: "14.Intermediate.3" word_count: 1500 status: draft reviewed: false schema_types: ["Article", "FAQPage"]

TL;DR: The SKILL.md body loads when Claude's description-matching classifier identifies your prompt as a match for that skill. This happens mid-conversation, not at startup. Once loaded, the body stays in context for the rest of the session. A 200-line body at an average of 5 tokens per line costs roughly 1,000 tokens on trigger. Keep the body self-contained.

What triggers the body to load?

The body loads when Claude matches your prompt to the skill's description field — specifically, when its internal classifier determines that the semantic intent of your prompt satisfies the activation condition written in that description, not when you start the session or issue a keyword command. This is intent-matching, not lexical matching, so the exact words in your prompt do not need to appear in the description.

Claude evaluates every incoming prompt against all loaded skill descriptions (which are in the system prompt from startup). When the prompt's semantic intent matches a description's activation condition, Claude invokes the skill and loads its full SKILL.md body into the context window.

The matching is semantic, not lexical. If your description says "Use when the user asks to review a pull request," the body loads for "can you look at this PR?" even though those exact words don't appear in the description. Claude is matching intent.

If no description matches, no body loads. The skill costs zero additional tokens beyond its description in the startup registry.

What's the token cost when the body loads?

A 200-line SKILL.md body at an average of 4-5 tokens per line costs 800-1,000 tokens when triggered, making it equivalent in context weight to two to four model responses, though that cost only occurs once per session per skill regardless of how many times you invoke it (AEM internal benchmarks, 2025). That is a modest one-time load — equivalent to 2-4 Claude Sonnet responses — and it does not repeat for subsequent invocations of the same skill within the same session.

Where cost becomes a problem is when bodies exceed their necessary scope. We've seen production skill builds where the body exceeded 500 lines because the developer copied reference content directly into it instead of pointing to a reference file (AEM internal benchmarks, 2025). A 500-line body costs 2,000-2,500 tokens per invocation. Over 10 invocations in one session, that's 20,000-25,000 tokens consumed by one over-stuffed skill.

The right body length is the minimum needed to handle common inputs correctly. Edge cases, specialized domain knowledge, and large lookup tables belong in reference files.

"When you give a model an explicit output format with examples, consistency goes from ~60% to over 95% in our benchmarks." - Addy Osmani, Engineering Director, Google Chrome (2024)

That figure applies directly to skill bodies. A body with an explicit output contract and format examples produces consistent output. A body that gestures at the output format without specifying it produces inconsistent output at scale.

Does the body stay in context after the skill runs?

Once the SKILL.md body loads into the context window at first trigger, it stays there for the rest of the session — it is not evicted when the skill finishes executing, and every subsequent call to that skill within the same session reads from the already-loaded body at zero additional token cost. This means subsequent invocations of the same skill cost zero additional tokens within that session, but it also means that every loaded body accumulates and reduces the space available for responses as the session grows longer.

Two implications:

Positive: Subsequent invocations of the same skill in the same session don't incur the loading cost again. The body is already in context.

Negative: A body loaded early in a session occupies context space for everything that follows. Long sessions with multiple skill invocations accumulate body content in the context window. Late-session turns have less space available for responses.

The practical mitigation is session hygiene: start new sessions for significantly different tasks rather than extending one session across unrelated work. A session that starts with a code review skill and ends with a content writing skill has two bodies loaded, neither needed for the other.

Body changes made during a session also don't take effect until the next session. If you edit SKILL.md while a skill is in use, the running session works from the body that loaded at first trigger. Changes require a session restart to apply.

What makes a good body for reliable execution?

A body is a closed specification: every decision Claude needs to make on common inputs must have an explicit answer inside it — not a hint, not a suggestion, and not a pointer to external documentation that the body doesn't explicitly tell Claude to read. When the body leaves gaps, Claude fills them with plausible behavior derived from training data — and that behavior is almost never correct for your specific use case.

For example: "Step 3: read references/api-schema.md before generating any API call code" — not "consult the API documentation if available." The first is an answer. The second is a hint.

The structure we use in production builds:

  1. Trigger condition (when to use this skill, reiterated from the description for emphasis)
  2. Output contract (what the skill produces and what it does NOT produce)
  3. Process steps (numbered, sequential, imperative voice)
  4. Rules (edge cases and failure modes with explicit handling)
  5. Self-verification (the check Claude runs before delivering output)

A body in this structure runs 100-180 lines for a well-scoped skill (AEM internal benchmarks, 2025). Under 80 lines usually means the skill is underspecified. Over 250 lines usually means reference content leaked into the body, which belongs elsewhere.

For the full anatomy of a SKILL.md file and what goes in each section, see What Goes in a SKILL.md File?.

How do I know if my body is loading and being followed correctly?

The clearest signal is consistent output on first invocation in a fresh session — if the skill produces the right result the first time you trigger it with no prior iterative context, the body loaded and Claude followed it correctly from first contact. If it does not, the most common cause is instructions buried in the middle of a long body where model attention is weakest.

If the skill ignores specific instructions in the body, the most common causes are:

  1. Instructions buried too deep. Stanford NLP research on context attention ("Lost in the Middle," Nelson Liu et al., 2023, ArXiv 2307.03172) found that models follow instructions at the start and end of long contexts more reliably than those in the middle. Critical instructions belong in the first 30% of your body.

  2. Contradictory instructions. If process steps say one thing and the rules section says another, Claude picks the interpretation that seems most plausible. You lose.

  3. Missing reference pointer. The instruction is in a reference file, but the body doesn't tell Claude to read that file. Claude executes without the content. No error appears.

For skill-specific troubleshooting, see Why Isn't My Claude Code Skill Working?.

This loading model does not help with context overflow from very long conversations. Once context is full, earlier content (including loaded skill bodies) gets compressed or dropped. For sessions that run over 50,000 tokens, start a new session before quality degrades (Anthropic Claude model documentation, 2025).

FAQ: When the SKILL.md body loads

Does the body load every time I mention a topic the skill covers, or only when I explicitly invoke it? Claude's matching is based on the description's activation condition, not topic mentions. If your description says "Use when the user asks for a full pull request review," the body loads only when that specific intent is clear, not whenever code comes up.

Can I have a skill that loads its body at session startup instead of on-demand? Not through the standard skill mechanism. All bodies load on demand. Content you need always available belongs in CLAUDE.md, which loads at startup.

What is the practical upper limit for body length before Claude stops following it reliably? Quality holds up to approximately 300 lines. Above 350 lines, adherence to instructions in the middle of the body starts to degrade, as documented in Stanford NLP's "lost in the middle" research (2023). Instructions in positions 30%-70% of the body are less reliably followed than those at the start or end.

If I update my SKILL.md during a session, does the running session see the new content? No. The body loads once when the skill first triggers in a session. Changes to the file require starting a new session to take effect.

Can one skill body trigger another skill? Not directly. A skill body can instruct Claude to perform sub-tasks, but it cannot invoke another skill's body. For multi-skill orchestration, an agent architecture is the correct pattern.

What happens if Claude loads the wrong skill body because the description matched incorrectly? The skill runs against your prompt using the wrong instructions. Output is wrong and there is no error message. The fix is a more precise description with explicit negative conditions.

Last updated: 2026-04-15