---
title: "How Does the Metadata Layer Work at Startup in Claude Code Skills?"
description: "The metadata layer is your skill's description field. Claude reads it at every session start and uses it as the sole basis for trigger matching."
pubDate: "2026-04-15"
category: skills
tags: ["claude-code-skills", "progressive-disclosure", "metadata-layer", "skill-loading"]
cluster: 14
cluster_name: "Progressive Disclosure Architecture"
difficulty: intermediate
source_question: "How does the metadata layer work at startup?"
source_ref: "14.Intermediate.2"
word_count: 1520
status: draft
reviewed: false
schema_types: ["Article", "FAQPage"]
---
TL;DR: The metadata layer is the description field in your skill's YAML frontmatter. Claude reads it from every installed skill at session start, assembles them into a system prompt registry at ~100 tokens per skill (Anthropic, 2024), and uses descriptions as the sole basis for trigger matching. Body content loads only when triggered.
## What exactly is the metadata layer?
The metadata layer is the YAML frontmatter block at the top of your SKILL.md file, specifically the description field, and it is the only part of your skill that Claude reads at session start. That makes it the sole signal Claude uses to decide whether your skill triggers at all: Claude Code reads the field from every installed skill and assembles the descriptions into a skill registry in the system prompt.
These descriptions function as a list of available tools with their activation conditions. The name field also contributes (it becomes the tool name in the registry), but the description drives matching.
Everything else in your skill file (the process steps, the output contract, the rules, the failure modes) is the body layer. It does not load at startup; it loads when the skill triggers.
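A minimal SKILL.md sketch showing the split (the skill name and wording here are hypothetical, not from Anthropic's examples):

```markdown
---
name: changelog-writer
description: Use when the user asks to draft release notes or a changelog from merged pull requests. Do not use for writing commit messages or general documentation.
---

<!-- Body layer: everything below loads only when the skill triggers. -->

## Process
1. Collect merged PR titles since the last release tag.
2. Group them by change type and draft the changelog.
```

Only the frontmatter above the first `---` pair participates in startup; the process steps never reach the context window unless the description matches a prompt.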
This is not a quirk of Claude's architecture. It's a deliberate cost management decision. Descriptions are cheap summaries. Bodies are expensive instruction sets. Loading bodies at startup for skills that never get triggered would be a straight waste.
## What does Claude do with all the descriptions at startup?
Claude reads all installed skill descriptions and loads them into its system prompt before you type a single message. This happens once per session, every session, so whatever you wrote in the description field is already shaping Claude's behavior before your first prompt lands. The descriptions function as a classifier input: when you send a prompt, Claude evaluates it against every loaded description to decide which skill to invoke.
Anthropic's 2024 tooling documentation confirms this evaluation uses semantic matching, not keyword search. A description reading "Use when the user asks to review a pull request" will trigger on "can you check this PR?" even though the exact words differ.
This has two practical implications for skill design:
- Your description needs to capture the semantic intent of your triggers, not just the exact words users type.
- Your description must exclude the semantic patterns of prompts that should NOT trigger the skill.
The cost figure of ~100 tokens per description includes more than the raw word count. The system prompt entry includes the skill name, formatting markers, and tool schema overhead alongside the description text. A 60-word description contributes approximately 100-120 tokens to the startup prompt (Anthropic token counting reference, 2024).
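Under the common four-characters-per-token heuristic, that figure is easy to sanity-check. A minimal sketch; the `overhead` constant standing in for the skill name, formatting markers, and tool-schema wrapper is an assumption for illustration, not a published number:

```python
def estimate_startup_tokens(description: str, overhead: int = 25) -> int:
    """Rough startup cost of one skill's registry entry, in tokens.

    Uses the ~4 characters-per-token heuristic; `overhead` approximates
    the skill name, formatting markers, and tool-schema wrapper (assumed).
    """
    return len(description) // 4 + overhead

# A ~60-word description is roughly 420 characters including spaces,
# landing in the 100-130 token range cited above.
sixty_words = " ".join(["review"] * 60)
print(estimate_startup_tokens(sixty_words))  # → 129
```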
> "The single biggest predictor of whether an agent works reliably is whether the instructions are written as a closed spec, not an open suggestion." - Boris Cherny, Anthropic (2024)
A vague description is an open suggestion. Claude matches against it loosely. The skill triggers incorrectly. A precise description is a closed spec for activation. The matching becomes reliable.
## Why does the 1,024-character limit exist?
The description field has a hard limit of 1,024 characters (Claude Code documentation, 2024), enforced by Claude Code's tooling layer to keep the metadata layer lightweight and prevent startup costs from compounding across large skill libraries. Without this cap, each additional skill would silently erode the context budget available for actual work: a developer with 30 installed skills could burn a large share of the context window before any conversation began, loading instructions Claude will never use in that session.
The limit caps each skill's startup cost.
1,024 characters is approximately 200-250 words (based on Anthropic's published token-to-character ratios, 2024). That is enough for: a precise trigger condition, 2-3 examples of what should NOT trigger the skill, and a one-line summary of what the skill produces. It is not enough for step-by-step instructions, which belong in the body.
In our production builds, descriptions average 300-450 characters. Longer is not better at this layer. A description that fills all 1,024 characters usually means step instructions leaked into the metadata layer. That is a design error: instructions in the description load at startup for every session, even when the skill never gets used.
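Because the cap is enforced at the tooling layer, it is cheap to lint for yourself before a formatter or install step surprises you. A hedged sketch; the 450-character soft threshold mirrors the production averages above and is a house guideline, not an Anthropic limit:

```python
DESCRIPTION_HARD_LIMIT = 1024  # enforced by Claude Code tooling
DESCRIPTION_SOFT_LIMIT = 450   # assumed house guideline, not an official cap

def lint_description(description: str) -> list[str]:
    """Return human-readable problems with a skill description."""
    problems = []
    if len(description) > DESCRIPTION_HARD_LIMIT:
        problems.append(
            f"description is {len(description)} chars; hard limit is {DESCRIPTION_HARD_LIMIT}"
        )
    elif len(description) > DESCRIPTION_SOFT_LIMIT:
        problems.append(
            "description is long; step instructions may have leaked into the metadata layer"
        )
    if "\n" in description:
        problems.append("description spans multiple lines; a formatter may have wrapped it")
    return problems
```

Run it over every SKILL.md at commit time and the design error above never ships.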
## What are the token economics of the metadata layer?
With 20 installed skills at approximately 100 tokens each, the metadata layer costs 2,000 tokens at startup, about 1% of Claude Sonnet's 200,000-token context window (Anthropic, 2024). The raw token cost of loading descriptions is therefore cheap enough that it is rarely what breaks as skill libraries grow. Scale to 100 skills and the cost reaches roughly 10,000 tokens, still a small fraction of available context, which means token cost is not the binding constraint at scale.
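Those figures are simple to verify; a quick worked example using the numbers above:

```python
CONTEXT_WINDOW = 200_000  # Claude Sonnet context window, in tokens (Anthropic, 2024)
TOKENS_PER_SKILL = 100    # approximate cost of one registry entry

for skills in (20, 30, 100):
    cost = skills * TOKENS_PER_SKILL
    print(f"{skills:>3} skills -> {cost:>6,} tokens ({cost / CONTEXT_WINDOW:.1%} of context)")
# →  20 skills ->  2,000 tokens (1.0% of context)
# →  30 skills ->  3,000 tokens (1.5% of context)
# → 100 skills -> 10,000 tokens (5.0% of context)
```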
The binding constraint is classifier coherence.
Claude Code documentation recommends keeping active skill libraries under 30 skills for reliable discovery. Above that threshold, descriptions that are semantically similar produce trigger ambiguity: two skills both think they should respond to the same prompt. The token cost remains manageable well above 30 skills. The matching accuracy does not.
One limitation worth naming: the description-only matching model cannot handle multi-intent prompts reliably. If a user's message contains two distinct intents that map to two different skills, Claude picks one and ignores the other. The metadata layer has no mechanism for splitting a prompt or queuing multiple skill activations from a single message.
For a detailed breakdown of token economics across all three loading layers, see How Does Progressive Disclosure Save Tokens and Improve Performance?.
## How should I write descriptions knowing they load at startup?
Write descriptions as activation conditions, not feature summaries. Claude's classifier at startup needs to know when to use your skill, not just what it does, and getting this wrong means your skill either triggers on prompts it shouldn't handle or stays silent on the ones it should catch.
A feature summary: "This skill helps with code review, pull requests, checking quality, identifying bugs, and improving code style."
An activation condition: "Use when the user provides a pull request URL and asks for a full review. Do not use for general code questions or debugging without a specific PR."
The feature summary tells Claude what the skill does. The activation condition tells Claude when to use it and when not to. Claude's classifier needs the latter.
Both positive and negative conditions belong in the description:
- Positive: "Use when the user asks to review a pull request and provides a URL."
- Negative: "Do not use for general coding questions, inline debugging, or cases where no PR URL is provided."
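Combined on one line in the frontmatter, the two conditions form a single activation spec (hypothetical skill):

```yaml
description: "Use when the user asks to review a pull request and provides a URL. Do not use for general coding questions, inline debugging, or cases where no PR URL is provided."
```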
This is not keyword stuffing. It's giving the classifier enough signal to make the right call. For a focused guide on negative triggers, see What Are Negative Triggers and Why Should I Include Them in the Description?.
## What if a code formatter breaks my description onto multiple lines?
The description must stay on a single line in the YAML frontmatter. If Prettier or another formatter wraps it across multiple lines, Claude fails to parse the full description: the activation condition gets truncated, producing false-positive triggers, failures to fire, or both, with no warning at parse time.
Two fixes:
- Add `*.md` to your `.prettierignore` file to exclude skill files from formatting.
- Keep your description under 80 characters (Prettier's default print width). This eliminates line-wrap triggers entirely and has the secondary benefit of forcing tighter, more precise language.
Shorter descriptions are more reliable: harder to corrupt, faster to parse, and usually more precise. If your description requires 400 characters to express the activation condition, you probably need two separate skills, not a longer description.
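If you want a guard in CI rather than trust in `.prettierignore`, a wrapped description is easy to detect. A minimal sketch, assuming standard `---`-delimited frontmatter; it handles only the single-line-value case, not every legal YAML form:

```python
def description_is_single_line(skill_md: str) -> bool:
    """True if the frontmatter's description fits on one physical line.

    Assumes `---`-delimited frontmatter. A wrapped description shows up
    as indented continuation lines instead of a new `key:` entry.
    """
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no frontmatter block to check
    in_description = False
    for line in lines[1:]:
        if line.strip() == "---":
            return True  # reached end of frontmatter without wrapping
        if line.startswith("description:"):
            in_description = True
            continue
        if in_description:
            if line.startswith((" ", "\t")):
                return False  # continuation line: the value wrapped
            in_description = False
    return False

ok = "---\nname: pr-review\ndescription: Use when ...\n---\nBody"
wrapped = "---\ndescription: Use when the user\n  asks for a review\n---\nBody"
print(description_is_single_line(ok), description_is_single_line(wrapped))  # → True False
```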
For a complete guide to description design across all dimensions, see The SKILL.md Description Field: The One Line That Makes or Breaks Your Skill.
## FAQ: The metadata layer at startup
**Is the metadata layer loaded fresh every session, or cached between sessions?** Fresh every session. Claude reads all skill description fields at the start of each session. There is no cross-session caching of the skill registry.

**Can I have two skills with significantly overlapping descriptions?** Yes, but expect unreliable triggering. When two descriptions are semantically similar, Claude's classifier picks one inconsistently. The fix is to write explicit negative conditions in each description to create a clear boundary between them.

**Does the order of skills in the skills directory affect which one Claude picks when descriptions overlap?** No documented ordering preference exists in Claude Code's tooling. When descriptions overlap, triggering is ambiguous regardless of file order. The correct fix is clearer, more distinct descriptions.

**What happens if my description field is empty?** Claude has no basis for triggering the skill automatically. The skill does not appear in Claude's auto-trigger registry. You can still invoke it manually by name, but it will never activate on natural language prompts.

**What's the practical maximum number of installed skills before the metadata layer causes performance problems?** Claude Code documentation cites approximately 30 skills as the discovery threshold. Above that, semantically similar descriptions create ambiguity. Token cost stays manageable above 30; classifier accuracy does not. There is a hard cap on the metadata budget as well: the combined description and `when_to_use` text per skill is truncated at 1,536 characters in the skill listing (Claude Code documentation, 2025), so descriptions that exceed that limit are silently cut regardless of how many skills are installed.
Last updated: 2026-04-15