How Many Tokens Does Claude Use to Store My Skill Descriptions at Startup?

TL;DR: Each skill description uses roughly 30-120 tokens at session start, depending on its character length. A library of 20 skills with descriptions averaging 100 characters each costs approximately 500-2,000 tokens total for the startup index. That's the fixed cost of knowing your skills exist, and it's well within normal operating margins for most libraries.

The description field is the cheapest load-bearing element in AEM's skill system. It costs 30-120 tokens per skill. It runs every session, without exception. And it's the only lever that controls whether your skill triggers at all.

That combination, always-on but inexpensive, is what makes description optimization worth getting right.

How are skill descriptions loaded at session start?

At session start, Claude Code reads every SKILL.md file in your .claude/skills/ directory and loads only the description field from each one into context — not the full SKILL.md body, just the single trigger line that tells Claude what the skill does and when to activate it.

This is the Layer 1 index from the progressive disclosure architecture. It answers one question: what skills exist in this library, and when should each one activate?

Claude holds this index in context for the full session. Every user message gets evaluated against it. When a message matches a skill's description, that skill's body loads.

The description index is the only content that stays in context across the entire session regardless of which skills are triggered. Everything else is conditional.

How many tokens does a single skill description use?

A reliable planning estimate is approximately 1 token per 4 characters of English text (Anthropic Tokenizer Docs, 2024), which means an 80-character description costs roughly 20 tokens, a 150-character description costs roughly 38 tokens, and a 300-character description costs roughly 75 tokens at the model's tokenizer.

In practice, descriptions in AEM's production skill libraries run 80-200 characters. The 1-token-per-4-characters estimate puts typical descriptions at 20-50 tokens each.

Token counting for LLMs isn't perfectly deterministic, as tokenization varies by model and content type. Common English text and code tokenize differently. For practical planning purposes, the character-divided-by-4 formula is accurate enough.

For skills with multi-line descriptions, count all characters including the line breaks. A description that runs 3 lines at 80 characters each is 240 characters, or approximately 60 tokens.

What is the total token cost for different library sizes?

Using 50 tokens per description as the midpoint estimate, startup description index costs scale linearly from 250 tokens for a 5-skill library to 5,000 tokens for a 100-skill library — all well under 3% of Claude Sonnet's 200,000-token context window (Anthropic, 2024).

Library size	Tokens per description	Total index cost
5 skills	50 tokens	250 tokens
10 skills	50 tokens	500 tokens
20 skills	50 tokens	1,000 tokens
50 skills	50 tokens	2,500 tokens
100 skills	50 tokens	5,000 tokens

For context: Claude Sonnet's full context window is 200,000 tokens (Anthropic, 2024). A 20-skill library's description index consumes 0.5% of the context window. A 100-skill library consumes 2.5%.

The description index is not where token pressure comes from. The pressure comes from loading full SKILL.md bodies and reference files unnecessarily. At 20 skills with 800-token bodies each, naive full-body loading at startup costs 16,000 tokens — 16x the description index cost for the same library (AEM internal benchmark, 2025). See How Does Progressive Disclosure Save Tokens and Improve Performance? for the full comparison.

How do you reduce description token costs without losing trigger accuracy?

Short descriptions are cheaper but less precise, and long descriptions are more precise but more expensive — for most single-domain skills, the right balance lands at 80-120 characters, which costs 20-30 tokens and is specific enough to distinguish the skill from adjacent ones without wasting context budget (AEM internal benchmark across 14 production skill libraries, 2025).

Three techniques for trimming descriptions without degrading trigger accuracy:

Remove redundant scope qualifiers — "Use when the user asks me to review a pull request or review code changes in a branch or examine a diff" → "Use when the user asks to review a pull request, code diff, or branch changes." Same trigger coverage. 30% fewer characters.
Lead with the action verb, not the context setup — "When working on content for LinkedIn and the user wants to create a post or draft social copy" → "Use for drafting LinkedIn posts and social copy." The context setup ("when working on") is implied. Remove it.
Replace lists with category names — "Use for Python, JavaScript, TypeScript, Go, and Rust code formatting" → "Use for code formatting in any language." Unless you need to specifically exclude certain languages, the general category covers the same territory with fewer tokens.

The goal is descriptions that are as short as possible while remaining unambiguous about when to fire and when not to.

"When you give a model an explicit output format with examples, consistency goes from ~60% to over 95% in our benchmarks." — Addy Osmani, Engineering Director, Google Chrome (2024)

The same precision principle applies to trigger conditions. An explicit, specific description produces consistent activation. A vague or overly broad description produces erratic triggering, regardless of token count.

One calibration note for precise budgeting: Anthropic's model specifications show that Claude's 200,000-token context window corresponds to approximately 680,000 unicode characters — a ratio of 3.4 characters per token, slightly tighter than the 4-character rule of thumb and worth using if you are trimming descriptions to a specific token ceiling (Anthropic, 2025).

For a complete guide to writing descriptions that balance trigger precision and token efficiency, see The SKILL.md Description Field: The One Line That Makes or Breaks Your Skill.

When does the description index become a problem?

At standard library sizes under 50 skills, the description index is not a meaningful constraint: 50 skills at 50 tokens each totals 2,500 tokens, which is roughly the size of a short document and represents just 1.25% of the available context window in Claude Sonnet (Anthropic, 2024).

The description index becomes a constraint in two specific scenarios:

Scenario 1: Unusually long descriptions

If descriptions consistently run 400+ characters (100+ tokens each), a 20-skill library's index grows to 2,000+ tokens. The per-description cost is still low, but it accumulates. In AEM's audit of over-engineered skill libraries, descriptions exceeding 350 characters showed no measurable improvement in trigger accuracy over 120-character equivalents — the additional tokens are overhead, not signal (AEM internal analysis, 2025). Descriptions this long usually indicate content that should be in the SKILL.md body, not the description.

Scenario 2: Very large skill libraries

At 200+ skills, the description index approaches 10,000-15,000 tokens. Still workable, but now a meaningful portion of your startup overhead. At this scale, consider whether all skills need to be in the default .claude/skills/ directory, or whether some should be project-specific and only loaded in relevant contexts.

For most teams, neither scenario applies. The description index is the free lunch of the progressive disclosure architecture.

For a full overview of how all three loading tiers interact, see What is Progressive Disclosure in Claude Code Skills?.

Frequently asked questions

For most libraries under 50 skills, description token costs are not a meaningful constraint: the index runs 250-2,500 tokens at roughly 100 tokens per skill, consumes under 1.25% of Claude Sonnet's 200,000-token context window, and adds no measurable startup latency in libraries up to 100 skills. The threshold that matters is description length — descriptions under 60 characters risk ambiguous triggering, while descriptions over 300 characters start displacing higher-value context.

Is there a way to see exactly how many tokens my description index is using? Claude Code doesn't expose a real-time token counter in the interface. For an estimate, count the total characters in all your description fields and divide by 4. That gives you the approximate token count for your index.

Does having more skills mean slower response times at startup? Slightly. Claude reads more files at session start as your library grows. For libraries under 50 skills, the latency difference is imperceptible. The token cost matters more than the file-read latency at this scale.

If I make my descriptions shorter, will it affect how well Claude matches them? Yes, if you shorten them below the threshold of specificity needed for reliable matching. A description needs enough detail to distinguish the skill from adjacent skills. The minimum viable length is the shortest description that makes the trigger unambiguous. For most single-domain skills, that's 60-100 characters.

Are skill names also loaded at startup, or just descriptions? Skill names (the skill field or the name used to invoke the skill) are part of the startup index. Names are typically short (2-5 tokens each) and add negligible overhead.

Does the token cost of descriptions grow if I have the same skill installed in multiple projects? Each project's .claude/skills/ directory is loaded independently for that project's sessions. If you have 5 skills installed globally and 3 more in a project's local directory, that session loads 8 descriptions. There's no cross-project accumulation.

Last updated: 2026-04-14