Claude doesn't choose skills randomly. Selection happens at the description layer and follows a mostly predictable logic: the skill whose description best matches the current prompt wins. When Claude picks the wrong skill, the problem is almost always that two descriptions overlap on the exact phrase or intent that triggered the selection.
TL;DR: Trace Claude's skill selection by asking it directly, in the same session: "Which skill description matched this prompt and why?" Compare the descriptions of the skill that should have triggered vs the one that did. Find the overlapping phrase. Fix by adding a negative trigger to the correct skill's description or sharpening the incorrect skill's scope. This takes under 15 minutes for most conflicts.
How Does Claude Actually Select a Skill?
Claude selects skills using description fields as a classifier. At session start, every skill's metadata layer, including the description, loads into the system prompt. When a user prompt arrives, Claude matches it against those descriptions to decide whether any skill should activate.
The mechanism is not keyword-matching. It's semantic similarity, and it's probabilistic. A description that says "Use this skill when the user asks for a code review" will activate when someone says "look at this function and tell me what's wrong," even though no keywords overlap. That semantic range is both the power and the failure mode.
When two skills have semantically similar descriptions, Claude picks the one whose description is the stronger match at that moment. That choice is influenced by word placement, specificity, and imperative vs passive framing. In our builds, a skill with a directive description ("Invoke when the user asks for a blog post review") consistently beats a passive one ("Used for reviewing blog content") on identical prompts.
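As a concrete illustration, here is what directive versus passive framing looks like in SKILL.md frontmatter. The name and description fields are the standard skill metadata; the skill names and wording below are hypothetical, not from a real library:

```yaml
# Two hypothetical SKILL.md frontmatter blocks.

# Directive framing: consistently wins on matching prompts
name: blog-post-review
description: >-
  Invoke when the user asks for a blog post review, a critique of a
  draft article, or feedback on written content quality.

---
# Passive framing: loses on identical prompts
name: blog-content-review
description: Used for reviewing blog content.
```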
How Do You Ask Claude to Explain Its Own Skill Choice?
The fastest diagnostic is to ask Claude directly in the session where the wrong skill fired.
Ask: "Which skill description did you just match to my prompt, and what phrase in the description triggered it?"
Claude will tell you. The answer isn't always precise. The model's self-report of its own classification logic is approximate. But it identifies the match. In practice, Claude will say something like: "I matched your prompt 'review this post' to the 'content-reviewer' skill because the description includes 'reviewing written content for quality.'" That tells you exactly which description is competing with your intended skill.
Follow up: "Which other skills were close candidates for matching that prompt?" Claude will list them. If the skill that should have triggered is on that list, the problem is a tie or near-tie between two descriptions.
What Does a Conflicting Description Pair Look Like?
Here's a pattern common in multi-skill setups:
Skill A (content-reviewer): "Invoke when the user asks to review, critique, or improve written content including blog posts, emails, and social media copy."
Skill B (seo-reviewer): "Use when the user wants to check or improve the SEO performance of written content."
When the user asks "Can you review my blog post for quality?" both descriptions match. "Review written content" matches Skill A. "Review" plus "blog post" matches Skill B semantically, even though SEO wasn't mentioned. The model picks one, and it won't always pick the same one.
The fix is a negative trigger on Skill A: "Invoke when the user asks to review written content. NOT when the request is specifically about SEO performance." The explicit exclusion breaks the tie.
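In frontmatter terms, the before and after for the content-reviewer skill look like this (a sketch; the exact wording of your descriptions will differ):

```yaml
# Before: overlaps seo-reviewer on "review" + "blog post"
name: content-reviewer
description: >-
  Invoke when the user asks to review, critique, or improve written
  content including blog posts, emails, and social media copy.

---
# After: the negative trigger breaks the tie
name: content-reviewer
description: >-
  Invoke when the user asks to review, critique, or improve written
  content including blog posts, emails, and social media copy.
  NOT when the request is specifically about SEO performance,
  keyword analysis, or technical metrics.
```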
For more on negative triggers and how they prevent false positives, see What Are Negative Triggers and Why Should I Include Them in the Description?.
How Do You Reproduce the Conflict Systematically?
If the wrong skill fires inconsistently (some prompts trigger the right one, others the wrong one), follow this three-step reproduction protocol:
Step 1: Run the exact phrase that caused the wrong activation five times in a fresh session. Note how many times each skill fires. If the wrong skill fires three or more times out of five, the description overlap is severe.
Step 2: Identify the phrase in your prompt that caused the wrong activation. Run progressively stripped versions: "review this post" becomes "check this post" becomes "look at this post." Find where the wrong skill stops triggering. That's the semantic boundary where the descriptions stop overlapping.
Step 3: Compare the two competing descriptions against the phrase where the overlap starts. The overlapping region is what needs to be differentiated.
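Here is a minimal harness for Steps 1 and 2. It assumes Claude Code's non-interactive print mode (`claude -p`) and a convention, written into each skill's instructions, that Claude ends its reply with a tag naming the skill it used. Both assumptions are flagged in the comments; adapt the detection logic to however you observe activations:

```python
"""Reproduce a skill-selection conflict by replaying prompts in fresh sessions.

Assumptions (adjust to your setup):
- `claude -p "<prompt>"` runs one non-interactive Claude Code session.
- Each skill's instructions tell Claude to end its reply with a line like
  "[skill: content-reviewer]", so activation is detectable from stdout.
"""
import re
import subprocess
from collections import Counter

SKILL_TAG = re.compile(r"\[skill: ([\w-]+)\]")


def which_skill(prompt: str) -> str:
    """Run one fresh session and report which skill (if any) fired."""
    result = subprocess.run(
        ["claude", "-p", prompt], capture_output=True, text=True, timeout=300
    )
    match = SKILL_TAG.search(result.stdout)
    return match.group(1) if match else "no-skill"


def reproduce(prompt: str, runs: int = 5) -> Counter:
    """Step 1: replay the triggering phrase and tally which skill fires."""
    return Counter(which_skill(prompt) for _ in range(runs))


if __name__ == "__main__":
    # Step 2: progressively stripped variants to locate the semantic boundary.
    for phrase in ["review this post", "check this post", "look at this post"]:
        print(f"{phrase!r}: {dict(reproduce(phrase))}")
```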
"The single biggest predictor of whether an agent works reliably is whether the instructions are written as a closed spec, not an open suggestion." — Boris Cherny, TypeScript compiler team, Anthropic (2024)
How Do You Fix a Description Conflict Without Breaking the Other Skill?
The goal is to differentiate the two skills without narrowing either one so much that legitimate activations stop working. Three approaches in order of preference:
Approach 1: Add a negative trigger to the more general skill. If Skill A is broad ("reviewing written content") and Skill B is specific ("SEO review"), add "NOT SEO, NOT technical metrics, NOT keyword analysis" to Skill A's description. This pushes SEO-adjacent prompts toward Skill B without touching Skill B.
Approach 2: Add a domain-specific qualifier to the narrower skill. If Skill B fires too broadly, make its scope explicit: "Use ONLY when the user explicitly asks about SEO score, keyword density, meta descriptions, or search ranking." The word "ONLY" and the explicit list of conditions prevent semantic drift into general review requests (see the frontmatter sketch after this list).
Approach 3: Test after every change with the reproduction set. After editing either description, re-run the five-prompt test from Step 1. Verify that the correct skill now fires consistently and that the other skill hasn't lost its legitimate activations. A fix that solves the conflict but breaks a different activation is still a bug.
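Approach 2 in frontmatter form, narrowing the seo-reviewer description from the earlier example (wording illustrative):

```yaml
# Narrowed scope: explicit conditions plus "ONLY" keep general review
# requests from drifting into this skill
name: seo-reviewer
description: >-
  Use ONLY when the user explicitly asks about SEO score, keyword
  density, meta descriptions, or search ranking. NOT for general
  quality or style reviews of written content.
```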
For more on how description specificity affects activation rates, see How Do I Write Trigger Phrases That Make My Skill Activate Reliably?.
What If the Conflict Is Between a Custom Skill and Claude's Default Behavior?
Sometimes the wrong "skill" isn't a skill at all. Claude defaults to a general response instead of activating the custom skill. This is not a conflict between two skills. It's the custom skill losing the competition against Claude's default behavior.
The cause is a passive or weak description. A description that says "This skill can be used for..." will lose to Claude's default behavior on many prompts. The default behavior doesn't have a description; it's always present. The custom skill needs to win actively.
Fix: make the description imperative and specific. "Invoke this skill when..." with explicit trigger phrases. In our builds, imperative descriptions achieve 100% activation rates on the target prompts. Passive descriptions sit at 77% on the same test set.
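The same rewrite in frontmatter form, with a hypothetical report-writing skill standing in for yours:

```yaml
# Passive: competes weakly against Claude's default behavior
name: status-reporter
description: This skill can be used for creating weekly status reports.

---
# Imperative: names its triggers and claims the prompt actively
name: status-reporter
description: >-
  Invoke this skill whenever the user asks for a weekly status report,
  a sprint summary, or a team update document.
```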
For more on imperative vs passive descriptions and the activation rate data behind them, see What Did Testing of 650 Activation Trials Reveal About Description Styles?.
Frequently Asked Questions
Is there a way to see which skills Claude is considering before it picks one?
Not directly. There's no built-in tool that shows Claude's candidate list before selection. The closest approach is asking Claude post-selection: "Which other skills were near-candidates for that prompt?" Use the /skills command to verify the full skill list is loaded, then ask Claude to explain its choice after the fact.
Can two skills with completely different names still conflict?
Yes. The skill name isn't used for matching; the description is. Two skills with completely different names but overlapping descriptions will conflict on the overlapping prompts.
How many skills does a library need before conflicts become a real problem?
In our experience, conflicts start appearing reliably above 8-10 skills in a library and become frequent above 15-20. The more skills that cover adjacent domains, the more descriptions need explicit differentiation.
What's the right way to test if I've fixed a conflict?
Create a test set of 5-10 prompts that should trigger each skill. Run the full set in a fresh session. If each prompt activates the correct skill 100% of the time, the conflict is resolved. If there's still variation, the descriptions still overlap on some semantic region.
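One lightweight way to keep that test set versioned is a small manifest mapping each skill to the prompts that must trigger it. The file name and layout here are hypothetical; pair it with the reproduction script above:

```yaml
# conflict-tests.yaml (hypothetical): expected skill -> prompts that must trigger it
content-reviewer:
  - "Review my blog post for quality"
  - "Critique this draft email"
  - "Can you tighten up this LinkedIn post?"
seo-reviewer:
  - "Check the keyword density of this article"
  - "What's hurting this page's search ranking?"
  - "Review the meta description for this post"
```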
Can I force Claude to always use a specific skill?
You can't force it, but you can make the correct skill's description the dominant match by making it imperative, adding explicit trigger phrases, and adding negative triggers to exclude everything else. A description like "Invoke ONLY when the user asks for X, Y, or Z. NOT for A, B, or C" carries strong directional force.
What should I do if the wrong skill fires only on one specific phrase?
That phrase appears in, or sits semantically close to, the wrong skill's description. Add a negative trigger to the wrong skill's description that excludes the phrase, then re-test to confirm the correct skill now activates on it.
Last updated: 2026-04-23