title: "Should I Put My Instructions in CLAUDE.md or Create a Separate Skill?" description: "The token economics, discovery mechanics, and practical decision rules for choosing between CLAUDE.md and a SKILL.md file for any given instruction set." pubDate: "2026-04-13" category: skills tags: ["claude-code-skills", "claude-md", "token-economics", "skill-architecture"] cluster: 2 cluster_name: "Skills vs Agents vs Prompts" difficulty: intermediate source_question: "Should I put my instructions in CLAUDE.md or create a separate skill?" source_ref: "2.Intermediate.4" word_count: 1620 status: draft reviewed: false schema_types: ["Article", "FAQPage"]

Should I Put My Instructions in CLAUDE.md or Create a Separate Skill?

Quick answer: Put instructions in CLAUDE.md when they are always-on constraints that apply to every task in the project. Create a skill when the instructions define a specific, repeatable workflow triggered by name. The decision is not stylistic, it has direct consequences for context window usage, instruction reliability, and team leverage.


Why Does Choosing Between CLAUDE.md and a Skill Actually Matter?

CLAUDE.md and skills are not interchangeable containers for instructions. They have different load mechanics, different scope, and different costs. Putting workflow instructions in CLAUDE.md does not break anything immediately. It degrades performance gradually, and that degradation is difficult to trace back to its source.

The practical consequence: a CLAUDE.md file that has absorbed multiple workflows over time starts competing with itself. Every line of CLAUDE.md competes with every other line for Claude's attention. When a project fact like "use the src/ folder structure" shares context budget with a detailed 12-step release process, both instructions become less reliable (Source: Claude Code context window management, 2026).

A 500-line CLAUDE.md is a project that never got around to building the skills.

Research into how AI models prioritize context suggests that content appearing in the first 30% of a context window receives disproportionate citation weight — estimated at 44.2% of all citations in retrieval-augmented settings (Source: Kevin Indig, Growth Memo, February 2026 (growth-memo.com)). Instructions buried deep in a long CLAUDE.md are structurally disadvantaged.


How Does Token Cost Differ Between CLAUDE.md and a Skill?

Every instruction you put in CLAUDE.md is loaded in full on every session, whether it is relevant to the current task or not. Skills load only metadata at startup and their full content only when triggered. For a 300-line workflow block, this difference can reduce context pressure by an order of magnitude across a working day.

CLAUDE.md loads entirely on every session. Every line, every time, regardless of whether it is relevant to the task in progress. The cost is paid upfront and repeated on every invocation.

Skills load their metadata at startup, roughly 100 tokens per skill (Source: Claude Code context window architecture, 2026), and their full content only when triggered. A /release-notes skill costs 100 tokens at startup on every session and full cost only when you run a release. If you run releases twice a week, the full content loads twice a week. Run code reviews 20 times a day, and the /review skill loads 20 times a day, but only then, and only that skill.

The comparison for a 300-line workflow instruction block:

Approach Load behavior Token cost per session Token cost when task runs
In CLAUDE.md Every session Full 300 lines always Included
As a skill Only on trigger ~100 tokens metadata Full 300 lines once

For workflows that run infrequently, the skill approach reduces context pressure significantly. For always-needed context, CLAUDE.md is appropriate because the instructions apply every session anyway.


When Should I Put Instructions in CLAUDE.md?

CLAUDE.md is the right location for instructions that must be active on every session without exception: always-on project facts, standing rules with no conditional trigger, and context that changes how Claude interprets all other instructions. If the instruction would be wrong to omit from any task, it belongs in CLAUDE.md.

CLAUDE.md is right when the instructions are:

Always-relevant project facts. "This project uses PostgreSQL 16, not MySQL." "The api/ directory contains all external endpoints." "Always use named exports, not default exports." These are facts Claude needs before it can understand any task in the project. Loading them once at startup is correct.

Standing rules with no trigger. "Never commit directly to main." "All PR descriptions must include a test plan section." "Error messages follow the format [MODULE_CODE]: human description." These apply to every piece of work. No trigger needed. No conditional loading.

Context that changes how Claude interprets everything else. The following belong in CLAUDE.md because they are foundational, not task-specific:

  • Project architecture summaries
  • Constraint notes
  • Team protocol summaries

These shape how Claude reads all subsequent instructions.


When Should I Create a Separate Skill?

Create a skill when the instructions define a named, repeatable workflow that only applies during that workflow. Step-by-step process definitions, task-specific output contracts, and instructions that create noise outside their use case all belong in skills. If the instruction would be irrelevant to most sessions, it is a skill, not a CLAUDE.md entry.

Build a skill when the instructions are:

Step-by-step process definitions. Numbered steps, decision trees, output format specifications, anything that reads like a procedure manual belongs in a skill. "When running a release: step 1, step 2, step 3..." is a /release skill, not a CLAUDE.md section.

Task-specific output contracts. If the instructions specify any of the following, this is an output contract:

  • What the output looks like
  • Length and format requirements
  • Required sections or field names

Output contracts belong in skills, which are built around them. An output contract in CLAUDE.md applies to all output, which is the wrong scope.

Expertise that applies only during one workflow phase. The following matter when running the relevant workflow and create noise otherwise:

  • Security review criteria
  • Changelog formatting rules
  • Deploy checklist items

Skills load on demand. CLAUDE.md loads always.

Instructions you want versioned and shared as a unit. A skill is a file in version control. If you want to share a workflow with your team, update it in one place, and track its history in Git, a skill gives you that. Instructions embedded in CLAUDE.md are harder to isolate, share, or review independently.


Can I Use CLAUDE.md and Skills Together?

Yes, and combining them is the recommended production pattern. CLAUDE.md holds the small set of always-on facts and pointers; skills hold every named workflow. Together, they keep your CLAUDE.md under 50 lines for most projects while giving Claude full, reliable instructions for every workflow you run.

CLAUDE.md and skills are most reliable in combination, with clear scope boundaries.

A production pattern:

  • CLAUDE.md holds: tech stack summary (6 lines), folder structure (4 lines), standing rules (8 lines), pointers to skills (5 lines). Total: under 25 lines of actual instructions.
  • Skills hold: every workflow with a name and a trigger.

The CLAUDE.md pointers look like: "For commit messages, use /commit. For PR reviews, use /review-pr. For release notes, use /release-notes." Three lines that tell Claude where to find the full process without embedding the full process in CLAUDE.md.

This architecture keeps CLAUDE.md under 50 lines for most projects. Everything else is skills. Context window pressure stays low. Instructions stay reliable.

"Frontier LLMs can follow approximately 150–200 instructions with reasonable consistency. Beyond that, compliance degrades." — Kyle, HumanLayer (November 2025, https://www.humanlayer.dev/blog/writing-a-good-claude-md)


How Do I Audit My CLAUDE.md for Skill Candidates?

Scan your CLAUDE.md for three signals: blocks with numbered steps, output format specifications, and content that only applies during one specific workflow phase. Each is a skill candidate. What remains after extraction — project facts, standing rules, contextual summaries — is your actual CLAUDE.md. For most established projects, this audit surfaces 3–5 candidates.

Run this diagnostic before deciding whether to split:

  1. Count the lines. Over 150 lines is a signal that workflow instructions have accumulated in CLAUDE.md.
  2. Identify every block that only applies during one specific task. These are skill candidates.
  3. Identify every block with numbered steps. These are skill candidates.
  4. Identify every output format specification. These are skill candidates.
  5. What remains is your actual CLAUDE.md content. Project facts, standing rules, contextual summaries.

For most projects with established workflows, this audit surfaces 3–5 skill candidates. Moving them out of CLAUDE.md produces:

  • A shorter, more reliable CLAUDE.md
  • 3–5 skills that are easier to test, version, and share independently

For a detailed breakdown of what each section of a skill file should contain, see What Goes in a SKILL.md File?.

For the full framework covering skills, agents, prompts, and CLAUDE.md together, see Claude Code Skills vs Agents vs Prompts: When to Use Which.


FAQ

How do I know if my CLAUDE.md is too long?

Over 150 lines is the threshold where performance starts to degrade. Over 300 lines, the file has clearly absorbed workflow instructions that belong in skills. The diagnostic is content, not length: any block with numbered steps, output format specs, or task-specific criteria that only applies during one workflow phase is a candidate for extraction into a skill.

A SKILL.md file should follow: frontmatter (name, description, trigger) > process steps in numbered order > output contract (what the output must contain) > failure modes and constraints > optional reference file pointers. This order puts the process and output contract front-and-center, which is what Claude needs most when executing the skill.

Does moving instructions from CLAUDE.md to a skill change how Claude executes them?

Yes. Instructions in CLAUDE.md apply as standing context to all tasks. Instructions in a skill apply only when the skill is triggered and are scoped to that skill's process. For workflow instructions, this scoping improves reliability: Claude is not carrying 12-step review criteria when writing a commit message. Each instruction set is loaded when relevant and not otherwise.

Can I reference a skill from CLAUDE.md without embedding the full instructions?

Yes, and this is the recommended pattern. A one-line reference like "For commit messages, use /commit" is enough. Claude knows to invoke the skill when that task is needed. The full instructions stay in the skill file, loaded only when the skill runs.

What happens if I never migrate CLAUDE.md workflow instructions to skills?

The project continues to work, but with degraded instruction reliability as CLAUDE.md grows. Each additional line in CLAUDE.md slightly dilutes every other line's weight in the context window. Workflow instructions embedded in CLAUDE.md also become harder to test, version, and share independently. The cost is gradual and invisible until CLAUDE.md is long enough to produce noticeable inconsistency.


Last updated: 2026-04-13