CLAUDE.md is always-on context. It loads at the start of every Claude Code session and stays in the background for everything that session does. SKILL.md is an invokable procedure. It loads when you invoke it by slash command, or when Claude detects that its description matches your current intent, and then it runs.

Mixing these up creates predictable failures. Multi-step workflows in CLAUDE.md get applied inconsistently because Claude tries to follow them in every context, not just when they are relevant. Standing project rules in SKILL.md get ignored because they are not loaded until someone explicitly invokes the skill.

At Agent Engineer Master (AEM), CLAUDE.md versus SKILL.md placement is the first architecture question in every commission. Getting it right at the start prevents a category of failure that cannot be fixed by better instructions.

TL;DR: Put always-true project context in CLAUDE.md: stack conventions, behavioral defaults, file pointers. Put specific, invokable procedures in SKILL.md: multi-step workflows, output-constrained tasks, domain processes. CLAUDE.md loads every session and costs attention whether you need it or not, so misplaced instructions there degrade every interaction. SKILL.md loads only when triggered, so its cost is paid only when the procedure is relevant.

What does CLAUDE.md do?

CLAUDE.md is a persistent context file that Claude Code loads automatically at the start of every session. It contains project identity, standing constraints, conventions, and any context Claude needs to operate correctly in your repository without being told explicitly in each conversation.

CLAUDE.md loads hierarchically. Claude reads the user-level CLAUDE.md under your home directory (~/.claude/CLAUDE.md), then the one at your project root, then any CLAUDE.md files in subdirectories that are relevant to the current working context. Each layer can add to or override the layer above it. A monorepo can have CLAUDE.md files in each package directory with package-specific conventions that override root-level defaults (source: Claude Code documentation, 2026).
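The layering order can be sketched in a few lines of Python. This is an illustration of the order described above, not Claude Code's actual loader: the function name `claude_md_chain` and its parameters are hypothetical, and the user-level file path is passed in by the caller because its location is an assumption about your setup.

```python
from pathlib import PurePosixPath


def claude_md_chain(user_file: str, project_root: str, cwd: str) -> list[str]:
    """Return candidate CLAUDE.md paths, broadest layer first.

    Hypothetical helper: it only computes the order of layers and does
    not check whether any of the files exist.
    """
    root = PurePosixPath(project_root)
    # relative_to raises ValueError if cwd is not inside the project root.
    sub_parts = PurePosixPath(cwd).relative_to(root).parts

    chain = [PurePosixPath(user_file), root / "CLAUDE.md"]
    layer = root
    for part in sub_parts:
        # Each directory between the root and cwd may carry its own layer.
        layer = layer / part
        chain.append(layer / "CLAUDE.md")
    return [str(path) for path in chain]


for path in claude_md_chain("/home/dev/.claude/CLAUDE.md", "/repo", "/repo/packages/api"):
    print(path)
```

Later entries in the list correspond to more specific layers, which is where package-level conventions would override root-level defaults.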

The content that belongs in CLAUDE.md is the content that is true all the time, regardless of what task you are doing:

  • This project uses TypeScript 5.x with strict mode.
  • The backend is Express on Node 20; do not suggest newer APIs.
  • Always write tests alongside new functions.
  • The database schema is in db/schema.prisma — read it before any database question.

These instructions are relevant to every Claude Code interaction in this repository. They belong in the background, not in a procedure.

What does SKILL.md do?

A SKILL.md file is an invokable procedure: a named specification with process steps, an output contract, and a trigger condition. It loads only when you invoke it or when Claude detects the intent match. When you are not running the skill, it does not consume attention. When you are, it defines exactly what Claude does.

The SKILL.md format has three required components:

  1. A name field in the frontmatter, which becomes the slash command.
  2. A description field that controls automatic triggering. This field has a 1,024-character limit and must stay on a single line (source: Claude Code documentation, 2026).
  3. Numbered process steps that Claude follows in sequence.

A SKILL.md without process steps is not a skill. It is a context statement masquerading as a procedure. The process steps are what make it invokable and what make the output predictable. The official Claude Code documentation recommends keeping SKILL.md under 500 lines and moving detailed reference material to separate supporting files (source: Claude Code skills documentation, code.claude.com/docs/en/skills, 2026).
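As a rough way to check a file against the three components and the size limits above, a small linter might look like the following. It is a minimal sketch: `split_frontmatter` and `check_skill` are hypothetical names, the frontmatter parser only understands flat `key: value` lines, and the thresholds are the ones quoted in this section.

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # description limit quoted in the text
MAX_SKILL_LINES = 500         # recommended ceiling quoted in the text


def split_frontmatter(text: str) -> tuple[dict[str, str], list[str]]:
    """Split a SKILL.md into flat 'key: value' frontmatter and body lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, lines
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            fields = {}
            for line in lines[1:i]:
                key, sep, value = line.partition(":")
                # Ignore indented continuation lines; only top-level keys count.
                if sep and not line[0].isspace():
                    fields[key.strip()] = value.strip()
            return fields, lines[i + 1:]
    return {}, lines  # frontmatter was never closed


def check_skill(text: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    fields, body = split_frontmatter(text)
    problems = []
    if not fields.get("name"):
        problems.append("missing name")
    description = fields.get("description", "")
    # A leading '>' or '|' means a multi-line YAML block, not a single line.
    if not description or description[0] in ">|":
        problems.append("description missing or not on a single line")
    elif len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description exceeds 1,024 characters")
    if not any(re.match(r"\s*\d+\.\s+\S", line) for line in body):
        problems.append("no numbered process steps")
    if len(text.splitlines()) > MAX_SKILL_LINES:
        problems.append("longer than 500 lines")
    return problems


example = """---
name: review-pr
description: Review a pull request for failing tests and missing migrations.
---
1. Read the diff.
2. Check for schema changes.
3. Report blockers.
"""
print(check_skill(example))  # prints []
```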

"When you give a model an explicit output format with examples, consistency goes from ~60% to over 95% in our benchmarks." - Addy Osmani, Engineering Director, Google Chrome (2024)

This is why the output contract in a SKILL.md is not optional. Consistent output does not come from the model's goodwill. It comes from an explicit specification of what to produce. For more on what goes inside a SKILL.md, see "What Is a Claude Code Skill?"

How do they load differently?

CLAUDE.md loads automatically, silently, and always. Every session in that project has those instructions in context from the first token. There is no invocation, no slash command, and no trigger condition. It is present before you type a single character. CLAUDE.md is the foundation Claude Code stands on.

SKILL.md loads on demand. Explicitly: you type /skill-name. Automatically: Claude detects an intent match via the description field. If neither condition is met, the SKILL.md is not loaded and its instructions have no effect on the session.

This loading difference determines what each file should contain. CLAUDE.md is read before any user input is processed. SKILL.md is read when a specific task begins. Putting a multi-step workflow in CLAUDE.md is like printing your deployment checklist on the office kitchen wall. The information is present. Nobody follows it systematically when it matters.

Each installed SKILL.md adds approximately 100 tokens to Claude's startup context budget, even before it is invoked (source: AEM research synthesis, 2026). This is the cost of each skill being available for automatic triggering; the full procedure is read only when the skill is triggered. CLAUDE.md's token cost is paid in full at session start and applies to every interaction. A typical project CLAUDE.md consumes around 1,800 tokens at startup, which is why the official Claude Code documentation recommends keeping it under 200 lines and moving reference content to skills or path-scoped rules (source: Claude Code documentation, code.claude.com/docs/en/context-window, 2026). Across 2,303 real-world Claude Code repositories, the median CLAUDE.md is 485 words, putting actual session startup cost for most projects closer to 650 tokens than to the 1,800-token figure (source: Chatlatanagulchai et al., "Agent READMEs: An Empirical Study of Context Files for Agentic Coding," arXiv 2511.12884, 2025). The tradeoff: CLAUDE.md is always available and always paid for; a skill's full instructions cost tokens only when triggered.
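The arithmetic behind these figures can be made explicit. The sketch below is a back-of-the-envelope estimate, not a tokenizer: the words-to-tokens ratio is simply the one implied by the 485-word and 650-token figures above, the per-skill cost is the approximate 100 tokens quoted above, and `startup_tokens` is a hypothetical helper name.

```python
TOKENS_PER_WORD = 650 / 485  # ratio implied by the figures in the text (about 1.34)
TOKENS_PER_SKILL = 100       # approximate per-skill startup cost from the text


def startup_tokens(claude_md_words: int, skill_count: int) -> int:
    """Estimate the startup context cost of one CLAUDE.md plus available skills."""
    claude_md_tokens = claude_md_words * TOKENS_PER_WORD
    return round(claude_md_tokens + skill_count * TOKENS_PER_SKILL)


print(startup_tokens(485, 0))   # 650: the median CLAUDE.md on its own
print(startup_tokens(485, 12))  # 1850: the same file plus twelve available skills
```

Under these assumptions, a dozen available skills cost more at startup than the median CLAUDE.md itself, so the number of installed skills is worth tracking alongside CLAUDE.md length.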

What belongs in CLAUDE.md vs SKILL.md?

CLAUDE.md holds context that is true for every task in the project: stack identity, behavioral defaults, file pointers, and standing permissions. SKILL.md holds procedures that apply to specific tasks: multi-step workflows, output-constrained processes, and domain-specific steps that only matter when the relevant task is running.

CLAUDE.md contains:

  • Project identity: what the codebase is, what stack it uses, what constraints are absolute.
  • File pointers: where schemas live, which files to read before certain question types.
  • Behavioral defaults: "always write tests for new functions," "never use any type."
  • Team conventions: naming conventions, PR format preferences, branch naming.
  • Permissions and restrictions: what Claude is and is not allowed to do in this project.

SKILL.md contains:

  • Multi-step workflows: review-pr, deploy-staging, generate-changelog.
  • Output-constrained procedures: tasks where the format of the result must match a template or schema.
  • Domain-specific processes: steps that require loading reference files, calling specific tools, or applying expertise from a particular domain.
  • Repeated tasks: anything you run more than three times a week and need to produce consistent output.

The clearest signal that instructions belong in SKILL.md and not CLAUDE.md: if you ever need to not follow the instructions for a given session, the instructions should be in a skill. CLAUDE.md is always true. Skills are invoked when relevant.
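The placement rule above can be written down as a small decision function. This is only a sketch of the heuristic as stated in this section: the `Instruction` fields and the `placement` function are hypothetical names, and real instructions will not always classify this cleanly.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    text: str
    applies_to_every_task: bool       # true all the time, for every session
    multi_step: bool = False          # a workflow with ordered steps
    output_constrained: bool = False  # result must match a template or schema


def placement(instruction: Instruction) -> str:
    """Suggest a home for an instruction using the signals from the text."""
    # Workflows and output contracts are procedures, even if run often.
    if instruction.multi_step or instruction.output_constrained:
        return "SKILL.md"
    if instruction.applies_to_every_task:
        return "CLAUDE.md"
    # Anything you sometimes need to skip belongs in a skill.
    return "SKILL.md"


print(placement(Instruction("Use TypeScript strict mode.", applies_to_every_task=True)))
print(placement(Instruction("Generate the changelog.", applies_to_every_task=False, multi_step=True)))
```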

See "When Should I Use a Skill Instead of Writing Instructions in CLAUDE.md?" for the decision framework in detail.

Can the same instruction appear in both files?

Yes. The two files operate at different layers, so a rule and the procedure that checks it can deliberately appear in both. For example:

  • CLAUDE.md: "All PRs require passing tests and a migration plan for schema changes."
  • SKILL.md (review-pr): Step 3 — Check for schema changes. If present, verify a migration file exists in db/migrations/. If absent, flag as a blocker in the output.

The CLAUDE.md rule states the requirement. The SKILL.md procedure enforces it mechanically. This is the strongest pattern: CLAUDE.md sets the standard, SKILL.md implements the check.
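To make "enforces it mechanically" concrete, here is one way the Step 3 check could be expressed in code. It is a sketch under stated assumptions: a schema change is taken to mean that db/schema.prisma appears in the PR's changed files, a migration is taken to mean any changed file under db/migrations/, and `check_schema_migration` is a hypothetical helper, not part of any Claude Code API.

```python
SCHEMA_FILE = "db/schema.prisma"   # schema path named earlier in this article
MIGRATIONS_DIR = "db/migrations/"  # migration directory named in Step 3


def check_schema_migration(changed_files: list[str]) -> list[str]:
    """Return blocker messages for a PR, given the paths it changes."""
    schema_changed = SCHEMA_FILE in changed_files
    has_migration = any(path.startswith(MIGRATIONS_DIR) for path in changed_files)
    if schema_changed and not has_migration:
        return [f"BLOCKER: {SCHEMA_FILE} changed but no migration found in {MIGRATIONS_DIR}"]
    return []


print(check_schema_migration(["db/schema.prisma", "src/user.ts"]))
print(check_schema_migration(["db/schema.prisma", "db/migrations/0007_add_email.sql"]))
```

The first call reports a blocker and the second returns an empty list, which mirrors the "flag as a blocker in the output" wording of the step.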

The limit of this analysis: some instructions work equally well in either file. If you have a single-sentence rule that applies to specific tasks, it can live in CLAUDE.md or in the relevant skill. The overhead is small enough that the placement does not matter much. Placement matters most for multi-step workflows (always in SKILL.md) and for universal project context (always in CLAUDE.md).

See "Should I Put My Instructions in CLAUDE.md or Create a Separate Skill?" for the specific decision.

What happens when CLAUDE.md gets too long?

CLAUDE.md's practical limit is determined by Claude's attention mechanics. Instructions buried deep in a long CLAUDE.md file receive less reliable adherence than instructions near the top. The effective range is roughly the first 200 lines: past that point, adherence degrades measurably, and past 400 lines it drops to levels that make the instructions unreliable in production. This is consistent with a 2023 study led by Stanford researchers, which found that language models use information placed in the middle of long contexts less reliably than information near the beginning or end (source: Liu et al., "Lost in the Middle," arXiv 2307.03172).

At AEM, we have audited CLAUDE.md files up to 800 lines long. The instructions in the first 200 lines are reliably followed. The instructions between lines 400 and 800 are followed about 60% of the time, which is arguably worse than having no instruction at all, because the behavior becomes unpredictable (source: AEM commission audits, 2026). The fix is straightforward: move the rarely triggered procedures out of CLAUDE.md and into SKILL.md files where they are only loaded when relevant.
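A quick length audit can flag files that have drifted past these thresholds. The sketch below uses the 200-line and 400-line figures quoted above as cutoffs; the zone labels and the `audit_claude_md` function are hypothetical, and treating late numbered lines as "procedure steps" is a heuristic, not a guarantee.

```python
import re

RELIABLE_LINES = 200    # "first 200 lines" threshold from the text
UNRELIABLE_LINES = 400  # "past 400 lines" threshold from the text


def audit_claude_md(text: str) -> dict:
    """Report file length, an adherence zone, and numbered steps past line 200."""
    lines = text.splitlines()
    if len(lines) <= RELIABLE_LINES:
        zone = "ok"
    elif len(lines) <= UNRELIABLE_LINES:
        zone = "degraded"
    else:
        zone = "unreliable"
    # Numbered lines deep in the file are likely procedures worth moving to a skill.
    late_steps = [
        number
        for number, line in enumerate(lines, start=1)
        if number > RELIABLE_LINES and re.match(r"\s*\d+\.\s", line)
    ]
    return {"line_count": len(lines), "zone": zone, "late_steps": late_steps}


sample = "\n".join(f"Rule {i}" for i in range(1, 251)) + "\n1. Run the deploy script."
print(audit_claude_md(sample))
```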

Frequently Asked Questions

Most placement errors come from a handful of recurring edge cases: contradictions between the two files, sessions running without any CLAUDE.md present, and skills that write back to CLAUDE.md. The questions below address these specific scenarios with direct answers drawn from AEM commission experience.

Can CLAUDE.md override SKILL.md instructions? CLAUDE.md and SKILL.md are both in context when a skill is active. If they contradict each other, CLAUDE.md instructions loaded at session start typically take precedence over mid-session additions. In practice, contradictions indicate a design problem: the CLAUDE.md rule and the skill were written independently without coordination.

What happens if CLAUDE.md is missing? Claude Code continues to function. Sessions start without the project-level context that CLAUDE.md would have provided. This typically means Claude makes more generic assumptions about the codebase and requires more explanation in each conversation.

Can a skill modify CLAUDE.md? A skill can write to any file it has permission to write to, including CLAUDE.md. This is a valid pattern for skills that update standing project context: for example, a skill that records a new dependency decision in CLAUDE.md after a dependency audit. Use it carefully: unintentional CLAUDE.md modifications affect every subsequent session.

How should I split a long CLAUDE.md into skills? Identify every multi-step procedure in the file. Each one becomes a SKILL.md. Identify any instruction that applies only to specific task types. Each one becomes part of the relevant skill's process steps. What remains in CLAUDE.md should be always-true project context.
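A first pass at this split can be automated. The sketch below treats any run of two or more consecutive numbered lines as a candidate procedure and leaves everything else as context; both the heuristic and the `split_procedures` name are assumptions for illustration, and the output still needs a human pass to name each skill and write its description.

```python
import re

STEP = re.compile(r"\s*\d+\.\s+\S")  # a line that looks like a numbered step


def split_procedures(text: str, min_steps: int = 2) -> tuple[list[str], list[list[str]]]:
    """Separate always-on context lines from runs of numbered steps."""
    context: list[str] = []
    procedures: list[list[str]] = []
    run: list[str] = []
    # The trailing empty sentinel flushes a run that ends the file.
    for line in text.splitlines() + [""]:
        if STEP.match(line):
            run.append(line.strip())
            continue
        if len(run) >= min_steps:
            procedures.append(run)  # candidate SKILL.md process steps
        else:
            context.extend(run)     # a lone numbered line stays as context
        run = []
        context.append(line)
    return context[:-1], procedures  # drop the sentinel


doc = "Use TypeScript strict mode.\n1. Build the image.\n2. Deploy to staging.\nAlways write tests."
context, procedures = split_procedures(doc)
print(context)     # ['Use TypeScript strict mode.', 'Always write tests.']
print(procedures)  # [['1. Build the image.', '2. Deploy to staging.']]
```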

Does CLAUDE.md need frontmatter? No. CLAUDE.md is plain markdown. It does not use YAML frontmatter and does not have a name field. Skills require frontmatter because the name field creates the slash command and the description field enables automatic triggering.


Last updated: 2026-05-05