What Is a Learnings File in a Claude Code Skill?
A learnings file is a plain-text reference file inside a Claude Code skill folder that accumulates behavioral corrections from real production runs. It tells the skill how to handle patterns that the original SKILL.md instructions did not anticipate. Claude reads it every time the skill triggers and applies its contents alongside the main skill instructions. In AEM's skill architecture, the learnings file is the self-improvement mechanism that closes the gap between a skill's first deployment and its production-ready behavior.
TL;DR: learnings.md stores behavioral corrections from real runs, written as direct statements the skill applies on every execution. It lives inside the skill folder and is loaded as a reference file when the skill triggers. Entries describe how Claude's behavior should change across all inputs, not exceptions for specific entities.
The learnings file is not a log. It does not record what happened. It records what should happen differently going forward.
What Goes in a Learnings File?
Behavioral corrections from real production runs go in the learnings file: observed patterns where Claude's output deviated from the intended behavior, such as collapsing a list into prose or mismatching a client's formality register, written as direct instructions that change how the skill behaves on the next run. Each entry pairs a trigger condition with a corrected response, so the skill closes specific gaps rather than restating general rules.
A behavioral correction has a specific structure: a pattern that appeared in real use, and the correct behavior to replace it. Written as a direct instruction:
- "When the input contains a numbered list, preserve the list format. Do not collapse it to prose."
- "When the user's draft uses British English spelling, match that convention throughout the output."
- "When the input includes a deadline date, include it prominently in the output header, not buried in the body."
Each entry answers two implicit questions: when does this apply, and what should happen. Entries that cannot answer both questions are not ready to go in the file.
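As a sketch of the file shape, a minimal learnings.md built from the entries above might look like this (the entries are illustrative, reusing the examples in this section, not taken from a real skill):

```markdown
# Learnings

- When the input contains a numbered list, preserve the list format.
  Do not collapse it to prose.
- When the user's draft uses British English spelling, match that
  convention throughout the output.
- When the input includes a deadline date, include it prominently in
  the output header, not buried in the body.
```

Each line is self-contained: a trigger condition followed by the corrected behavior, nothing else.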
What does NOT go in the learnings file:
- Facts about specific clients or entities (those belong in edge-cases.md)
- Rules that should always apply regardless of input (those belong in SKILL.md directly)
- One-off observations that have not been reinforced by more than one run
- Aspirational notes about what the skill should eventually do (not corrections from what it did)
"The single biggest predictor of whether an agent works reliably is whether the instructions are written as a closed spec, not an open suggestion." — Boris Cherny, TypeScript compiler team, Anthropic (2024)
The learnings file extends the closed spec with what real-world use reveals. A SKILL.md that covers every case from day one is rare. The learnings file is how the spec closes its gaps over time. Research at NAACL 2024 found that role-play prompting, where the model is given a precise behavioral role rather than an open instruction, raised accuracy on the AQuA reasoning benchmark from 53.5% to 63.8% and on Last Letter from 23.8% to 84.2% (Kong et al., NAACL 2024). The specificity of the instruction is the variable that moves the needle.
How Is Learnings.md Different from Edge-Cases.md?
The distinction is behavioral pattern versus factual exception: learnings entries describe how the skill should behave across all inputs when a specific trigger condition appears, while edge-cases entries describe exceptions tied to one named entity, client, or format. Routing a correction to the wrong file is the most common mistake and the hardest to diagnose.
Learnings file entries describe how the skill should behave across all inputs when a specific pattern appears. They apply universally when the trigger condition is present.
Edge-cases file entries describe exceptions for specific named entities: clients, formats, or contexts. They apply only when that specific entity appears in the input.
| Type | Example | File |
|---|---|---|
| Behavioral pattern | "When input uses numbered lists, preserve the format" | learnings.md |
| Factual exception | "Client Orinoco uses EUR billing, not GBP" | edge-cases.md |
| Universal rule | "Always include a summary section at the top" | SKILL.md |
Confusing these three destinations is the most common learnings-file mistake. A client-specific exception in learnings.md gets applied to all inputs, not just that client's. A behavioral pattern in edge-cases.md gets loaded only when the matching entity appears, which removes it from the universal application it needs. A 2025 benchmarking study on AI-assisted industrial programming found that prompt structure had a greater influence on output correctness than the choice of model, with structured prompts producing a twofold increase in safety compliance over unstructured baselines (ScienceDirect, 2025). Routing corrections to the right file is that structural decision.
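To make the routing concrete, here is one hypothetical correction of each type placed in its destination file (the file names follow the table above; the entries themselves are invented for this sketch):

```markdown
<!-- learnings.md — behavioral pattern, applies whenever the trigger appears -->
- When the input uses numbered lists, preserve the list format in the output.

<!-- edge-cases.md — factual exception, applies only when this entity appears -->
- Client Orinoco uses EUR billing, not GBP.

<!-- SKILL.md — universal rule, applies to every run unconditionally -->
Always include a summary section at the top of the output.
```

Reading each entry against the two questions from earlier, when does this apply and what should happen, makes the destination obvious: a trigger condition that can appear in any input goes to learnings.md, a named entity goes to edge-cases.md, and no condition at all means SKILL.md.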
What Should Each Entry Look Like?
Each entry is a direct instruction, not a description of what went wrong: it names a trigger condition and states the correct behavior when that condition appears, written in present-tense imperative form so Claude can apply it immediately on the next run without interpreting what the author intended. A log entry records history, and history gives Claude nothing to act on. A rule ("When input contains a numbered list, preserve the format") changes behavior on every subsequent run. That distinction is the difference between a log and a spec.
Bad format: "Claude collapsed the list into prose in the run from 2026-03-12."
Good format: "When the input contains a numbered list, preserve the list format in the output."

Bad format: "User prefers shorter summaries."
Good format: "Keep summary sections to 3 sentences maximum. One sentence for context, one for the key finding, one for the recommended action."
The good format tells Claude exactly what to do when the condition arises. The bad format describes history, which Claude cannot act on.
Aim for specificity at what the specificity-injection framework calls V3 or V4: not "use the right tone" but "match the formality register of the input, formal for business documents, direct for technical requests."
"When you give a model an explicit output format with examples, consistency goes from ~60% to over 95% in our benchmarks." — Addy Osmani, Engineering Director, Google Chrome (2024)
In our commissions, entries that survive more than two consolidation passes are the ones with this level of specificity. Vague entries get pruned because they cannot be consolidated without losing meaning.
How Large Can a Learnings File Get?
Hard cap: 100 lines, with consolidation triggered at 80. Claude reads reference files under real context-window conditions, where instructions placed in the middle of long inputs receive measurably less attention than those near the start; a 120-line learnings file puts roughly half its entries in the low-attention zone. The cap is not a style preference. The middle of an overlong file is where entries go to be ignored.
The research behind this is specific. Nelson Liu et al. at Stanford NLP Group measured a 30%+ accuracy drop on multi-document question answering when the answer document moved from position 1 to position 10 in a 20-document context (ArXiv 2307.03172, 2023). A 2025 study testing 18 frontier models including GPT-4.1, Claude, and Gemini confirmed the pattern holds across every model tested: accuracy degrades as input length grows (Chroma, 2025).
The consolidation process at 80 lines:
- Read all entries and identify themes (3-5 entries grouped around the same topic)
- Write one consolidated instruction that captures the essential signal from each group
- Delete the individual entries, keep the consolidated version
- Leave entries intact that describe rare, high-stakes situations with no related entries to group
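The consolidation step above can be illustrated with three hypothetical entries grouped around the same theme collapsing into one (the entries are invented for this sketch):

```markdown
<!-- Before: three related entries on the same theme -->
- When input contains a numbered list, keep it as a numbered list.
- When input contains a bulleted list, do not merge bullets into a paragraph.
- When input contains a table, reproduce the table rather than summarizing it.

<!-- After: one consolidated instruction replacing all three -->
- When the input contains structured formatting (numbered lists, bullets,
  tables), preserve that structure in the output. Do not collapse it to prose.
```

The consolidated version trades three lines for two while keeping the trigger condition broad enough to cover every case the originals covered.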
After consolidation, a well-maintained file sits at 40-60 entries covering the full range of observed failure modes. That level is both comprehensive enough to be useful and short enough for Claude to apply throughout.
How Does Claude Read the Learnings File?
The learnings file is a reference file loaded when the skill triggers, not at session startup. It affects what Claude does after a skill is selected rather than which skill gets selected, and its entries take effect only when the SKILL.md instructions explicitly tell Claude to read the file. Without that instruction, the file sits unread in the folder.
This means learnings entries do not affect Claude's trigger decision. A learnings entry cannot change which skill gets activated; it can only change how the active skill behaves once running.
The skill's SKILL.md instructions must explicitly reference the learnings file:
Before producing output, read learnings.md and edge-cases.md. Apply all entries to this run.
Without this instruction, Claude treats the learnings file as an inert file in the folder. The reference instruction is what tells Claude to load it and apply its contents. Anthropic's engineering team reported that LLMs progressed from 40% to over 80% on the SWE-Bench Verified software engineering benchmark in a single year, with much of the gain traceable to more precise task specifications and structured instruction loading (Anthropic Engineering, 2025). The same specificity principle scales down to individual skill instructions: a 650-trial experiment testing Claude skill descriptions found that directive-style instructions achieved 100% activation against 77% for passive default descriptions, a 20x difference in activation odds (Ivan Seleznov, cited in Marc Bara, Medium 2026).
For more on how reference files load in the progressive disclosure architecture, see How Are Reference Files Loaded on Demand.
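A minimal SKILL.md sketch showing where the reference instruction sits (the frontmatter values and step list are placeholders invented for this example; the explicit read instruction is the only load-bearing part):

```markdown
---
name: client-report
description: Drafts client status reports from meeting notes.
---

## Instructions

1. Read the input notes.
2. Before producing output, read learnings.md and edge-cases.md.
   Apply all entries to this run.
3. Draft the report following the output structure below.
```

Placing the read instruction before the output step, rather than at the end of the file, keeps it out of the low-attention middle zone discussed in the previous section.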
How Do I Add Entries to the Learnings File?
Entries come from the feedback gate at the end of each skill run: when a run produces imperfect output, the gate prompts a routing decision about which file the correction belongs in, and the entry gets written before the session closes while the observation is still precise enough to be actionable.
When a run produces imperfect output and the user answers the gate questions, the skill designer (or Claude, with explicit routing instructions) identifies which entries belong in learnings versus edge-cases versus SKILL.md.
The entry gets written before the session closes. Not in batch at the end of the week. Not mentally noted for later. Session end is when the observation is freshest and the routing decision is clearest.
The file starts with zero entries and grows with every real production run that reveals a gap. By the fourth or fifth week of daily use, a maintained learnings file has enough entries to visibly change the skill's behavior on new inputs. Cleanlab's 2025 survey of 1,837 AI practitioners found that 62% of production teams cited improving observability and feedback loops as their most urgent investment area, ahead of model performance or cost reduction (Cleanlab, AI Agents in Production 2025).
What Else Do Skill Builders Ask About Learnings Files?
How many entries does a learnings file need before it starts helping?
Three to five specific entries is enough to change behavior on the most common failure modes. A learnings file with two well-written entries outperforms a 30-entry file full of vague observations. Quality of entries matters more than quantity.
Should I write learnings entries during development or only during production use?
During production use only. Development-phase corrections belong in SKILL.md directly. The learnings file captures patterns that emerge from real inputs you did not design for, not anticipated edge cases you can handle during design.
Can learnings entries conflict with SKILL.md instructions?
Yes, and when they do, SKILL.md instructions take precedence. A learnings entry is an addendum to the spec, not an override. If an entry consistently conflicts with a SKILL.md rule, the correct fix is to update SKILL.md and remove the entry, not to keep both.
When do I need to add a date to each learnings entry?
Never. Dates in learnings entries add visual noise without functional benefit. The learnings file is not an audit log. If you need to track when a pattern was first observed, keep a separate changelog. The learnings file should contain only what Claude needs to act on.
How do I share behavioral patterns across multiple skills?
Each skill has its own learnings file because the behavioral patterns are specific to that skill's task. If two skills share a common behavioral pattern (such as "preserve list formats"), add it to both files or promote it to a shared reference file that both skills load explicitly. Sharing a single learnings file directly between skills breaks the per-skill specificity that makes entries actionable.
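A sketch of the promotion pattern: one shared reference file that both skills load explicitly (the file name and relative path are hypothetical, and this assumes the shared file is reachable from both skill folders):

```markdown
<!-- shared/formatting-conventions.md -->
- When the input contains a numbered list, preserve the list format
  in the output.

<!-- In each SKILL.md that needs the shared pattern -->
Before producing output, read ../shared/formatting-conventions.md
and apply its entries alongside this skill's own learnings.md.
```

Each skill still keeps its own learnings.md for task-specific corrections; only patterns that have been independently reinforced in both skills earn promotion to the shared file.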
What happens when I update SKILL.md and the learnings file has conflicting entries?
Audit the learnings file after every SKILL.md update. Entries that duplicate what is now in SKILL.md should be deleted: they add no information and contribute to file length without benefit. Entries that now contradict SKILL.md should be deleted or rewritten. The learnings file and SKILL.md should tell a coherent story, not argue with each other.
How do I know when an entry has been reinforced enough to stay?
An entry has been reinforced when the same pattern appeared in two or more independent runs without being contradicted. One run is anecdote. Two runs with the same observation across different inputs is signal worth keeping.
Last updated: 2026-04-16