Presenting multiple output options is the highest-impact design decision in Claude Code skill engineering. It transforms a skill from a single-shot generator into a co-creation tool. But presenting 6 options of equal weight without a recommendation isn't co-creation. It outsources the expert judgment the skill was supposed to provide.

TL;DR: Skills that present multiple options without a recommended default create decision paralysis. The user came to the skill for expertise, not a menu. A recommended default (formatted as "Option 1 (recommended)") does the expert judgment work the skill is supposed to do. Without it, users either pick randomly, pick the first item, or abandon the choice entirely.

Why does output variation exist in Claude Code skills?

Output variation is the skill design pattern of presenting 3–5 alternatives at a human decision point instead of committing to a single output. In our skill engineering practice at AEM, we've consistently found output variation to be the highest-impact design decision available when building Claude Code skills: it converts one-shot outputs into a structured conversation between the user and Claude.

The mechanism: instead of "here is your LinkedIn post," the skill outputs "here are 3 LinkedIn posts — a punchy opening, a question-led hook, and a data-led argument. Choose one or say what to change." The user stays in control. The output quality improves because Claude generates across multiple framings simultaneously, rather than committing to the first framing it finds plausible.

Output variation done correctly looks like this:

Option 1 (recommended): [version]
Reasoning: Best fit for the stated goal of X.

Option 2: [version]
Reasoning: Use this if Y matters more than X.

Option 3: [version]
Reasoning: The conservative choice — lower risk, lower differentiation.

Output variation done incorrectly looks like this:

Option 1: [version]
Option 2: [version]
Option 3: [version]
Option 4: [version]
Option 5: [version]
Option 6: [version]

The second format is a menu. The user now has to do the work the skill was supposed to do. Airbnb found that friction reduction, not feature addition, drove 70% weekly AI tool adoption across their engineering teams (Faros AI Productivity Report, 2025). The same dynamic holds inside a skill output: removing the decision burden drives repeat use.

When a user receives undifferentiated output options from a skill, three things happen at roughly equal rates: they pick the first option without reading the others, they ask Claude to choose for them, or they close the output and iterate manually. None of those outcomes represent the structured co-creation the skill was built to enable. All three are waste.

The psychology behind this is well-documented. Hick (1952) and Hyman (1953) established that decision time increases logarithmically with the number of equally likely options: RT = a + b log2(N). Add a sixth undifferentiated option and you extend the user's evaluation window, not their satisfaction. In Iyengar and Lepper's 2000 jam study, shoppers confronted with 24 jam varieties were 10x less likely to buy than shoppers confronted with 6 varieties. The larger set attracted more browsers. It converted far fewer buyers.
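The growth the formula predicts is easy to tabulate. A quick sketch, with illustrative (not empirically fitted) constants a and b:

```python
import math

def hick_hyman_rt(n_options, a=0.2, b=0.15):
    """Estimated choice reaction time in seconds: RT = a + b * log2(N).

    a and b are illustrative constants, not fitted values.
    """
    return a + b * math.log2(n_options)

for n in (2, 3, 6):
    print(f"{n} options: {hick_hyman_rt(n):.2f}s")
```

Doubling the option count adds a constant b seconds of evaluation time, which is why trimming 6 options to 3 buys back a fixed chunk of the user's attention.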

A Claude Code skill isn't a jam display, but the psychological mechanism is identical. The three failure modes in detail:

  • They pick the first one without reading the others. The first option has structural primacy. Users anchor by position, not by quality. The skill generated 5 unnecessary outputs and none of them were evaluated.
  • They ask Claude to choose. "Which one is best?" The skill could have answered that question in its own output. Instead it created a second round-trip to get the recommendation it should have included.
  • They close the output and iterate manually. The skill felt like work rather than help. Users who hit this pattern consistently stop invoking the skill for that task type.

"Developers don't adopt AI tools because they're impressive — they adopt them because they reduce friction on tasks they repeat every day." — Marc Bara, AI product consultant (2024)

A skill that produces 6 equal-weight options adds friction. The expert recommendation removes it.

The recommended default is the skill's expression of judgment: it identifies the best option for the user's stated goal, gives a one-sentence reason, and frames the alternatives so the user knows when to reach for them instead. A well-written default delivers in 30 words the answer that would otherwise cost the user a second round-trip to Claude. Research on labeled defaults supports this: approximately 60% of users interpret a labeled default as an implicit recommendation, even when no explanation is given (Nielsen Norman Group, 2023). With explicit reasoning, that signal becomes unambiguous. Three components make it work:

  1. A label. "Option 1 (recommended)" or "Option 1 [recommended starting point]" placed in the heading, not buried in the description. The user identifies the recommendation at a glance.
  2. A reason. One sentence explaining why this option is recommended over the others. Not a compliment to the option, but a functional distinction: "Best fit for a cold audience who doesn't know your background yet" or "Highest information density, recommended if the reader has 30 seconds, not 3 minutes."
  3. A contrast. The other options need context too. Without it, "recommended" just means "first." With it, "recommended" means "the right fit for your stated constraints." The contrast transforms the option list from a menu into a decision guide.
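The three components map onto a small formatter. This sketch is illustrative; the `Option` type and `format_options` function are hypothetical, not part of any Claude Code API:

```python
from dataclasses import dataclass

@dataclass
class Option:
    version: str        # the drafted output itself
    reasoning: str      # a functional distinction, not a compliment
    recommended: bool = False

def format_options(options):
    """Render options as label + reason, with exactly one recommended default."""
    if sum(o.recommended for o in options) != 1:
        raise ValueError("exactly one option must be recommended")
    blocks = []
    for i, o in enumerate(options, 1):
        label = f"Option {i} (recommended)" if o.recommended else f"Option {i}"
        blocks.append(f"{label}: {o.version}\nReasoning: {o.reasoning}")
    return "\n\n".join(blocks)
```

Raising on zero or multiple recommendations enforces the core rule: a list with no default, or with every row marked "recommended", has done half its job.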

The format itself matters as much as the label. "When you give a model an explicit output format with examples, consistency goes from ~60% to over 95% in our benchmarks." (Addy Osmani, Engineering Director, Google Chrome, 2024). A SKILL.md that specifies the recommended label format is not cosmetic. It is the instruction that produces reliable output structure.

In practice, the instruction in SKILL.md looks like this:

After drafting 3 options:
- Mark the strongest fit for the stated goal as "(recommended)"  
- Add 1-sentence reasoning beneath each option explaining when to use it
- Do not present options that are functionally identical — if two options are the same register, collapse them into one

This is about 40 words of instruction. It changes the entire character of the skill output.
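That instruction also implies a testable output contract. Here is a hedged sketch of a checker a skill author might run against drafts; the function and its heuristics are hypothetical, not part of Claude Code:

```python
import re

def check_output_contract(text):
    """List violations of the recommended-default contract (hypothetical heuristics)."""
    problems = []
    # Capture the body of every "Option N ...:" line.
    bodies = re.findall(r"^Option \d+.*?:\s*(.*)$", text, flags=re.MULTILINE)
    if text.count("(recommended)") != 1:
        problems.append("expected exactly one '(recommended)' label")
    if text.count("Reasoning:") < len(bodies):
        problems.append("every option needs a one-sentence reasoning line")
    if len(set(bodies)) < len(bodies):
        problems.append("functionally identical options should be collapsed")
    return problems
```

The duplicate check only catches verbatim repeats; spotting two options in the same register still takes judgment, which is why the SKILL.md instruction assigns that job to the model.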

What if the options are genuinely equally valid?

"All options are equal" is almost never true. It almost always means "equal on the one dimension I optimized for." Three responses break that framing: cut the option count, add a second differentiating axis, or name the best-fit option explicitly. One limit: when user input is too sparse to distinguish options, any recommendation becomes arbitrary.

  • Present fewer options (or ask a clarifying question first, if the user's stated context is too thin to differentiate). If you cannot meaningfully differentiate 6 outputs, you have generated 4 unnecessary ones. Cut to 3. If you cannot differentiate 3, cut to 2. Procter & Gamble reduced the Head & Shoulders line from 26 products to 15 and saw a 10% sales increase (Iyengar, Columbia Business School, 2010). Fewer, differentiated options outperform more, undifferentiated ones.
  • Reframe the options. Introduce a second dimension: length, formality, risk level, audience familiarity. Options that tie on quality diverge on that second axis.
  • Make the default explicit anyway. Even when options are genuinely close in quality, one fits the user's stated context better than the others. That becomes the recommended default. "These are all strong. Option 2 maps most directly to the tone you described in your brief."

The one wrong answer is "I'll present all of them without a recommendation and let the user decide." That is a skill that has done half its job.

This pattern shows up in the broader anti-patterns of skill design. For the complete catalog of output-related mistakes, see What Are the Most Common Mistakes When Building Claude Code Skills? and The Anti-Patterns Guide: 20 Mistakes That Kill Claude Code Skills. For how reference file structure can compound this problem, see What Happens When Reference Files Chain to Other Reference Files?.


Frequently asked questions

Three output options with a labeled recommended default is the correct format for most Claude Code skills. The recommended default does the expert judgment work; alternatives give the user room to override it. The pattern applies equally to expert users and 2-option skills. The one exception: when user input is too sparse, ask a clarifying question before presenting options.

How many output options is the right number?

Three is the sweet spot for most skills. It gives enough range to cover distinct approaches without triggering choice overload. Five is the maximum before the user stops reading all the options. One is not output variation — it's just an output.

Does the recommended default still help expert users?

Yes. The recommendation is ignorable: an expert user can skip it and read all options with the same evaluation they'd apply without it. A non-expert user cannot get that value without the recommendation. It costs nothing to include it.

Isn't a labeled recommendation just biasing the user?

No. A recommendation with reasoning is not bias; it is transparency. "This is recommended because it matches your stated audience" gives the user the information to accept or reject the recommendation. Presenting options without reasoning gives the user no information about the tradeoffs. The labeled recommendation is more honest, not less.

Should a 2-option skill still include a recommendation?

Yes. Even with 2 options, label which one you'd use given the user's context. "Option 1 (recommended for your stated goal)" takes 5 words. It eliminates the mental work of figuring out which to try first.

What if the user doesn't specify enough context for a recommendation?

The skill should ask for the missing context before presenting options, not present undifferentiated options and hope the user has enough context to evaluate them. A checkpoint that asks one clarifying question (tone? audience? length?) produces better output and a better user experience than 6 untailored options.
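That checkpoint can be sketched as a tiny gap check. The `REQUIRED_CONTEXT` fields and both functions are hypothetical names for illustration, not a Claude Code API:

```python
REQUIRED_CONTEXT = ("tone", "audience", "length")  # hypothetical fields

def missing_context(brief):
    """Return which context fields the user's brief leaves unfilled."""
    return [f for f in REQUIRED_CONTEXT if not brief.get(f)]

def next_step(brief):
    """Ask one clarifying question before drafting, if context is too thin."""
    gaps = missing_context(brief)
    if gaps:
        return f"Ask one clarifying question about: {gaps[0]}"
    return "Draft 3 differentiated options"
```

Asking about one gap at a time keeps the checkpoint cheap: a single question, then differentiated options, rather than 6 untailored guesses.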

How does this interact with irreversible action gates?

For skills that publish, send, or delete content, a recommended default at the final review step prevents the user from accepting the wrong output by accident. The recommendation anchors attention on the right choice. The gate prevents the action until the user explicitly confirms — the combination of recommendation plus gate is the correct pattern for high-stakes outputs.
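The combination can be sketched as a two-phase review step. Function names and return values here are a hypothetical illustration of the pattern, not a Claude Code API:

```python
def review_step(options, recommended_index, confirmed_index=None):
    """Hypothetical recommendation-plus-gate review step.

    Surfaces a recommended default, but performs no irreversible
    action until the user explicitly confirms a specific option.
    """
    if confirmed_index is None:
        label = options[recommended_index]
        return ("await_confirmation",
                f"Recommended: {label}. Reply with an option number to publish.")
    return ("publish", options[confirmed_index])
```

The recommendation and the gate do different jobs: the first anchors attention on the right choice, the second makes the wrong choice impossible to accept by accident.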


Last updated: 2026-04-18