Setup performance — per-pattern expectancy, honestly.
A 55% overall win rate hides a lot. It can be 80% on pullback-at-support entries and 30% on catalyst plays you keep taking, in roughly equal numbers. The single average blends bets you'd love to make more of with bets you should stop entirely, and you can't tell them apart from the average alone. v7.2's Setup Performance subtab tags every BUY journal entry with one of eight setup types and surfaces per-type win rate, expectancy in R, profit factor, and best/worst ticker, plus sample-size bands so you don't trust a tiny n.
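To make the arithmetic concrete, here is a minimal sketch with made-up trade counts (50 trades in each bucket is an assumption chosen so the blend lands on 55%). It only shows how one average can sit on top of two very different per-setup rates.

```python
# Hypothetical counts: equal-sized buckets with very different hit rates.
buckets = {
    "pullback_at_support": {"trades": 50, "wins": 40},  # 80% win rate
    "catalyst_play": {"trades": 50, "wins": 15},        # 30% win rate
}

def win_rate(wins: int, trades: int) -> float:
    """Fraction of trades that were winners."""
    return wins / trades

# The overall figure pools every trade, so the two buckets blur together.
total_wins = sum(b["wins"] for b in buckets.values())
total_trades = sum(b["trades"] for b in buckets.values())
overall = win_rate(total_wins, total_trades)

# The per-setup view keeps them apart.
per_setup = {name: win_rate(b["wins"], b["trades"]) for name, b in buckets.items()}

print(f"overall: {overall:.0%}")
for name, rate in per_setup.items():
    print(f"{name}: {rate:.0%}")
```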
The eight setup types
Every BUY entry runs through _classifySetup(), a priority-ordered detector. The first matching label wins:
| # | Setup type | Detection |
|---|---|---|
| 1 | bull_flag | Tight 3-8 bar consolidation off a recent breakout, declining volume, no break of the flag's lower edge |
| 2 | catalyst_play | Entry within 1 session of a detected catalyst (earnings beat, M&A, regulatory clearance, scale-insider buy) |
| 3 | earnings_runup | Entry 1-10 sessions before a known earnings date, no other catalyst, IV expansion present |
| 4 | breakout_from_base | Clean break of a 20+ bar consolidation top with volume >1.5× |
| 5 | pullback_at_support | Entry near a structural support level (primary_S from levels module), prior trend up, ADX > 20 |
| 6 | oversold_bounce | RSI < 30 reversal, no other obvious setup ("catching a knife") |
| 7 | trend_continuation | Established uptrend, EMA21 > EMA50 > EMA200, no specific trigger beyond "trend intact" |
| 8 | other | Everything that doesn't fit. The "I just felt like it" bucket |
Priority ordering matters. A trade entered within a session of a detected catalyst that also sits near support gets catalyst_play, not pullback_at_support: the catalyst is the higher-information label, so the outcome is attributed to the catalyst lens. The order is a calibration choice; you can argue with it, but you can't argue both labels apply equally.
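The first-match-wins rule can be sketched as an ordered list of detectors. This is an illustration, not the actual _classifySetup() code: the feature keys (such as `near_primary_support`) and the helper `flag` are hypothetical, and each real detector would do far more than read a boolean.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical boolean features for one BUY entry. The real inputs to
# _classifySetup() are not shown in the text, so these keys are assumptions.
Features = Dict[str, bool]

def flag(key: str) -> Callable[[Features], bool]:
    """Build a detector that fires when the named feature is present and true."""
    return lambda features: features.get(key, False)

# Priority order taken from the table: earlier entries win when several match.
DETECTORS: List[Tuple[str, Callable[[Features], bool]]] = [
    ("bull_flag", flag("tight_flag_after_breakout")),
    ("catalyst_play", flag("catalyst_within_1_session")),
    ("earnings_runup", flag("earnings_in_1_to_10_sessions")),
    ("breakout_from_base", flag("base_breakout_on_volume")),
    ("pullback_at_support", flag("near_primary_support")),
    ("oversold_bounce", flag("rsi_below_30_reversal")),
    ("trend_continuation", flag("ema_stack_uptrend")),
]

def classify_setup(features: Features) -> str:
    """Return the first matching label, or 'other' when nothing matches."""
    for label, detector in DETECTORS:
        if detector(features):
            return label
    return "other"

# Both a catalyst and nearby support are present; the catalyst label wins.
entry = {"catalyst_within_1_session": True, "near_primary_support": True}
print(classify_setup(entry))  # catalyst_play
```

Keeping the order in a single list makes the calibration choice explicit: changing which label wins a tie is a matter of reordering entries.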
The four numbers per type
- Win rate: percentage of closed trades that finish above entry after fees
- Expectancy in R: (win rate × avg winner in R) − ((1 − win rate) × avg loser in R), with the average loser taken as a positive magnitude. The single most important number for any setup. It's what each trade is worth in expectation
- Profit factor: gross winners / gross losers. 2.0+ is great. 1.5-2.0 is solid. Below 1.2 is a setup you're paying for the privilege of taking
- Best / worst ticker: the ticker that contributed the most P&L for that setup, and the one that lost the most. Useful for spotting whether a bad setup is one specific ticker tanking the average or a genuine pattern weakness
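A minimal sketch of how these four numbers could be computed from closed trades follows. The `ClosedTrade` record, its field names, and `setup_stats` are assumptions for illustration; the journal's real schema is not described here. Breakeven trades are counted as non-winners, which is also an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ClosedTrade:  # hypothetical record, not the journal's real schema
    ticker: str
    setup: str
    r_multiple: float  # net P&L divided by initial risk, after fees
    pnl: float         # net P&L in account currency, after fees

def setup_stats(trades: List[ClosedTrade]) -> Dict[str, dict]:
    """Per-setup win rate, expectancy in R, profit factor, best/worst ticker."""
    by_setup = defaultdict(list)
    for t in trades:
        by_setup[t.setup].append(t)

    stats = {}
    for setup, ts in by_setup.items():
        winners = [t for t in ts if t.pnl > 0]
        losers = [t for t in ts if t.pnl <= 0]  # breakeven counts as a non-win
        win_rate = len(winners) / len(ts)
        avg_win_r = sum(t.r_multiple for t in winners) / len(winners) if winners else 0.0
        # Average loser as a positive magnitude, matching the formula above.
        avg_loss_r = sum(abs(t.r_multiple) for t in losers) / len(losers) if losers else 0.0
        expectancy_r = win_rate * avg_win_r - (1 - win_rate) * avg_loss_r

        gross_win = sum(t.pnl for t in winners)
        gross_loss = abs(sum(t.pnl for t in losers))
        profit_factor = gross_win / gross_loss if gross_loss > 0 else float("inf")

        pnl_by_ticker = defaultdict(float)
        for t in ts:
            pnl_by_ticker[t.ticker] += t.pnl

        stats[setup] = {
            "n": len(ts),
            "win_rate": win_rate,
            "expectancy_r": expectancy_r,
            "profit_factor": profit_factor,
            "best_ticker": max(pnl_by_ticker, key=pnl_by_ticker.get),
            "worst_ticker": min(pnl_by_ticker, key=pnl_by_ticker.get),
        }
    return stats

# Made-up sample data for a single setup type.
sample = [
    ClosedTrade("AAPL", "pullback_at_support", 2.0, 200.0),
    ClosedTrade("MSFT", "pullback_at_support", 1.0, 100.0),
    ClosedTrade("NVDA", "pullback_at_support", -1.0, -100.0),
    ClosedTrade("AAPL", "pullback_at_support", 1.0, 100.0),
]
result = setup_stats(sample)["pullback_at_support"]
print(result)
```

When winners and losers are split by the sign of net P&L, the expectancy formula reduces to the plain average R-multiple across all trades in the setup, which is a handy sanity check.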
Sample-size bands
Per-setup math is dangerous below a certain sample size. A "100% win rate on bull flags" from 3 trades is noise, not signal. The subtab gates display by sample size:
| n | Band | Display |
|---|---|---|
| < 5 | — | Hidden entirely. Not useful enough to risk over-weighting |
| 5–14 | developing (amber) | Directional, not conclusive. Use as a hint, not a rule |
| 15–29 | solid (green) | Confident enough for decisions |
| ≥ 30 | established (green) | Statistically meaningful at swing-trading timescales |
The bands are deliberately conservative. Most retail traders over-trust small-sample wins ("I'm 4-0 on this setup, I love it") and over-trust small-sample losses ("0-3, this setup doesn't work for me"). Both are coin-flip data dressed as conclusions. The display rules force you to see "developing" rather than "winning" until you've genuinely paid the cost of running 15+ of that setup.
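The banding rule is simple enough to express as one function. This is a sketch under the thresholds in the table above; the name `sample_band` and the use of `None` for "hidden" are assumptions, not the subtab's actual code.

```python
from typing import Optional

def sample_band(n: int) -> Optional[str]:
    """Map a closed-trade count to its display band; None means hidden."""
    if n < 5:
        return None           # too few trades to show at all
    if n < 15:
        return "developing"   # amber: a hint, not a rule
    if n < 30:
        return "solid"        # green: usable for decisions
    return "established"      # green: 30 or more trades

for n in (3, 5, 14, 15, 29, 30):
    print(n, sample_band(n))
```

A 4-0 record returns `None` here, so it never reaches the screen as a win rate at all.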
Reading the subtab — two patterns
- Monthly review. First weekend of the month, open the subtab. Note any setup type that crossed a sample-size band threshold. Note any that materially changed expectancy. Adjust which setups you're actively scanning for next month accordingly.
- Pre-add-to-watchlist check. If the candidate would be (say) a catalyst play, check your historical catalyst-play expectancy first. If you're 30% on catalyst plays with negative expectancy and the new entry is a catalyst play, that's the framework asking you a hard question before you take the trade.
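The pre-add-to-watchlist check can be sketched as a lookup that returns a note rather than a decision. Everything here is hypothetical: the `stats` layout, the function name `pre_trade_note`, and the choice to stay silent below 5 trades (mirroring the hidden band). It only informs; it does not block anything.

```python
from typing import Dict, Optional

def pre_trade_note(setup: str, stats: Dict[str, dict], min_n: int = 5) -> Optional[str]:
    """Return a warning string when history on this setup is negative, else None."""
    record = stats.get(setup)
    if record is None or record["n"] < min_n:
        return None  # no history, or too little to say anything
    if record["expectancy_r"] >= 0:
        return None  # nothing to flag
    return (
        f"{setup}: {record['win_rate']:.0%} win rate, "
        f"{record['expectancy_r']:+.2f}R expectancy over {record['n']} trades"
    )

# Made-up personal history for two setup types.
history = {
    "catalyst_play": {"n": 20, "win_rate": 0.30, "expectancy_r": -0.3},
    "pullback_at_support": {"n": 24, "win_rate": 0.75, "expectancy_r": 0.6},
}
print(pre_trade_note("catalyst_play", history))
print(pre_trade_note("pullback_at_support", history))
```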
What it explicitly doesn't do
- It doesn't tell you whether the setups are good in absolute terms. Only whether they're good for you, on your sample, in the regimes you've traded. A 70% win rate on pullbacks could mean skill, or it could mean the market favored mean-reversion lately.
- It doesn't auto-veto. If your expectancy on catalyst plays is -0.3R, the framework won't refuse to let you place them. The subtab is information; the trade decision stays yours. Auto-veto on bad-personal-expectancy is roadmap, not shipped — we're not convinced it's the right answer yet.
- It doesn't combine setup type with regime. "Pullback at support in a TRENDING regime" is a different bet from "pullback at support in a CHOPPY regime"; the subtab averages them. Multi-axis breakdown is queued.
- It doesn't handle SELL trades. Short setups and short-side options are classified separately, if at all. The current subtab is BUY-focused.
The real lesson
Most traders' edge is narrower than they think. What looks like "I'm a 55% trader" is usually "I'm a 75% trader on pullbacks who keeps cancelling that edge with catalyst plays I'm 32% on." Setup performance is the framework letting you discover that pattern in your own data — with sample-size guardrails so you don't over-trust noise, with explicit setup-type labels so you can compare apples to apples, with the worst-ticker breakdown so you can see whether a bad setup is one specific ticker or a real weakness. The trader doesn't have to be intuitive about this anymore. The math is just there.
Related: L22 — journal canonical · L33 — performance accounting · setup performance blog post