v7.x ARCHITECTURE · ADVANCED · LESSON 39 / 41 · ~6 min read

Setup performance — per-pattern expectancy, honestly.

A 55% overall win rate hides a lot. It can be 80% on pullback-at-support trades and 30% on the catalyst plays you keep taking: with equal trade counts, (80% + 30%) / 2 = 55%. The single average blends bets you'd love to make more of with bets you should stop entirely, and you can't tell them apart from the average alone. v7.2's Setup Performance subtab tags every BUY journal entry with one of eight setup types and surfaces per-type win rate, expectancy in R, profit factor, and best/worst ticker, plus sample-size bands so you don't trust a tiny n.

The eight setup types

Every BUY entry runs through _classifySetup(), a priority-ordered detector. The first matching label wins:

#	Setup type	Detection
1	bull_flag	Tight 3-8 bar consolidation off a recent breakout, declining volume, no break of the flag's lower edge
2	catalyst_play	Entry within 1 session of a detected catalyst (earnings beat, M&A, regulatory clearance, scale-insider buy)
3	earnings_runup	Entry 1-10 sessions before a known earnings date, no other catalyst, IV expansion present
4	breakout_from_base	Clean break of a 20+ bar consolidation top with volume >1.5×
5	pullback_at_support	Entry near a structural support level (primary_S from levels module), prior trend up, ADX > 20
6	oversold_bounce	RSI < 30 reversal, no other obvious setup ("catching a knife")
7	trend_continuation	Established uptrend, EMA21 > EMA50 > EMA200, no specific trigger beyond "trend intact"
8	other	Everything that doesn't fit. The "I just felt like it" bucket

Priority ordering matters. A trade that is both within one session of a detected catalyst and near support gets catalyst_play, not pullback_at_support: the catalyst is the higher-information label, so the outcome is attributed to the catalyst lens. The order is a calibration choice; you can argue with it, but every trade has to land in exactly one bucket for the per-type numbers to be comparable.
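
The first-match-wins rule can be sketched in a few lines of Python. This is an illustration only, not the framework's _classifySetup(): the function name classify_setup and the idea of passing in the set of labels whose detection rule fired are assumptions made for the example, and the detection rules themselves are not reproduced.

```python
from typing import Iterable

# Priority order copied from the table above. "other" is the fallback,
# so it is not listed as a detector.
PRIORITY = [
    "bull_flag",
    "catalyst_play",
    "earnings_runup",
    "breakout_from_base",
    "pullback_at_support",
    "oversold_bounce",
    "trend_continuation",
]


def classify_setup(fired: Iterable[str]) -> str:
    """Return the highest-priority label among the detectors that fired.

    `fired` is a hypothetical input: the labels whose detection rule
    matched this BUY entry. How each rule is evaluated is out of scope.
    """
    fired_set = set(fired)
    for label in PRIORITY:
        if label in fired_set:
            return label  # first match wins; lower-priority labels are ignored
    return "other"
```

Under this sketch, classify_setup({"pullback_at_support", "catalyst_play"}) returns "catalyst_play", and an entry with no matching detector falls through to "other".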

The four numbers per type

Win rate: percentage of closed trades that finish above entry, net of fees
Expectancy in R: (win rate × avg winner in R) − ((1 − win rate) × avg loser in R), with the average loser expressed as a positive number. The single most important number for any setup, because it's what each trade is worth in expectation. For example, a 40% win rate with 2.5R average winners and 1R average losers is worth (0.40 × 2.5) − (0.60 × 1.0) = +0.4R per trade
Profit factor: gross profit on winners divided by gross loss on losers. 2.0+ is great. 1.5-2.0 is solid. Below 1.2 is a setup you're paying for the privilege of taking
Best / worst ticker: the ticker that contributed the most P&L for that setup, and the one that contributed the least. Useful for spotting whether a bad setup is one specific ticker dragging down the average or a genuine pattern weakness
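
As a rough illustration of how these four numbers fit together, the following Python sketch computes them for the closed trades of a single setup type. The ClosedTrade record, its field names, and the setup_stats function are hypothetical, not the framework's schema. The sketch also assumes r_multiple and pnl always share the same sign, and it treats a breakeven trade as a non-win.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class ClosedTrade(NamedTuple):
    ticker: str
    r_multiple: float  # net result in units of initial risk (assumed field)
    pnl: float         # net-of-fees P&L in account currency (assumed field)


def setup_stats(trades: List[ClosedTrade]) -> Dict[str, object]:
    """Compute win rate, expectancy in R, profit factor, best/worst ticker."""
    if not trades:
        raise ValueError("need at least one closed trade")

    winners = [t for t in trades if t.pnl > 0]   # finished above entry, net of fees
    losers = [t for t in trades if t.pnl <= 0]   # breakeven counts as a non-win

    win_rate = len(winners) / len(trades)
    avg_win_r = sum(t.r_multiple for t in winners) / len(winners) if winners else 0.0
    # Average loser is taken as a positive magnitude, matching the formula above.
    avg_loss_r = sum(abs(t.r_multiple) for t in losers) / len(losers) if losers else 0.0
    expectancy_r = win_rate * avg_win_r - (1 - win_rate) * avg_loss_r

    gross_win = sum(t.pnl for t in winners)
    gross_loss = -sum(t.pnl for t in losers)
    # Undefined when there are no losing dollars yet; reported as None here.
    profit_factor: Optional[float] = gross_win / gross_loss if gross_loss > 0 else None

    pnl_by_ticker: Dict[str, float] = defaultdict(float)
    for t in trades:
        pnl_by_ticker[t.ticker] += t.pnl

    return {
        "n": len(trades),
        "win_rate": win_rate,
        "expectancy_r": expectancy_r,
        "profit_factor": profit_factor,
        "best_ticker": max(pnl_by_ticker, key=pnl_by_ticker.get),
        "worst_ticker": min(pnl_by_ticker, key=pnl_by_ticker.get),
    }
```

Returning None for an undefined profit factor is a design choice of this sketch; a display layer might render it as a dash rather than as infinity.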

Sample-size bands

Per-setup math is dangerous below a certain sample size. A "100% win rate on bull flags" from 3 trades is noise, not signal. The subtab gates display by sample size:

n	Band	Display
< 5	—	Hidden entirely. Not useful enough to risk over-weighting
5–14	developing (amber)	Directional, not conclusive. Use as a hint, not a rule
15–29	solid (green)	Confident-enough for decisions
≥ 30	established (green)	Statistically meaningful at swing-trading timescales

The bands are deliberately conservative. Most retail traders over-trust small-sample wins ("I'm 4-0 on this setup, I love it") and over-trust small-sample losses ("0-3, this setup doesn't work for me"). Both are coin-flip data dressed as conclusions. The display rules force you to see "developing" rather than "winning" until you've genuinely paid the cost of running 15+ of that setup.
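
The gating in the table reduces to a small threshold lookup. The sketch below is a minimal Python version; the function name sample_band and the choice to return None for hidden rows are assumptions for illustration, while the thresholds and band names come from the table.

```python
from typing import Optional


def sample_band(n: int) -> Optional[str]:
    """Map a closed-trade count to its display band, or None if hidden."""
    if n < 5:
        return None           # hidden entirely
    if n < 15:
        return "developing"   # 5-14: a hint, not a rule
    if n < 30:
        return "solid"        # 15-29: confident enough for decisions
    return "established"      # 30 or more
```

With this mapping, a 4-0 streak on a setup returns None and never reaches the screen, which is the point of the gate.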

Reading the subtab — two patterns

Monthly review. First weekend of the month, open the subtab. Note any setup type that crossed a sample-size band threshold. Note any that materially changed expectancy. Adjust which setups you're actively scanning for next month accordingly.
Pre-add-to-watchlist check. If the candidate would be (say) a catalyst play, check your historical catalyst-play expectancy first. If you're 30% on catalyst plays with negative expectancy and the new entry is a catalyst play, that's the framework asking you a hard question before you take the trade.

What it explicitly doesn't do

It doesn't tell you whether the setups are good in absolute terms. Only whether they're good for you, on your sample, in the regimes you've traded. A 70% win rate on pullbacks could mean skill, or it could mean the market favored mean-reversion lately.
It doesn't auto-veto. If your expectancy on catalyst plays is -0.3R, the framework won't refuse to let you place them. The subtab is information; the trade decision stays yours. Auto-veto on bad-personal-expectancy is roadmap, not shipped — we're not convinced it's the right answer yet.
It doesn't combine setup type with regime. "Pullback at support in a TRENDING regime" is a different bet from "pullback at support in a CHOPPY regime"; the subtab averages them. Multi-axis breakdown is queued.
It doesn't handle SELL trades. Short setups and short-side options are classified separately, if at all. The current subtab is BUY-focused.

The real lesson

Most traders' edge is narrower than they think. What looks like "I'm a 55% trader" is usually "I'm a 75% trader on pullbacks who keeps cancelling that edge with catalyst plays I'm 32% on." Setup performance is the framework letting you discover that pattern in your own data — with sample-size guardrails so you don't over-trust noise, with explicit setup-type labels so you can compare apples to apples, with the worst-ticker breakdown so you can see whether a bad setup is one specific ticker or a real weakness. The trader doesn't have to be intuitive about this anymore. The math is just there.