Sample Size: How Many Trades Is Enough?

Why your 10-trade win rate is meaningless

Your strategy has a 90% win rate over 10 trades. Impressive? Not really. With only 10 trades, that win rate tells you almost nothing. A fair coin lands heads at least 9 times in 10 flips about 1% of the time: unusual, but well within the reach of pure chance.

Sample size is one of the most misunderstood concepts in trading validation. Too many traders get excited about strategies based on a handful of signals, only to watch performance regress dramatically as more trades accumulate.

Why Small Samples Deceive

Statistics has a concept called the law of large numbers. The more observations you have, the closer your sample average tends to get to the true underlying average. With few observations, random variation dominates.

Imagine a strategy with a true win rate of 55%. If you run 10 trades, you might observe anywhere from a 30% to an 80% win rate just due to random chance. The range of plausible outcomes is enormous.

Run 100 trades and the observed win rate will cluster more tightly around 55%, maybe ranging from 45% to 65%. Run 1,000 trades and you'll likely observe between 52% and 58%.
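One way to check ranges like these is to compute the exact binomial probability that the win count lands inside a given band. The sketch below is a minimal illustration, assuming independent trades with a fixed true win rate; the function name prob_wins_between is hypothetical, not part of any library.

```python
import math

def prob_wins_between(n, p, k_lo, k_hi):
    """Probability that n independent trades with true win rate p
    produce between k_lo and k_hi wins (inclusive)."""
    total = 0.0
    for k in range(k_lo, k_hi + 1):
        # Work in log space so large trade counts do not overflow or underflow.
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p)
        )
        total += math.exp(log_pmf)
    return total

# Bands from the text, expressed as win counts: 30-80%, 45-65%, 52-58%.
for n, k_lo, k_hi in [(10, 3, 8), (100, 45, 65), (1000, 520, 580)]:
    prob = prob_wins_between(n, 0.55, k_lo, k_hi)
    print(f"{n} trades: P({k_lo} to {k_hi} wins) = {prob:.3f}")
```

Each band captures roughly 95% of outcomes, so the bands narrow with trade count while the coverage stays about the same.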

The point: small samples can show wildly misleading results. A 90% win rate over 10 trades might come from a strategy with a true win rate of 55%. You wouldn't know until you've seen many more trades.
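To put a number on how often luck alone produces a streak like that, here is a small sketch that sums the binomial tail. It assumes independent trades with a constant win rate, and prob_at_least is a hypothetical helper name.

```python
from math import comb

def prob_at_least(wins, trades, win_rate):
    """Probability of seeing at least `wins` winners in `trades` independent
    trades when the true win rate is `win_rate`."""
    return sum(
        comb(trades, k) * win_rate**k * (1 - win_rate) ** (trades - k)
        for k in range(wins, trades + 1)
    )

print(prob_at_least(9, 10, 0.50))  # no-edge coin flip: about 0.011
print(prob_at_least(9, 10, 0.55))  # modest 55% edge: about 0.023
```

A 1-2% chance sounds small, but it is far too large to rule out when the evidence is a single 10-trade run.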

Statistical Significance Basics

Statistical significance answers the question: could this result have happened by chance?

If your strategy shows a 60% win rate, statistical tests ask how likely it is that a strategy with a true 50% win rate (no edge) could have produced those results by luck. If that probability is low, say less than 5%, we call the result statistically significant.

The larger your sample size, the easier it is to detect real effects. A 52% win rate over 50 trades is not statistically different from 50% because random variation could easily produce that result. A 52% win rate over 5,000 trades is statistically significant: a strategy with no edge would produce a result that good less than 1% of the time.
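The comparison above can be reproduced with an exact one-sided binomial test against a 50% win rate. This is a sketch under the assumption that trades are independent; one_sided_p_value is a hypothetical name, and integer arithmetic is used so the calculation stays exact even for thousands of trades.

```python
from math import comb

def one_sided_p_value(wins, trades):
    """Chance that a no-edge strategy (true win rate 50%) produces
    at least `wins` winners in `trades` independent trades."""
    # Count the outcomes with >= wins winners, then divide by all 2**trades outcomes.
    favorable = sum(comb(trades, k) for k in range(wins, trades + 1))
    return favorable / 2**trades

print(one_sided_p_value(26, 50))      # 52% over 50 trades
print(one_sided_p_value(2600, 5000))  # 52% over 5,000 trades
```

The first value is far above the usual 5% threshold, while the second falls well below it, even though the observed win rate is identical.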

This is why sample size matters so much. You need enough trades to distinguish signal from noise.

The 1000+ Trade Rule of Thumb

As a practical guideline, we don't consider a strategy validated until it has at least 1,000 trades in the backtest.

Why 1,000? It's not a magic number, but it's roughly where statistical power becomes meaningful for typical trading strategy performance. At 1,000 trades, a 55% win rate is statistically distinguishable from 50% with reasonable confidence. At 100 trades, it's not.

For strategies with higher expected performance, like 65%+ win rates, you can get meaningful signals from fewer trades, maybe 300-500. For strategies closer to 50% win rate, you need more, possibly several thousand.
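For a rough sense of how the required trade count scales with the size of the edge, the sketch below uses the standard normal-approximation formula for a one-sided test of a proportion against 50%. The 5% significance level and 80% power are assumptions chosen for illustration, and required_trades is a hypothetical name. Stricter settings, or the extra margin a backtest deserves, push the answer higher, so treat these as lower bounds rather than targets.

```python
import math
from statistics import NormalDist

def required_trades(true_win_rate, null_win_rate=0.5, alpha=0.05, power=0.80):
    """Approximate number of independent trades needed to detect
    `true_win_rate` against `null_win_rate` with a one-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # significance threshold
    z_power = NormalDist().inv_cdf(power)      # desired detection probability
    sd_null = math.sqrt(null_win_rate * (1 - null_win_rate))
    sd_true = math.sqrt(true_win_rate * (1 - true_win_rate))
    effect = true_win_rate - null_win_rate
    return math.ceil(((z_alpha * sd_null + z_power * sd_true) / effect) ** 2)

for rate in (0.52, 0.55, 0.65):
    print(f"{rate:.0%} true win rate: about {required_trades(rate)} trades")
```

The key pattern is that halving the edge roughly quadruples the trades needed, which is why strategies near 50% require samples in the thousands.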

The point isn't the specific number but the principle: you need enough samples that random variation can't plausibly explain your results.

Confidence Intervals

Confidence intervals quantify your uncertainty around a measured result.

If your strategy shows a 60% win rate over 500 trades, the 95% confidence interval might be roughly 56% to 64%. This means that if you repeated the test many times and built an interval the same way each time, about 95% of those intervals would contain the true win rate.

Larger sample sizes produce tighter confidence intervals. The same 60% win rate over 2,000 trades might have a 95% confidence interval of 58% to 62%. You're more certain about the true value.
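Intervals like these can be computed directly. The sketch below uses the Wilson score interval, which is one common choice for proportions and behaves better than the simple normal approximation on small samples; it is not necessarily the method behind the figures above. It assumes independent trades, and wilson_interval is a hypothetical name.

```python
import math
from statistics import NormalDist

def wilson_interval(wins, trades, confidence=0.95):
    """Wilson score confidence interval for the true win rate."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = wins / trades
    denom = 1 + z**2 / trades
    center = (p_hat + z**2 / (2 * trades)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / trades + z**2 / (4 * trades**2)
    ) / denom
    return center - half_width, center + half_width

for wins, trades in [(300, 500), (1200, 2000)]:
    low, high = wilson_interval(wins, trades)
    print(f"{wins}/{trades}: {low:.1%} to {high:.1%}")
```

Quadrupling the trade count from 500 to 2,000 halves the width of the interval, since the width shrinks with the square root of the sample size.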

When evaluating strategies, always think about the confidence interval, not just the point estimate. A strategy showing a 70% win rate with a confidence interval of 55% to 85% is much less impressive than it sounds. The true win rate could sit near the bottom of that range, where the strategy might be barely profitable.

Trade Count vs Time Period

A common mistake is conflating trade count with time period. "The strategy worked for 3 years" sounds impressive, but if it only made 40 trades over those 3 years, the sample is tiny.

What matters is trade count, not calendar time. A strategy that produces 50 trades per day has meaningful sample size within a month. A strategy that trades once per week needs years of data to accumulate meaningful samples.
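The arithmetic behind this is simple enough to sketch. The helper below converts a target trade count into calendar time; days_to_target is a hypothetical name, and the 1,000-trade target and the trade frequencies are illustrative assumptions.

```python
def days_to_target(target_trades, trades_per_day):
    """Calendar days needed to accumulate `target_trades`
    at an average rate of `trades_per_day`."""
    return target_trades / trades_per_day

fast = days_to_target(1000, 50)     # 50 trades per day
slow = days_to_target(1000, 1 / 7)  # one trade per week

print(f"50 trades/day: {fast:.0f} days")
print(f"1 trade/week:  {slow / 365:.1f} years")
```

The same target takes about 20 days at 50 trades per day and roughly 19 years at one trade per week.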

This creates a tension. Frequent-trading strategies generate statistical power quickly but face higher transaction costs and execution complexity. Infrequent-trading strategies are simpler to execute but harder to validate.

We navigate this by favoring strategies with moderate trade frequency. Our signals typically generate hundreds of trades per year, enough to accumulate meaningful samples without the operational burden of ultra-high-frequency trading.

When Sample Size Requirements Feel Impossible

You might be thinking: if I need 1,000+ trades to validate a strategy, and my strategy only produces 100 trades per year, I need 10 years of data. That seems unrealistic.

A few approaches help.

Use longer historical periods. Crypto has limited history, but there's often more than you think. Bitcoin futures have been trading since 2017. Many altcoins have 3-4 years of useful data. Design strategies that can use this full history.

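Test across multiple assets. If your strategy concept works for multiple coins, combining their trade history increases your effective sample size. A strategy that produces 100 trades per year per coin produces about 500 trades per year across 5 coins. Keep in mind that coins often move together, so trades taken on different coins at the same time are not fully independent and the effective sample size is smaller than the raw count.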
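Pooled trades only count in full if they are independent. As a rough illustration, the sketch below applies the Kish design-effect formula from survey sampling, treating the trades triggered on different coins at the same moment as one correlated cluster. Applying that formula to trades is a modelling assumption, the correlation values are made up for illustration, and effective_sample_size is a hypothetical name.

```python
def effective_sample_size(total_trades, coins, correlation):
    """Approximate number of independent observations when trades arrive
    in clusters of `coins` outcomes with average pairwise `correlation`."""
    design_effect = 1 + (coins - 1) * correlation
    return total_trades / design_effect

for rho in (0.0, 0.3, 0.6):
    n_eff = effective_sample_size(500, 5, rho)
    print(f"correlation {rho:.1f}: about {n_eff:.0f} effective trades")
```

With no correlation the 500 pooled trades count in full; with strong correlation they carry far less information than the raw count suggests.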

Accept some uncertainty. In practice, you often have to deploy strategies with less than ideal sample sizes. The key is knowing your uncertainty and sizing positions accordingly. A strategy with 300 trades deserves smaller position sizes than one with 3,000 trades because you're less certain about its true performance.

Key Takeaways

Small samples produce misleading results because random variation dominates with few observations. Statistical significance tells you whether results could plausibly be due to chance. Target 1,000+ trades minimum for robust validation of typical strategies. Confidence intervals quantify your uncertainty, and larger samples produce tighter intervals. Trade count matters, not calendar time. If you can't reach ideal sample sizes, test across multiple assets and size positions according to your uncertainty.

Next, we'll examine which metrics actually matter when evaluating trading performance, and which ones are vanity metrics that mislead you.