Small-sample heavy-tail benchmark
Question
What should a user do when the sample size is small, the dimension is not tiny, and the data are heavy-tailed? This is the regime where empirical covariance, Ledoit-Wolf, OAS, and classical MCD can become unstable or misleading.
Design
The benchmark simulates elliptical Student-t data over a grid of sample sizes, feature dimensions, and degrees of freedom. Smaller degrees of freedom mean heavier tails. For each setting, each estimator is compared to the known population scatter using relative Frobenius error.
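One cell of such a grid can be sketched as follows. This is a minimal illustration, not the benchmark script itself: the helper names `sample_student_t` and `relative_frobenius_error`, the identity scatter, and the grid values are all assumptions made for the example. The text does not say whether the benchmark rescales estimates before comparison, so the sketch compares the raw empirical covariance, whose population value for df > 2 is df / (df - 2) times the scatter.

```python
import numpy as np


def sample_student_t(n, scatter, df, rng):
    """Draw n rows from a zero-mean multivariate Student-t with the given scatter."""
    p = scatter.shape[0]
    chol = np.linalg.cholesky(scatter)
    z = rng.standard_normal((n, p)) @ chol.T  # Gaussian rows with covariance = scatter
    w = rng.chisquare(df, size=n) / df        # one mixing weight per row
    return z / np.sqrt(w)[:, None]            # smaller df gives heavier tails


def relative_frobenius_error(estimate, truth):
    """Return ||estimate - truth||_F / ||truth||_F."""
    return np.linalg.norm(estimate - truth, "fro") / np.linalg.norm(truth, "fro")


rng = np.random.default_rng(0)
p = 10                   # hypothetical feature dimension
scatter = np.eye(p)      # hypothetical population scatter

# Hypothetical grid values; the real benchmark's grid is not listed in the text.
for n in (20, 40):
    for df in (2.5, 5.0):
        x = sample_student_t(n, scatter, df, rng)
        emp = np.cov(x, rowvar=False)  # empirical covariance as the baseline estimator
        print(n, df, relative_frobenius_error(emp, scatter))
```

Any other estimator can be dropped in where `np.cov` is called, as long as it returns a p-by-p matrix that can be compared against `scatter`.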
The main output is not a single timing number. It is the ranking across the whole grid: win rate, mean rank, median error, and median runtime.
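The ranking statistics can be computed from per-setting errors along these lines. This is a sketch under stated assumptions: the method names and error values are made up, `summarize` is a hypothetical helper rather than part of `benchmark_summary.py`, and ties are broken by dictionary order, which may differ from what the real summary script does.

```python
from statistics import mean, median

# Hypothetical errors: one entry per grid cell for each method.
errors = {
    "A": [0.5, 0.7, 0.9],
    "B": [0.6, 0.6, 1.2],
    "C": [2.0, 3.0, 50.0],
}


def summarize(errors):
    """Compute win rate, mean rank, median error, and mean error per method."""
    methods = list(errors)
    n_cells = len(next(iter(errors.values())))
    ranks = {m: [] for m in methods}
    wins = {m: 0 for m in methods}
    for i in range(n_cells):
        # Rank methods within one grid cell, lowest error first.
        ordered = sorted(methods, key=lambda m: errors[m][i])
        wins[ordered[0]] += 1
        for rank, m in enumerate(ordered, start=1):
            ranks[m].append(rank)
    return {
        m: {
            "win_rate": wins[m] / n_cells,
            "mean_rank": mean(ranks[m]),
            "median_error": median(errors[m]),
            "mean_error": mean(errors[m]),
        }
        for m in methods
    }


for method, stats in summarize(errors).items():
    print(method, stats)
```

In this toy input, method "C" has a mean error far above its median because of a single bad cell. That is the reason to report both columns: the median describes typical behaviour, while the mean exposes occasional blow-ups.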
Summary table

| method | appearances | failures | win_rate | mean_rank | median_error | mean_error | median_seconds |
|---|---|---|---|---|---|---|---|
| robustcov Cauchy | 27 | 0 | 0.7407 | 1.4074 | 0.5994 | 0.7625 | 0.013307 |
| robustcov StudentT(df=3) | 27 | 0 | 0.0000 | 3.1852 | 0.6675 | 0.8793 | 0.015101 |
| robustcov HellTyler(exp) | 27 | 0 | 0.1111 | 3.2963 | 0.8503 | 0.9889 | 0.042900 |
| robustcov RegTyler | 27 | 0 | 0.0370 | 3.5185 | 0.8021 | 12.0413 | 0.005107 |
| robustcov KLTyler | 27 | 0 | 0.0370 | 4.5185 | 0.8021 | 12.0413 | 0.005073 |
| sklearn MinCovDet | 27 | 0 | 0.1111 | 6.2963 | 2.1739 | 15.6922 | 0.024392 |
| sklearn LedoitWolf | 27 | 0 | 0.0000 | 6.5556 | 2.5696 | 262.0354 | 0.000420 |
| sklearn OAS | 27 | 0 | 0.0000 | 7.5926 | 5.3643 | 1629.2743 | 0.000325 |
| sklearn Empirical | 27 | 0 | 0.0000 | 8.6296 | 6.9285 | 1681.1440 | 0.000449 |
Ranking plot
Interpretation
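The main result is that RegularizedCauchy (`robustcov Cauchy` in the table) is the strongest default in this grid. It has the highest win rate (0.7407), the lowest mean rank (1.4074), and the lowest median error (0.5994). StudentTScatter (`robustcov StudentT(df=3)`) never wins outright but stays close, with a median error of 0.6675, and is a smoother alternative when the user wants less aggressive Cauchy-style radial downweighting. The sklearn estimators have mean errors far above their medians (for example 262.0354 versus 2.5696 for LedoitWolf), which shows that they occasionally fail badly in this regime.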
The benchmark also explains why the package should not be positioned as a generic collection of older robust estimators. MVE is historically important, but the strongest evidence here is for regularized heavy-tail scatter in small-sample settings.
Run it yourself
python benchmarks/small_sample_heavy_tail.py --csv results/small_sample.csv
python benchmarks/benchmark_summary.py \
--input results/small_sample.csv \
--csv results/small_sample_summary.csv \
--html results/small_sample_summary.html \
--markdown results/small_sample_summary.md