Anomaly detection baselines¶

Question¶

How do robust covariance detectors compare with popular anomaly-detection baselines on a tabular heavy-tail task?

This benchmark is meant for orientation rather than final publication. It translates robust covariance output into the language many ML users expect: precision, recall, F1, ROC AUC, detection count, and runtime.

Benchmark design¶

The data contain heavy-tailed inliers and a small fraction of mixed outliers. Outliers include both mean-shifted points and leverage-like observations. The benchmark compares:

robustcov FastMCD robust-distance detector;
robustcov auto robust anomaly detector;
sklearn IsolationForest;
sklearn LocalOutlierFactor;
sklearn OneClassSVM;
sklearn EllipticEnvelope.

Results¶

Anomaly baseline comparison¶
method	seconds	precision	recall	f1	roc_auc	detected
sklearn EllipticEnvelope	0.4706082669999887	0.8055555555555556	0.8055555555555556	0.8055555555555556	0.9877214170692431	72
robustcov FastMCD detector	0.18479795199982618	0.75	0.75	0.75	0.9827227589908749	72
sklearn IsolationForest	0.13584519300002285	0.5694444444444444	0.5694444444444444	0.5694444444444444	0.9688338701019861	72
robustcov Auto detector	0.03845888500018191	0.5555555555555556	0.5555555555555556	0.5555555555555556	0.9685654857756307	72
sklearn OneClassSVM	0.009737448999658227	0.45714285714285713	0.4444444444444444	0.4507042253521127	0.9214472624798711	70
sklearn LocalOutlierFactor	0.021117043000231206	0.4444444444444444	0.4444444444444444	0.4444444444444444	0.553408480944713	72

Interpretation¶

The FastMCD detector is the most direct robustcov baseline for separable tabular anomalies. It converts robust Mahalanobis distances into anomaly labels using an empirical threshold. This is the easiest method to explain to users: fit a robust center and covariance, compute robust distances, and flag the largest distances.

AutoRobustAnomalyDetector is intended as a diagnostic ensemble rather than a universal replacement for dedicated anomaly detectors. It is useful when the user wants a robust covariance view of the data and wants to compare several scatter estimators quickly.

The sklearn methods are included because users will naturally compare against them. The result to emphasize is not that robust covariance always wins, but that it gives interpretable distances, covariance diagnostics, and a clear geometric explanation of flagged points.

Command¶

python benchmarks/anomaly_detection_baselines.py \
  --n 900 \
  --p 12 \
  --contamination 0.08 \
  --csv results/anomaly_baselines.csv

python examples/plot_anomaly_baselines.py \
  --input results/anomaly_baselines.csv \
  --output results/anomaly_baselines.png

What to look at first¶

Use F1 when contamination is known and you care about balanced precision/recall.
Use ROC AUC when you care more about ranking anomalies than choosing a threshold.
Inspect robust distance profiles after the benchmark; they explain why points are flagged.