Credit-card fraud result

Why this result matters

Credit-card fraud is a popular imbalanced anomaly-detection benchmark. It is a useful public example because users already understand the task, and because PR AUC and F1 are more informative than accuracy on rare fraud events.

Observed result

A local external run reported the following table.

Credit-card fraud external result

Method

Seconds

Precision

Recall

F1

ROC AUC

PR AUC

robustcov FastMCD

57.202

0.801

0.801

0.801

0.957

0.712

sklearn IsolationForest

3.392

0.262

0.262

0.262

0.948

0.143

sklearn EllipticEnvelope

12.518

0.213

0.213

0.213

0.920

0.125

sklearn LocalOutlierFactor

35.981

0.000

0.000

0.000

0.513

0.002

Plots

Credit-card fraud PR-AUC comparison

PR-AUC comparison. This metric is important for rare fraud because it focuses on precision/recall behavior under severe class imbalance.

Credit-card fraud F1 comparison

Thresholded F1 comparison at the same detected-count level.

Output from the run

method,seconds,precision,recall,f1,roc_auc,pr_auc,detected
robustcov FastMCD,57.2020,0.8008,0.8008,0.8008,0.9573,0.7121,492
sklearn IsolationForest,3.3915,0.2622,0.2622,0.2622,0.9475,0.1427,492
sklearn EllipticEnvelope,12.5178,0.2134,0.2134,0.2134,0.9199,0.1248,492
sklearn LocalOutlierFactor,35.9812,0.0000,0.0000,0.0000,0.5127,0.0021,492
saved outputs to results/external/credit_card_fraud

Interpretation

robustcov FastMCD was slower than IsolationForest in this run, but it produced a much stronger thresholded fraud-screening result and a much higher PR AUC. This is a good Kaggle/notebook story because the robust-distance score is interpretable and the metric gap is large.

How to reproduce

Download the credit-card fraud CSV manually, then run:

python examples_external/kaggle_credit_card_fraud.py \
  --data /path/to/creditcard.csv \
  --outdir results/external/credit_card_fraud

Outputs

The script writes:

  • metrics.csv;

  • pr_auc.png;

  • f1.png;

  • robust_score_profile.png;

  • summary.md.

Production note

This should be presented as an unsupervised screening score, not as a full competition-winning fraud pipeline. In supervised Kaggle settings, the robust score can also be used as a feature for gradient boosting or other classifiers.