Breast-cancer screening as anomaly ranking
==========================================

This real sklearn dataset is a useful reality check.  The classes are not generated by a covariance model, so robustcov should not be expected to dominate all baselines.

Result at a glance
------------------

EllipticEnvelope has the best F1 in this run.  FastMCD is close to IsolationForest and OneClassSVM but not best.  This is a valuable honest example: robust distances are competitive diagnostics, not universal winners.

What the data represent
-----------------------

The example uses the sklearn breast-cancer dataset with a reduced feature representation.  One class is treated as the anomaly class for an unsupervised screening comparison.

Why this estimator
------------------

``FastMCD`` is included as an interpretable robust-distance baseline.  It is compared with common sklearn anomaly detectors.

Reproduce the result
--------------------

.. code-block:: bash

   python examples/use_case_breast_cancer_screening.py

Output from the run
-------------------

.. literalinclude:: ../_static/gallery/breast_cancer_screening/output.txt
   :language: text

Figures and diagnostics
-----------------------

.. image:: ../_static/gallery/breast_cancer_screening/baseline_f1.png
   :alt: Breast-cancer screening as anomaly ranking — baseline f1
   :width: 760px


.. image:: ../_static/gallery/breast_cancer_screening/score_profile.png
   :alt: Breast-cancer screening as anomaly ranking — score profile
   :width: 760px


.. image:: ../_static/gallery/breast_cancer_screening/distance_panel.png
   :alt: Breast-cancer screening as anomaly ranking — distance panel
   :width: 760px


How to read the result
----------------------

Look at both F1 and ROC-AUC.  Similar F1 values can hide different score rankings, and score rankings matter if the practical task is review prioritization.

What this does not prove
------------------------

For medical datasets, supervised clinical models and feature engineering are usually necessary.  robustcov should be framed as an interpretable screening score.