Medical screening
=================

Status
------

.. admonition:: Competitive, not best
   :class: note

   On this medical tabular screening dataset, ``robustcov`` is close to the
   tested baselines but is not the best method: it ranks third of four on F1,
   PR-AUC, and ROC-AUC.  ``EllipticEnvelope`` scores highest on all three
   metrics in this run.  We keep the result because it is a useful
   negative/competitive example: robust covariance is not expected to win
   every tabular anomaly benchmark.

Why this matters
----------------

Medical screening tables often contain mixed risk factors, nonlinear effects,
and population-level confounders.  Robust covariance can still provide a useful,
interpretable anomaly score, but it is not always the best standalone detector
for such data.

Result summary
--------------

.. list-table:: Medical screening external benchmark
   :header-rows: 1

   * - Method
     - F1
     - PR-AUC
     - ROC-AUC
     - Runtime (s)
   * - sklearn EllipticEnvelope
     - 0.5996
     - 0.6285
     - 0.6356
     - 1.5929
   * - sklearn IsolationForest
     - 0.5811
     - 0.5941
     - 0.6078
     - 0.9813
   * - robustcov Auto(StudentTScatter)
     - 0.5712
     - 0.5674
     - 0.5863
     - 2.5164
   * - sklearn LocalOutlierFactor
     - 0.5365
     - 0.5391
     - 0.5487
     - 10.5699

.. figure:: ../_static/external_results/medical_screening/pr_auc.png
   :alt: Medical screening PR-AUC comparison
   :width: 82%

   PR-AUC comparison.  ``robustcov`` is competitive but trails
   ``EllipticEnvelope`` and ``IsolationForest`` on this dataset.

.. figure:: ../_static/external_results/medical_screening/f1.png
   :alt: Medical screening F1 comparison
   :width: 82%

   F1 comparison at a fixed detection budget.

.. figure:: ../_static/external_results/medical_screening/runtime.png
   :alt: Medical screening runtime comparison
   :width: 82%

   Runtime comparison on a log scale.
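
The three quality metrics above can be reproduced from any vector of anomaly
scores.  The following is a minimal sketch, not the benchmark's actual
evaluation code: the helper name ``evaluate_scores`` is hypothetical, the
detection budget is assumed to be a count of rows to flag (defaulting to the
number of labelled positives), and PR-AUC is approximated by scikit-learn's
average precision.

.. code-block:: python

   import numpy as np
   from sklearn.metrics import average_precision_score, f1_score, roc_auc_score


   def evaluate_scores(y_true, scores, budget=None):
       """Evaluate anomaly scores where higher means more anomalous.

       Hypothetical helper for illustration; not part of robustcov.
       """
       y_true = np.asarray(y_true).astype(int)
       scores = np.asarray(scores, dtype=float)

       # Assumption: the budget is the number of rows the detector may flag.
       # Without an explicit budget, flag as many rows as there are positives.
       if budget is None:
           budget = int(y_true.sum())

       # Flag the `budget` highest-scoring rows as predicted anomalies.
       flagged = np.zeros_like(y_true)
       flagged[np.argsort(-scores, kind="stable")[:budget]] = 1

       return {
           "f1": f1_score(y_true, flagged),
           # Average precision is one common estimate of PR-AUC.
           "pr_auc": average_precision_score(y_true, scores),
           "roc_auc": roc_auc_score(y_true, scores),
       }

Note that the scikit-learn detectors in the table return higher values for
*normal* rows from ``score_samples``, so their output would need to be negated
before being passed to a helper like this one.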

Output from the run
-------------------

.. literalinclude:: ../_static/external_results/medical_screening/output.txt
   :language: text

Interpretation
--------------

This result is useful for calibrating expectations: ``robustcov`` is not
presented as a universal winner.  It trails ``EllipticEnvelope`` by about 0.03
F1 and 0.06 PR-AUC, and ``IsolationForest`` also ranks ahead of it on all three
quality metrics.  The dataset likely contains risk structure that is not purely
covariance-shaped; tree-based or supervised models may be more appropriate in a
production medical screening setting.

Recommendation
--------------

Use ``robustcov`` here as a diagnostic score or preprocessing feature rather
than the sole production detector.  If labels are available, evaluate the robust
score alongside supervised models.
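
As a rough illustration of the "preprocessing feature" idea, the sketch below
appends a robust distance to the feature matrix and compares a supervised model
with and without it.  Everything here is an assumption made for illustration:
the data are synthetic, scikit-learn's ``MinCovDet`` stands in for the
``robustcov`` estimator (whose API is not shown on this page), and
``add_robust_score`` is a hypothetical helper.

.. code-block:: python

   import numpy as np
   from sklearn.covariance import MinCovDet
   from sklearn.linear_model import LogisticRegression
   from sklearn.metrics import average_precision_score
   from sklearn.model_selection import train_test_split

   # Synthetic stand-in for a screening table: the label depends on a
   # nonlinear interaction that a linear model cannot see directly.
   rng = np.random.default_rng(0)
   X = rng.normal(size=(600, 5))
   y = (X[:, 0] * X[:, 1] + 0.5 * rng.normal(size=600) > 1.0).astype(int)

   X_train, X_test, y_train, y_test = train_test_split(
       X, y, test_size=0.3, random_state=0, stratify=y
   )

   # Fit the robust location/scatter on training rows only to avoid leakage.
   # MinCovDet is a stand-in here, not the robustcov estimator.
   mcd = MinCovDet(random_state=0).fit(X_train)


   def add_robust_score(X_part):
       """Append the squared robust Mahalanobis distance as an extra column."""
       d2 = mcd.mahalanobis(X_part)
       return np.column_stack([X_part, d2])


   baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
   augmented = LogisticRegression(max_iter=1000).fit(
       add_robust_score(X_train), y_train
   )

   ap_base = average_precision_score(
       y_test, baseline.predict_proba(X_test)[:, 1]
   )
   ap_aug = average_precision_score(
       y_test, augmented.predict_proba(add_robust_score(X_test))[:, 1]
   )
   print(f"PR-AUC without robust score: {ap_base:.4f}")
   print(f"PR-AUC with robust score:    {ap_aug:.4f}")

Whether the extra column helps depends on the data; the point of the comparison
is to measure that on held-out rows rather than assume it.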