Unsupervised Anomaly Localization with Structural Feature-Autoencoders

Abstract

Unsupervised Anomaly Detection has become a popular method to detect pathologies in medical images, as it does not require supervision or labels for training. Most commonly, the anomaly detection model generates a “normal” version of an input image, and the pixel-wise $l^p$-difference between the two is used to localize anomalies. However, large residuals often occur due to imperfect reconstruction of the complex anatomical structures present in most medical images. This method also fails to detect anomalies that are not characterized by large intensity differences to the surrounding tissue. We propose to tackle this problem using a feature-mapping function that transforms the input intensity images into a space with multiple channels, where anomalies can be detected along different discriminative feature maps extracted from the original image. We then train an Autoencoder model in this space using a structural similarity loss that considers not only differences in intensity but also in contrast and structure. Our method significantly increases performance on two brain MRI data sets.
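As a minimal illustration of the structural similarity loss (not the paper's implementation; the window size, stability constants, and the assumption of inputs scaled to $[0, 1]$ are mine), SSIM can be computed from local averages in a few lines of PyTorch:

```python
import torch
import torch.nn.functional as F

def ssim_loss(x: torch.Tensor, y: torch.Tensor, window: int = 7) -> torch.Tensor:
    """1 - mean SSIM between two (B, C, H, W) tensors scaled to [0, 1].

    SSIM compares local luminance, contrast, and structure, so it penalizes
    structural changes even when absolute intensity differences are small.
    """
    c1, c2 = 0.01 ** 2, 0.03 ** 2  # standard stability constants for unit data range
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, window, stride=1, padding=pad)
    var_x = F.avg_pool2d(x * x, window, stride=1, padding=pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, window, stride=1, padding=pad) - mu_y ** 2
    cov_xy = F.avg_pool2d(x * y, window, stride=1, padding=pad) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return 1 - ssim_map.mean()
```

At test time, the per-pixel $1 - \mathrm{SSIM}$ map between an input and its reconstruction can directly serve as an anomaly map.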

Publication
In International MICCAI Brainlesion Workshop

Overview

Our strongest model so far 💪. Anomaly detection with image-reconstruction models doesn’t work well if anomalies are not hyperintense. FAE solves this problem.


Fig. 1. FAE trains an autoencoder not in image-space, but in the feature-space of a pretrained ResNet. This allows it to also capture anomalies that are not hyperintense in image-space. Using the Structural Similarity Index Measure (SSIM) further helps with this problem.
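A hedged sketch of this setup, reusing the `ssim_loss` sketch above. The backbone choice (torchvision ResNet-18), the extracted layers, the common feature resolution, and the autoencoder widths are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18
from torchvision.models.feature_extraction import create_feature_extractor

# Frozen feature-mapping function: early/mid ResNet layers resized to a
# common resolution and stacked along the channel dimension.
backbone = resnet18(weights="IMAGENET1K_V1").eval()
extractor = create_feature_extractor(backbone, return_nodes=["layer1", "layer2"])

@torch.no_grad()
def to_feature_space(img: torch.Tensor, size: int = 32) -> torch.Tensor:
    """Map (B, 1, H, W) grayscale slices to (B, 192, size, size) feature maps."""
    feats = extractor(img.repeat(1, 3, 1, 1))  # replicate channel for the RGB backbone
    feats = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
             for f in feats.values()]
    return torch.cat(feats, dim=1)  # 64 + 128 = 192 channels for ResNet-18

# Plain convolutional autoencoder operating on the feature maps, not the image.
ae = nn.Sequential(
    nn.Conv2d(192, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 192, 4, stride=2, padding=1),
)

# Training step on healthy slices: reconstruct the features under SSIM loss.
batch = torch.rand(8, 1, 128, 128)  # placeholder for a batch of normal MRI slices
f = to_feature_space(batch)
loss = ssim_loss(ae(f), f)  # ssim_loss from the sketch above
loss.backward()
```

Because the autoencoder only ever sees features of healthy anatomy, regions whose features it cannot reconstruct show up in the resulting SSIM map, whether or not they are hyperintense in image-space.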

Results

Our model outperforms all competitors by a large margin. A recent analysis from 2023 still found it to be the strongest anomaly detection model.

Quantitative results

Fig. 2. Performance of all compared models and a random classifier on the BraTS data set. Error bars indicate one standard deviation over $N = 5$ runs with different random seeds. Our method performs statistically significantly better than all compared methods (t-test; $p \leq 0.05$).

Qualitative results

Fig. 3. Examples of successful and failure cases.

Felix Meissen
Ph.D. Student for AI in Medicine

My research interests include anomaly detection, object detection, and anything related to medical images.