OUT-OF-DISTRIBUTION DETECTION (OOD)

Detecting outliers, out-of-distribution data, or anomalous data is crucial for safety-critical applications. Such applications, which are ubiquitous in industry, fall into two categories:

Anomaly Detection/Outlier Detection (AD)

Given a set of normal data, possibly contaminated by outliers or anomalous data, this task consists of designing a model of normal data that can be used to identify anomalies. Example applications include fraud detection and industrial visual inspection.

Out-Of-Distribution detection/Open-Set-Recognition (OOD)

This approach applies to machine learning models (e.g. neural networks) that have been trained to perform a specific task. The model is monitored to make sure that it is only called on data similar to the data it was trained on (in-distribution). The aim is to avoid unpredictable model behavior when the model is called on data that it was not trained to process (out-of-distribution). When such data is of interest in itself, for instance when it belongs to a new class that we may want to include in further training, the task is called Open-Set-Recognition. Example applications include selective inference, model monitoring, robustness, and OOD generalization.

Research axes

DEEL’s OOD scientific challenge tackles both approaches across the following research axes, with a strong focus on high-dimensional problems such as vision tasks.

Past projects

(AD) Variational autoencoders and normalizing flows have been used to model the distribution of normal data. Both kinds of models provide a likelihood for new data that can be used as a normality score. We also built on these models to apply a score from extreme value theory to high-dimensional anomaly detection.
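The idea of using a likelihood as a normality score can be illustrated with a minimal sketch. It assumes a simple Gaussian density fitted with NumPy as a stand-in for the variational autoencoders and normalizing flows mentioned above, and a 1% quantile threshold chosen purely for illustration; it does not include the extreme value theory score, and all names are hypothetical.

```python
import numpy as np


def fit_gaussian(normal_data):
    """Fit a full-covariance Gaussian to normal data (stand-in density model)."""
    mean = normal_data.mean(axis=0)
    # A small ridge keeps the covariance matrix invertible.
    cov = np.cov(normal_data, rowvar=False) + 1e-6 * np.eye(normal_data.shape[1])
    return mean, cov


def log_likelihood(x, mean, cov):
    """Log-density of each row of x under the fitted Gaussian."""
    dim = mean.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff.T).T
    mahalanobis = np.sum(diff * solved, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (mahalanobis + logdet + dim * np.log(2.0 * np.pi))


rng = np.random.default_rng(0)
normal_train = rng.normal(0.0, 1.0, size=(1000, 5))
mean, cov = fit_gaussian(normal_train)

normal_test = rng.normal(0.0, 1.0, size=(200, 5))
anomalies = rng.normal(6.0, 1.0, size=(200, 5))  # shifted far from normal data

# Normality score: higher log-likelihood means "more normal".
normal_scores = log_likelihood(normal_test, mean, cov)
anomaly_scores = log_likelihood(anomalies, mean, cov)

# Illustrative threshold: the 1% quantile of the training normality scores.
threshold = np.quantile(log_likelihood(normal_train, mean, cov), 0.01)
flagged = anomaly_scores < threshold
```

In a real setting the Gaussian would be replaced by a trained deep generative model, and the threshold would be calibrated on held-out normal data.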

(AD) We also used Lipschitz neural networks to approximate the signed distance function to the boundary of the normal distribution for anomaly detection, leading to a publication at ICML 2023.

Ongoing projects

(AD) We are currently studying diffusion models and score-based generative models to approximate the distribution of normal data. The anomaly score can be built from the likelihood of new data (for score-based models), the evidence lower bound (for diffusion models), or the reconstruction error after a noising-denoising process.
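The reconstruction-error variant can be sketched as follows. This is only a toy illustration under strong assumptions: the denoiser is a linear projection onto the principal subspace of the normal data, standing in for a trained diffusion model, and the noise level `sigma` and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy normal data: points close to a 2-D subspace of a 10-D space.
basis = np.linalg.qr(rng.normal(size=(10, 2)))[0]  # (10, 2), orthonormal columns
normal_train = rng.normal(size=(500, 2)) @ basis.T + 0.01 * rng.normal(size=(500, 10))

# Stand-in denoiser: projection onto the top-2 principal subspace of normal data.
center = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - center, full_matrices=False)
components = vt[:2]  # (2, 10)


def denoise(x_noisy):
    """Map noisy inputs back toward the subspace learned from normal data."""
    return (x_noisy - center) @ components.T @ components + center


def noise_denoise_score(x, sigma=0.1, seed=0):
    """Anomaly score: squared error between x and its noised-then-denoised version."""
    noise_rng = np.random.default_rng(seed)
    x_noisy = x + sigma * noise_rng.normal(size=x.shape)
    x_rec = denoise(x_noisy)
    return np.sum((x - x_rec) ** 2, axis=1)


normal_test = rng.normal(size=(100, 2)) @ basis.T
anomalies = rng.normal(size=(100, 10))  # isotropic points, mostly off the subspace

normal_scores = noise_denoise_score(normal_test)
anomaly_scores = noise_denoise_score(anomalies)
```

Normal points are reconstructed almost exactly, while anomalous points lose the part of their signal that the denoiser never saw during training, which yields a larger score.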

(OOD) We are studying post-hoc OOD detection methods, i.e. algorithms that monitor the behavior of an already trained neural network at inference time to assess whether the input data is OOD. To support intensive studies, benchmarks, and new contributions, we developed a library, Oodeel, that implements many baselines from the recent post-hoc OOD literature. The library works with models trained in TensorFlow or PyTorch. It is designed to be easy to use, thanks to a simple, scikit-learn-like API, and easy to customize for researchers designing new methods.
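To make the notion of a post-hoc score concrete, here is a minimal NumPy sketch of two widely used baselines from the literature, maximum softmax probability and the energy score, both computed from the logits of an already trained classifier. It does not use Oodeel's API: the function names and the example logits are hypothetical, and sign conventions (whether a higher score means more in-distribution or more OOD) vary between papers and libraries.

```python
import numpy as np


def max_softmax_probability(logits):
    """Maximum softmax probability per sample; higher means more confident."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    return probs.max(axis=1)


def energy_score(logits, temperature=1.0):
    """T * logsumexp(logits / T); higher means more in-distribution-like here."""
    z = logits / temperature
    m = z.max(axis=1, keepdims=True)
    return temperature * (m[:, 0] + np.log(np.exp(z - m).sum(axis=1)))


# Hypothetical logits for two inputs: a confident one and an ambiguous one.
logits = np.array([
    [9.0, 0.5, 0.1],  # one class clearly dominates
    [1.0, 1.1, 0.9],  # nearly flat logits, typical of unfamiliar inputs
])

msp = max_softmax_probability(logits)
energy = energy_score(logits)
```

Both scores need only a forward pass of the trained network, which is what makes them post-hoc: no retraining or change to the model is required.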

(OOD) Backed by Oodeel, we are studying how to combine post-hoc OOD scores with object detection, either to add a layer of robustness and confidence or to improve the object detection pipeline as a whole.

Achievements

Several technical reports for industrial partners on variational autoencoders, normalizing flows, and extreme value theory scores.

Robust One-Class Classification with Signed Distance Function using 1-Lipschitz Neural Networks. Béthune, L., Novello, P., Boissin, T., Coiffier, G., Serrurier, M., Vincenot, Q., & Troya-Galvis, A. (ICML 2023)

OODEEL: Simple, compact, and hackable post-hoc deep OOD detection for already trained tensorflow or pytorch image classifiers.