A major limitation of machine learning algorithms is their inability to process inputs that fall outside the training domain. The challenge is to detect unknown observations and to take these new observations into account in order to update and extend the capabilities of the algorithms (including deep neural networks).
The whole machinery of machine learning techniques relies on the assumption that the learning sample conveys all the information about the data. An algorithm can therefore only learn what is observed in the data used to train it. As a result, standard methods cannot anticipate novelty, whereas in practice new behavior may appear in the observed data.
For instance, this situation may occur when dealing with:
- an unknown-unknown example (not inside the test dataset) whose processing is not required
- an example close to the training distribution (like adversarial examples) which must be processed
- new observations acquired by the validation team or the final user, which have to be learnt. For example, for autonomous cars, traffic sign classification networks must be able to adapt to new signs without long and expensive retraining
In all cases, the system should handle these events so that it does not become overconfident. The neural network should be able to adapt to these unknown observations.
Examples of industrial use cases
This challenge aims to identify an unknown observation, create a new class from this observation, and update the classifier with the new classes.
Novelty detection has direct industrial applications in any domain: it makes it possible to reject input data or to avoid wrong decisions. It can also be used to raise alarms for anomaly detection.
Robustness to adversarial examples is another clear application for safety-critical systems, so that alarms can be raised when confidence in the output is low.
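As an illustration of raising an alarm when confidence in the output is low, the sketch below rejects a prediction whenever the maximum softmax probability falls under a threshold. This is only a simple baseline, not the method targeted by the challenge: the function names and the threshold value of 0.9 are assumptions made for the example, and softmax confidence is known to remain high on some out-of-distribution or adversarial inputs.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_or_reject(logits, threshold=0.9):
    """Return (label, confidence); label is None when the input is rejected.

    `threshold` is a hypothetical value that would have to be tuned
    on validation data for a real system.
    """
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return None, confidence  # low confidence: raise an alarm upstream
    return probs.index(confidence), confidence

# One clearly dominant score is accepted, a flat score vector is rejected.
print(predict_or_reject([10.0, 0.0, 0.0]))
print(predict_or_reject([1.0, 1.0, 1.0]))
```

A rejected input (label `None`) can then be routed to an operator or stored as a candidate example of a new class.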
The capacity to learn from few data, or to incrementally update a system with new classes or data, has many advantages from an industrial point of view. First, in many domains, the whole dataset may not be available at the same time, or the distribution of the data may shift over time. Maintaining the system therefore requires simple methods to upgrade the neural networks.
Examples of unknown observations could be:
- a new traffic sign (80 km/h)
- a new type of turn light for automotive systems
- a new type of object in remote sensing image processing
State of the art and limitations
Scientific approach to solve the challenge
Success criteria for the challenge
The challenge will be successful if we have developed:
- Algorithms able to detect unknown data
- Algorithms able to classify unknown objects with few labelled examples
- A methodology to incrementally update the architecture of a network to handle new observations of new classes
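The three criteria above can be illustrated together with a nearest-prototype classifier, given here as a minimal sketch under stated assumptions rather than as the approach the challenge will adopt. It assumes that inputs have already been mapped to feature vectors (for example by a trained network, not shown), stores one mean vector per class, rejects inputs that are far from every prototype, and accepts a new class from a few labelled examples without retraining. The class name `PrototypeClassifier`, the `reject_distance` parameter and the labels used below are all hypothetical.

```python
import math

class PrototypeClassifier:
    """Nearest-class-mean classifier with rejection and incremental classes."""

    def __init__(self, reject_distance):
        # Hypothetical distance beyond which an input is declared unknown.
        self.reject_distance = reject_distance
        self.sums = {}    # label -> per-dimension sum of feature vectors
        self.counts = {}  # label -> number of examples seen

    def add_examples(self, label, features):
        """Add labelled feature vectors; creates the class if it is new."""
        for x in features:
            if label not in self.sums:
                self.sums[label] = [0.0] * len(x)
                self.counts[label] = 0
            self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
            self.counts[label] += 1

    def prototype(self, label):
        """Mean feature vector of a class."""
        n = self.counts[label]
        return [s / n for s in self.sums[label]]

    def predict(self, x):
        """Return the nearest class label, or None if x looks unknown."""
        best_label, best_dist = None, math.inf
        for label in self.sums:
            d = math.dist(x, self.prototype(label))
            if d < best_dist:
                best_label, best_dist = label, d
        if best_dist > self.reject_distance:
            return None
        return best_label

clf = PrototypeClassifier(reject_distance=1.0)
clf.add_examples("stop", [[0.0, 0.0], [0.2, 0.0]])
clf.add_examples("yield", [[5.0, 5.0]])
before = clf.predict([10.0, 0.0])   # far from both prototypes: rejected
clf.add_examples("speed_80", [[10.0, 0.0], [10.2, 0.2]])
after = clf.predict([10.0, 0.0])    # now close to the new prototype
print(before, after)
```

Keeping running sums instead of stored examples means a class can be added or refined in constant memory per class. The quality of such a scheme depends entirely on the feature extractor, which is outside the scope of this sketch.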
Dataset required for the challenge
- Standard datasets: for comparison with state-of-the-art methods, we will use standard datasets such as MNIST, CIFAR-10 or CIFAR-100, training on part of each dataset in order to detect unknown inputs or to upgrade the model with new classes
- Traffic sign:
We can either use standard datasets such as GTSRB (German Traffic Sign Recognition Benchmark), or propose the challenge on industrial datasets (such as the Airbus taxiway traffic sign dataset)
- Turn sign:
Renault may provide a dataset of cars with annotated turn sign activity. The challenge can address either a new kind of turn sign (such as turn signs with a chaser effect) or warning class detection
- Remote sensing:
We could use the same datasets as those used in 0 challenge, with the aim of detecting a new object or adapting the system to a new kind of sensor
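The protocol of learning on part of a dataset and treating the rest as unknown can be sketched as a class hold-out split. This is an assumption about how such an evaluation might be set up, not a specification from the challenge: the function name `split_by_class` and the toy samples are hypothetical, and samples are assumed to be `(features, label)` pairs.

```python
def split_by_class(samples, held_out_classes):
    """Split (features, label) pairs into known-class and held-out-class sets.

    The known set is used for training; the held-out set plays the role
    of unknown inputs at test time, or of new classes to be added later.
    """
    held_out = set(held_out_classes)
    known, unknown = [], []
    for features, label in samples:
        if label in held_out:
            unknown.append((features, label))
        else:
            known.append((features, label))
    return known, unknown

# Toy data: class 2 is withheld from training.
samples = [([0.1], 0), ([0.2], 1), ([0.3], 2), ([0.4], 1)]
known, unknown = split_by_class(samples, held_out_classes={2})
print(len(known), len(unknown))
```

With a standard dataset, the same split would be applied to the label set (for example, withholding some digit classes), so that detection and incremental-update methods are evaluated on classes never seen during training.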
