ROBUSTNESS VIA DETECTION AND ADAPTATION TO UNKNOWN DATA
Faculty members: Sébastien Gerchinovitz, Jean-Michel Loubes, Mathieu Serrurier
Data scientists: David Bertoin, David Vigouroux, Franck Mamalet
Students: M2 internship
SCOPE
A major limitation of machine learning algorithms is their inability to handle inputs outside the training domain. The challenge is to detect unknown observations, and then to take these new observations into account in order to update the algorithms (including deep neural networks) and increase their capacity.
The whole machinery of machine learning techniques relies on the assumption that the learning sample conveys all the information about the data. The algorithm can therefore only learn what is observed in the data used to train it, so no novelty can be forecast with standard methods, whereas in practice new behaviors may appear in the observed data.
For instance, this situation may occur when dealing with:
- an unknown-unknown example (not in the test dataset), whose processing is not required
- an example close to the training distribution (such as an adversarial example), which must be processed
- new observations acquired by the validation team or the final user, which have to be learned. For example, in an autonomous car, a traffic sign classification network must be able to adapt to new signs without a long and expensive retraining
In all cases, the system should handle these events so that it does not become overconfident, and the neural network should be able to adapt to these unknown observations.
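As an illustrative baseline for flagging such events (not a prescribed method of this challenge), one can threshold the classifier's maximum softmax probability: inputs on which the network is not confident are flagged as potentially unknown. A minimal NumPy sketch, assuming logits produced by an already-trained classifier:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def flag_unknown(logits, threshold=0.9):
    """Flag inputs whose maximum softmax probability falls below
    `threshold` as potentially unknown (out-of-distribution)."""
    confidence = softmax(logits).max(axis=-1)
    return confidence < threshold

# A confident prediction vs. a diffuse one: only the second is flagged.
logits = np.array([[8.0, 0.1, 0.2],    # peaked  -> treated as known
                   [1.0, 0.9, 1.1]])   # diffuse -> flagged as unknown
print(flag_unknown(logits))  # [False  True]
```

Note that softmax confidence is known to be a weak detector (overconfident on adversarial examples in particular), which is precisely why stronger detection methods are part of the challenge.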
Examples of industrial use cases
This challenge aims to identify an unknown observation, create a new class for it, and update the classifier with the new classes.
Industrial applications of novelty detection are quite obvious, whatever the domain: they make it possible to reject input data and to avoid wrong decisions, and they can also raise alarms for anomaly detection.
Robustness to adversarial examples is likewise an obvious application for safety-critical systems, where an alarm should be raised when confidence in the output is low.
The capacity to learn from few data, or to incrementally update a system with new classes or new data, has many advantages from an industrial point of view. First, in many domains the whole dataset may not be available at once, or the distribution of the data may shift over time. Maintenance of the system therefore requires simple methods to upgrade the neural networks.
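One lightweight way to sketch such an incremental update, assuming a frozen feature extractor, is to append a new row to the final linear layer, initialised from the mean feature of the few new examples (an "imprinted weights"-style heuristic). The function below is an illustrative assumption, not the challenge's prescribed method; existing class weights are untouched, so old classes are unaffected:

```python
import numpy as np

def add_class(W, b, new_examples_features):
    """Extend a linear classifier (W: [C, D], b: [C]) with one new class.

    The new row is initialised with the normalised mean feature vector of
    the few available examples; existing rows are left unchanged.
    """
    proto = new_examples_features.mean(axis=0)
    proto = proto / (np.linalg.norm(proto) + 1e-12)
    W_new = np.vstack([W, proto[None, :]])
    b_new = np.concatenate([b, [0.0]])
    return W_new, b_new

# A hypothetical 2-class classifier over 4-d features gains a third class
# from only three labelled examples.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(2, 4)), np.zeros(2)
W2, b2 = add_class(W, b, rng.normal(size=(3, 4)))
print(W2.shape, b2.shape)  # (3, 4) (3,)
```

In practice a fine-tuning step usually follows, which is where catastrophic forgetting (see the references) becomes the central difficulty.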
Examples of unknown observations could be:
- a new traffic sign (80 km/h)
- a new type of turn lights for automotive systems
- a new type of objects in remote sensing image processing
State of the art and limitations
Scientific approach to solve the challenge
Success criteria for the challenge
The challenge will be successful if we have developed:
- Algorithms able to detect unknown data
- Algorithms able to classify unknown objects from few labelled examples
- A methodology to incrementally update the architecture of a network to handle new observations of new classes
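The second criterion can be illustrated with a nearest-class-mean classifier in the spirit of matching/prototypical networks [16]: each class is represented by the mean embedding of its few labelled examples, and queries are assigned to the nearest prototype. The toy embeddings below are hypothetical:

```python
import numpy as np

def prototypes(support_feats, support_labels):
    """One prototype per class: the mean embedding of its support examples."""
    classes = np.unique(support_labels)
    protos = np.stack([support_feats[support_labels == c].mean(axis=0)
                       for c in classes])
    return classes, protos

def classify(query_feats, classes, protos):
    """Assign each query to the class of its nearest prototype (squared L2)."""
    d = ((query_feats[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return classes[d.argmin(axis=1)]

# Two classes with two labelled examples each, in a toy 2-d embedding space.
feats = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]])
labels = np.array([0, 0, 1, 1])
classes, protos = prototypes(feats, labels)
print(classify(np.array([[0.05, 0.05], [0.95, 0.95]]), classes, protos))
# [0 1]
```

Adding a class then amounts to computing one more prototype, which connects the few-shot criterion to the incremental-update criterion.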
Dataset required for the challenge
- Standard datasets: for comparison with state-of-the-art methods, we will use standard datasets such as MNIST, CIFAR-10, or CIFAR-100, training on part of each dataset so as to detect unknown inputs or to add new classes
- Traffic signs: we can either use standard datasets such as GTSRB (German Traffic Sign Recognition Benchmark), or propose the challenge on industrial datasets (such as the Airbus taxiway traffic sign dataset)
- Turn signals: Renault may provide a dataset of cars with annotated turn signal activity. The challenge can address either new kinds of turn signals (such as turn signals with a chaser effect) or warning-class detection
- Remote sensing: we could use the same datasets as those used in challenge 0, with the aim of detecting a new object or of adapting the system to a new kind of sensor
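For the standard datasets, a common experimental protocol (sketched below under the assumption of integer class labels) is to hold out some classes as "unknown": the model is trained only on the known classes and evaluated on its ability to flag the held-out ones:

```python
import numpy as np

def open_set_split(labels, known_classes):
    """Split dataset indices into 'known' (kept for training) and
    'unknown' (held out to test novelty detection)."""
    known_mask = np.isin(labels, list(known_classes))
    return np.where(known_mask)[0], np.where(~known_mask)[0]

# E.g. train on MNIST digits 0-7 and hold out 8 and 9 as unknowns.
labels = np.array([0, 8, 3, 9, 7, 8])
known_idx, unknown_idx = open_set_split(labels, known_classes=range(8))
print(known_idx, unknown_idx)  # [0 2 4] [1 3 5]
```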
References
[1] Dongyu Meng and Hao Chen. MagNet: A Two-Pronged Defense against Adversarial Examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security – CCS '17, pages 135–147, Dallas, Texas, USA, 2017. ACM Press
[2] Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models. 2018
[3] Gokula Krishnan Santhanam and Paulina Grnarova. Defending Against Adversarial Attacks by Leveraging an Entire GAN. arXiv:1805.10652 [cs, stat], May 2018
[4] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv:1312.6199 [cs], December 2013
[5] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and Harnessing Adversarial Examples. arXiv:1412.6572 [cs, stat], December 2014
[6] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. arXiv:1607.02533 [cs, stat], July 2016
[7] Vishaal Munusamy Kabilan, Brandon Morris, and Anh Nguyen. VectorDefense: Vectorization as a Defense to Adversarial Examples. April 2018
[8] Nicholas Carlini and David Wagner. Defensive Distillation is Not Robust to Adversarial Examples. arXiv:1607.04311 [cs], July 2016
[9] Shiwei Shen, Guoqing Jin, Ke Gao, and Yongdong Zhang. APE-GAN: Adversarial Perturbation Elimination with GAN. arXiv:1707.05474 [cs], July 2017
[10] Hyeungill Lee, Sungyeob Han, and Jungwoo Lee. Generative Adversarial Trainer: Defense to Adversarial Perturbations with GAN. arXiv:1705.03387 [cs, stat], May 2017
[11] Diederik P. Kingma and Prafulla Dhariwal. Glow: Generative Flow with Invertible 1×1 Convolutions. arXiv:1807.03039, July 2018
[12] Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models. ICLR 2019
[13] Emilien Dupont, Arnaud Doucet, and Yee Whye Teh. Augmented Neural ODEs. arXiv:1904.01681, April 2019
[14] Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, and Balaji Lakshminarayanan. Do deep generative models know what they don't know? ICLR 2019
[15] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. DeepFool: a simple and accurate method to fool deep neural networks. arXiv:1511.04599 [cs], November 2015
[16] O. Vinyals, C. Blundell, T. Lillicrap, K. Kavukcuoglu, and D. Wierstra. Matching Networks for One Shot Learning. arXiv:1606.04080 [cs, stat], June 2016
[17] Y. Xian, C. H. Lampert, B. Schiele, and Z. Akata. Zero-Shot Learning – A Comprehensive Evaluation of the Good, the Bad and the Ugly. arXiv:1707.00600 [cs], July 2017
[18] G. Koch, R. Zemel, and R. Salakhutdinov. Siamese Neural Networks for One-shot Image Recognition
[19] F. Schroff, D. Kalenichenko, and J. Philbin. FaceNet: A Unified Embedding for Face Recognition and Clustering. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 815–823, June 2015
[20] R. Polikar, L. Upda, S. S. Upda, and V. Honavar. Learn++: an incremental learning algorithm for supervised neural networks. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 31, no. 4, pages 497–508, November 2001
[21] M. McCloskey and N. J. Cohen. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem. In Psychology of Learning and Motivation, vol. 24, G. H. Bower, Ed., Academic Press, 1989, pages 109–165
[22] Y. Freund and R. E. Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, vol. 55, no. 1, pages 119–139, August 1997
[23] D. Medera and Š. Babinec. Incremental Learning of Convolutional Neural Networks. 2009
[24] T. Xiao, J. Zhang, K. Yang, Y. Peng, and Z. Zhang. Error-Driven Incremental Learning in Deep Convolutional Neural Network for Large-Scale Image Classification. In Proceedings of the ACM International Conference on Multimedia – MM '14, Orlando, Florida, USA, 2014, pages 177–186
[25] Z. Li and D. Hoiem. Learning without Forgetting. arXiv:1606.09282 [cs, stat], June 2016
[26] S.-A. Rebuffi, A. Kolesnikov, G. Sperl, and C. H. Lampert. iCaRL: Incremental Classifier and Representation Learning. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pages 5533–5542
[27] Sumit Chopra, Raia Hadsell, and Yann LeCun. Learning a similarity metric discriminatively, with application to face verification. In Computer Vision and Pattern Recognition, 2005. CVPR
[28] K. Q. Weinberger, J. Blitzer, and L. K. Saul. Distance metric learning for large margin nearest neighbor classification. In NIPS, MIT Press, 2006