Lebeau, Eric
[UCL]
De Vleeschouwer, Christophe
[UCL]
Label errors can have a negative impact on the training of a convolutional neural network for image classification: learning these erroneous labels can degrade overall performance, yielding lower classification rates than expected. Even with a well-trained convolutional neural network, a decrease in performance can be observed.

The goal of this paper is first to study the impact of label errors, then to provide a tool that identifies label errors in order to purify the database, and finally to study a method for training a convolutional neural network that is more robust to label errors.

The purification technique presented in this paper defines multiple criteria, based on the neural network output, to distinguish label errors from classification errors. These criteria assign to each image a probability of carrying a label error. The training of the convolutional neural network then relies on these label-error probabilities: images suspected of having a label error have a reduced impact on the training.

The paper presents encouraging results obtained on a well-known database, MNIST. However, the use of a more complex database, such as CIFAR, shows that an efficient network is required to obtain the full benefit of the purification method.

A deeper analysis of the purification method in different settings is also provided. Further tests show the negative impact of overfitting in the presence of label errors. Small networks appear less affected by this overfitting, but their results are unfortunately not as good as those obtained with complex networks. Alternative solutions, such as removing all errors (label errors and classification errors) from training by assigning them zero weights, are also tested. Results show that these solutions provide a certain stability but do not seem optimal.
More creative solutions, based on iterative weights, give promising results.

Finally, a real test on a more complex database concludes the paper. This test confirms the previously observed results, but the overall performance and efficiency are not as good as for simpler databases like MNIST. The results appear to suffer from the non-random noise between similar characters. Nevertheless, they still confirm the interest of the method and suggest that the research should be pursued further.
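The weighting scheme described above — scoring each image's likelihood of carrying a label error from the network output, then reducing the impact of suspected images on training — can be sketched as follows. The abstract does not specify the thesis's exact criteria, so the score used below (one minus the predicted probability of the assigned label) and the function names are illustrative assumptions, not the actual method.

```python
import numpy as np

def label_error_scores(probs, labels):
    """Hypothetical criterion: 1 - predicted probability of the assigned label.
    A high score means the network disagrees with the label (possible label error)."""
    return 1.0 - probs[np.arange(len(labels)), labels]

def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Per-sample weighted cross-entropy: suspected label errors contribute less."""
    nll = -np.log(probs[np.arange(len(labels)), labels] + eps)
    return np.sum(weights * nll) / np.sum(weights)

# Toy batch: softmax outputs of a trained network for 3 images, 3 classes.
probs = np.array([[0.9, 0.05, 0.05],
                  [0.1, 0.8,  0.1],
                  [0.7, 0.2,  0.1]])
labels = np.array([0, 1, 2])  # third label disagrees with the network's prediction

scores = label_error_scores(probs, labels)  # highest for the third image
weights = 1.0 - scores                      # down-weight suspected label errors
loss = weighted_cross_entropy(probs, labels, weights)
```

Setting the weights of suspected images to zero recovers the "removal" variant tested in the paper, while recomputing the scores after each training round would correspond to an iterative-weight scheme.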
Bibliographic reference
Lebeau, Eric. Study of label errors on a convolutional neural network. Ecole polytechnique de Louvain, Université catholique de Louvain, 2017. Prom. : De Vleeschouwer, Christophe.
Permanent URL
http://hdl.handle.net/2078.1/thesis:13006