Alejandro Valverde has kicked off the 2018 season with an impressive series of wins. Meanwhile Vincenzo Nibali delighted the tifosi with his victory in Milan San Remo. It is pretty easy to tell these two riders apart in the pictures above, but could computer distinguish between them?
Following up on my earlier blogs about neural networks, I have been taking a look at the updated version of fast.ai’s course on deep learning. With the field advancing at a rapid pace, this provides a good way to staying up to date with the state of the art. For example, there are now a couple of cheaper alternatives to AWS for accessing high powered GPUs, offered by Paperspace and Crestle. The latest fast.ai libraries include many new tools that work extremely well in practice.
There’s a view that deep learning requires hours of training on high-powered supercomputers, using thousands (or millions) of labelled examples, in order to learn to perform computer vision tasks. However, newer architectures, such as ResNet, are able to run on much smaller data sets. In order to test this, I used an image downloader to grab photos of Nibali and Valverde and manually selected about 55 decent pictures of each one.
I divided the images into a training set with about 40 images of each rider, a validation set with 10 of each and a test set containing the rest. Nibali appears in a range of different coloured jerseys, though the Astana blue is often present. Valverde is mainly wearing the old dark blue Movistar kit with a green M. There were more close-up shots of Nibali’s face than Valverde.
I was able to fine-tune a pre-trained ResNet neural network to this task, using some of the techniques from the fast.ai tool box, each designed to improve generalisation. The first trick was to augment the training set by performing minor transformations of the images at random, such as taking a mirror image, shifting left or right and zooming in a bit. The second set of tricks varied the rate of learning as the algorithm iterated repeatedly through the training set. A final useful technique created a set of variants of each test image and took the average of the predictions. Everything ran at lightning speed on a Paperspace GPU. After a run time of just a few minutes, the ResNet was able to score 17 out of 20 on the following validation set.
The confusion matrix shows that the model correctly identified all the Nibali images, but it was wrong on three pictures of Valverde. The first incorrect image (below) shows Valverde in the red leader’s jersey of the Tour of Murcia, which is not dissimilar to Nibali’s new Bahrain Merida kit, though he was wearing red in two of his training images. In the second instance, the network was fooled by the change in colour of Moviestar’s kit, which had become rather similar to Astana’s light blue. The figure of 0.41 above the close-up image indicates that the model assigned only a 41% probability that the image was Valverde. It probably fell below the critical 50% level, in spite of the blue/green colours, because there were were far more close-up shots of Nibali than Valverde in the training set.
Overall of 17 out of 20 on the validation set is impressive. However, the network had access to the validation set during training, so this result is “in sample”. A proper “out of sample” evaluation of the model’s ability made use the following ten images, comprising the test set that was kept aside.
Amazingly, the model correctly identified 9 out of the 10 pictures it had not seen before. The only error was the Valverde selfie shown in the final image. In order to work better in practice, the training set would need to include more examples of the riders’ 2018 kit. A variant of the problem would be to identify the team rather than the rider. The same network can be trained for multiple classes rather than just two.
This experiment shows that it is pretty straightforward to run state of the art image recognition tools remotely on a GPU somewhere in the cloud and come up with pretty impressive results, even with a small data set.
The next blog describes how to identify a rider’s team.