Predicting the World Champion

A couple of years ago I built a model to evaluate how Froome and Dumoulin would have matched up, if they had not avoided racing against each other over the 2017 season. As we approach the 2019 World Championships Road Race in Yorkshire, I have adopted a more sophisticated approach to try to predict the winner of the men’s race. The smart money could be going on Sam Bennett.

Deep learning

With only two races outstanding, most of this year’s UCI world tour results are available. I decided to broaden the data set with 2.HC classification European Tour races, such as the OVO Energy Tour of Britain. In order to help with prediction, I included each rider’s weight and height, as well as some meta-data about each race, such as date, distance, average speed, parcours and type (stage, one-day, GC, etc.).

The key question was what exactly are you trying to predict? The UCI allocates points for race results, using a non-linear scale. For example, Mathieu Van Der Poel was awarded 500 points for winning Amstel Gold, while Simon Clarke won 400 for coming second and Jakob Fuglsang picked up 325 for third place, continuing down to 3 points for coming 60th. I created a target variable called PosX, defined as a negative exponential of the rider’s position in any race, equating to 1.000 for a win, 0.834 for second, 0.695 for third, decaying down to 0.032 for 20th. This has a similar profile to the points scheme, emphasising the top positions, and handles races with different numbers of riders.

A random forest would be a typical choice of model for this kind of data set, which included a mixture of continuous and categorical variables. However, I opted for a neural network, using embeddings to encode the categorical variables, with two hidden layers of 200 and 100 activations. This was very straightforward using the fast.ai library. Training was completed in a handful of seconds on my MacBook Pro, without needing a GPU.

After some experimentation on a subset of the data, it was clear that the model was coming up with good predictions on the validation set and the out-of-sample test set. With a bit more coding, I set up a procedure to load a start list and the meta-data for a future race, in order to predict the result.

Predictions

With the final start list for the World Championships Road Race looking reasonably complete, I was able to generate the predicted top 10. The parcours obviously has an important bearing on who wins a race. With around 3600m of climbing, the course was clearly hilly, though not mountainous. Although the finish was slightly uphill, it was not ridiculously steep, so I decided to classify the parcours as rolling with a flat finish

PositionRiderPrediction
1Mathieu Van Der Poel0.602
2Alexander Kristoff0.566
3Sam Bennett0.553
4Peter Sagan0.540
5Edvald Boasson Hagen0.507
6Greg Van Avermaet0.500
7Matteo Trentin0.434
8Michael Matthews0.423
9Julian Alaphilippe0.369
10Mike Teunissen0.362

It was encouraging to see that the model produced a highly credible list of potential top 10 riders, agreeing with the bookies in rating Mathieu Van Der Poel as the most likely winner. Sagan was ranked slightly below Kristoff and Bennett, who are seen as outsiders by the pundits. The popular choice of Philippe Gilbert did not appear in my top 10 and Alaphilippe was only 9th, in spite of their recent strong performances in the Vuelta and the Tour, respectively. Riders in positions 5 to 10 would all be expected to perform well in the cycling classics, which tend to be long and arduous, like the Yorkshire course.

For me, 25/1 odds on Sam Bennett are attractive. He has a strong group of teammates, in Dan Martin, Eddie Dunbar, Connor Dunne, Ryan Mullen and Rory Townsend, who will work hard to keep him with the lead group in the hillier early part of the race. Then he will then face an extremely strong Belgian team that is likely to play the same game that Deceuninck-QuickStep successfully pulled off in stage 17 of the Vuelta, won by Gilbert. But Bennett was born in Belgium and he was clearly the best sprinter out in Spain. He should be able to handle the rises near the finish.

A similar case can be made for Kristoff, while Matthews and Van Avermaet both had recent wins in Canada. Nevertheless it is hard to look past the three-times winner Peter Sagan, though if Van Der Poel launches one of his explosive finishes, there is no one to stop him pulling on the rainbow jersey.

Appendix

After the race, I checked the predicted position of the eventual winner, Mads Pedersen. He was expected to come 74th. Clearly the bad weather played a role in the result, favouring the larger riders, who were able to keep warmer. The Dane clearly proved to be the strongest rider on the day.

References

Code used for this project

Cycling Physique

It is easy to assume that successful professional cyclists are all skinny little guys, but if you look at the data, it turns out that they have an average height of 1.80m and an average weight of around 68kg. If we are to believe the figures posted on ProCyclingStats, hardly any professional cyclists would be considered underweight. In fact, they would struggle to perform at the required level if they did not maintain a healthy weight.

Taller than you might think

According to a study published in 2013 and updated in 2019, the global average height of adult males born in 1996 was 1.71m, but there is considerable regional variation. The vast majority of professional cyclists come from Europe, North America, Russia and the Antipodes where men tend to be taller than those from Asia, Africa and South America. For the 41 Colombians averaging 1.73m, there are 85 Dutch riders with a mean height of 1.84m. See chart below.

Furthermore, road cycling involves a range of disciplines, including sprinting and time trialling, where size and raw power provide an advantage. The peloton includes larger sprinters alongside smaller climbers.

Not as light as expected

While 68kg for a 1.80m male is certainly slim, it equates to a body mass index of 21 (BMI = weight / (height)²), which is towards the middle of the recommended healthy range. BMI is not a sophisticated measure, as it does not distinguish between fat and muscle. Since muscle is more dense than fat and cyclists tend to have it a higher percentage of lean body mass, they will look slimmer than a lay person of equivalent height and weight. Nevertheless doctors use BMI as a guide and become concerned when it falls below 18.5.

Smaller Colombians and taller Dutch professional cyclists have similar BMIs

The chart includes over 1,100 professional cyclists, but very few pros would be considered underweight. The majority of riders have a BMI of between 20 and 22. Although Colombian riders (red) tend to be smaller, specialising in climbing, their average BMI of 20.8 is not that different from larger Dutch riders (orange) with a mean BMI of 21.2. The taller Colombians include the sprinters Hodeg, Gaviria and Molano.

Types of rider

Weights and heights of a sample of top professional cyclists

This chart shows the names of a sample of top riders. All-out sprinters tend to have a BMI of around 24, even if they are small like Caleb Ewan. Sprints at the end of more rolling courses are likely to be won by riders with a BMI of 22, such as Greipel, van Avermaet, Sagan, Gaviria, Groenewegen, Bennet and Kwiatkowski. Time trial specialists like Dennis and Thomas have similar physiques, though Dumoulin and Froome are significantly lighter and remarkably similar to each other.

GC contenders Roglic, Kruiswijk and Gorka Izagirre are near the centre of the distribution with a BMI around 21, close to Viviani, who is unusually light for a sprinter. Pinot, Valverde, Dan Martin, the Yates brothers and Pozzovivo appear to be light for their heights. Interestingly climbers such as Quintana, Uran, Alaphilippe, Carapaz and Richie Porte all have a BMI of around 21, whereas Lopez is a bit heavier.

If the figures reported on ProCyclingStats are accurate, George Bennet and Emanuel Buchmann are significantly underweight. Weighting 58kg for a height of 1.80m does not seem to be conducive to strong performance, unless they are extraordinary physical specimens.

Conclusions

Professional cyclists are lean, but they would not be able to achieve the performance required if they were underweight. It is possible that the weights of individual riders might vary over time by a couple of kilos, moving them a small amount vertically on the chart, but scientific approaches are increasingly employed by expert nutritionists to avoid significant weight loss over longer stage races. The Jumbo Foodcoach app was developed alongside the Jumbo-Visma team and, working with Team Sky, James Morton strove to ensure that athletes fuel for the work required. Excessive weight loss can lead to a range of problems for health and performance.

References

Code used for this analysis

Betting on the Tour

Source: ASO Media

On the eve of the Tour de France, the pundits have made their predictions, but when the race is over, they will be long forgotten. One way of checking your own forecasts is to take a look at the odds offered on the betting markets. These are interesting, because they reflect the actions of people who have actually put money behind their views. In an efficient and liquid market, the latest prices ought to reflect all information available. This blog takes a look at the current odds, without wishing to encourage gambling in any way.

The website oddchecker.com collates the odds from a number of bookmakers across a large range of bets. It is helpful to convert the odds into predicted probabilities. Focussing on the overall winner,  Egan Bernal is the favourite at 5/2 (equating to a 29% probability taking the yellow jersey), followed by Geraint Thomas at 7/2 (22%) and Jakob Fuglsang at 6/1 (14%). This gives a 51% chance of a winner being one of the two Team Ineos riders. The three three leading contenders are some distance ahead of Adam Yates, Richie Porte, Thibaut Pinot and Nairo Quintana. Less fancied riders include Roman Bardet, Steven Kruijswijk, Rigoberto Uran, Mikel Landa, Enric Mas and Vincenzo Nibali. Anyone else is seen as an outsider.

Ups and downs

The odds change over time, as the markets evaluate the performance and changing fortunes of the riders. In the following chart shows the fluctuations in the average daily implied winning chances of the three current favourites since the beginning of the year, according to betfair.com.

The implied probability that Geraint Thomas would repeat last year’s win has hovered between 20% and 30%, spiking up a bit during the Tour of Romandie. Unfortunately, Chris Froome’s odds are no longer available, as he was most likely the favourite earlier this year. However, his crash on 11 June instantaneously improved the odds for other riders, particularly Thomas and Bernal, though expectations for the Welshman declined after he crashed out of the Tour de Suisse on 18 June.

The betting on Fuglsang spiked up sharply during the Tirreno Adriatico, where he won a stage and came 3rd on GC, and the Tour of the Basque country, where he finished strongly. Apparently, his three podium results in the the Ardenne Classics had no effect on his chances of a yellow jersey, whereas his victory in the Critérium Dauphiné had a significant positive impact.

Egan Bernal, appeared from the shadows. At the beginning of the year, he was seen as a third string in Team Ineos. His victory in Paris Nice hardly registered on his odds for the Tour. But since Froome’s crash and Thomas’s departure from the Tour de Suisse, he became the bookies’ favourite.

With 65% of the money on the three main contenders, there are some pretty good odds available on other riders. A couple of crashes, an off day or a bit of bad luck could turn the race on its head. Clearly the Ineos and Astana teams are capable of protecting their GC contenders, but so too are Movistar, EF Education First, Michelton Scott, Groupama-FDJ, Bahrain Merida and others.

References

Code I used can be found here

Fuel for the work required: periodisation of carbohydrate intake

screenshot2019-01-31at16.06.16
Fuel for the work required, Impey et al, Sports Med (2018) 48:1031–1048

Last week I attended an event announcing the forthcoming launch of a new fitness app called Pillar. It offers combined training and nutrition advice to help athletes achieve their goals. Pillar is backed by a strong scientific team including Professor James Morton, Team Sky Head of Performance Nutrition, and Professor Graeme Close, England Rugby Head of Performance Nutrition.

James Morton gave a fascinating presentation about the periodisation of carbohydrate (CHO) fuelling, including a detailed description of the nutrition strategy he created to support Chris Froome’s famous 80km attack on stage 19 of the 2018 Giro d’Italia. His recent paper explains the underlying science. These are some of the key points.

  • Always go into competition fully fuelled with carbohydrate
    • Well-fuelled athletes perform for longer at higher intensities than those with depleted reserves
    • Basic biochemistry: fat burning is too slow and supplies of the phosphocreatine are too small to sustain intensities over 85% of VO2max
    • Theory is backed up by experiment
  • There are pros and cons to training with low levels of carbohydrate
    • Positive effects: Improved fat burning, changes in cell signalling, gene expression and enzyme/protein activity, potential to save precious glycogen stores for crucial attacks later in a race
    • Negative effects: Inconsistent evidence of improved performance, ability to complete training session may be compromised, reduced immunity, risks to bone health, loss of top end for those on high fat/low carb (ketogenic) diet
  • Different ways to train with low carbohydrate
    • doing two sessions in one day with minimal refuelling
    • low carb evening meal and breakfast: sleep low, train low the next morning
    • fasted rides
    • high fat/low carb diet

Is there a structured method of training that provides the benefits without the negatives?

  • The authors propose a glycogen threshold hypothesis
    • Positive effects seem to be dependent on commencing with muscle glycogen levels within a specific range
    • Levels have to be low enough to promote positive effects
    • But when too low, protein synthesis may be impaired and the ability to complete sessions is compromised
  • This leads to the idea of periodising carbohydrate consumption, meal by meal, around planned training sessions
  • “Fuelling for the work required”
    • low carbs before and during lighter training sessions
    • high carbs in preparation for and during rides with greater intensities
    • always refuel after training
  • The diagram above provides an example for an elite endurance cyclist
    • The red, amber, green colour coding indicates low, medium or high carbohydrate consumption
    • On day 1, the athlete aims to “train high” for a hard session
    • A lighter evening meal on day 1 prepares to “sleep low, train low” ahead of a lower intensity session on day 2
    • Carbohydrate intake rises after exercise on day 2 in anticipation of a high intensity session on day 3
    • Fuelling is moderated on the evening of day 3 as day 4 is assigned as a recovery day
    • Carbohydrate rises later on day 4 to prepare for the next block of training
  • The Pillar app aims to provide these leading edge scientific principles to amateur cyclists and other athletes

In order to put this into action, you need to know how much carbohydrate you are consuming. My assumption has been that my diet is reasonably healthy, but I have never actually measured it. So I have been experimenting with free app MyFitnessPal that can be downloaded onto your phone. This provides a simple and convenient way to track the nutritional composition of your diet, including a barcode scanner that recognises most foods. You can link it to other apps such as Training Peaks to take account of energy expended. However, neither of these tools plans nutrition ahead of training sessions. Pillar aims to fill this gap. It will be interesting to see whether this turns out to be successful.

References

Fuel for the Work Required: A Theoretical Framework for Carbohydrate Periodization and the Glycogen Threshold Hypothesis, SG Impey, MA Hearris, KM Hammond, JD Bartlett, J Louis, G Close, JP Morton, Sports Med (2018) 48:1031–1048, https://doi.org/10.1007/s40279-018-0867-7

Fuel for the work required: a practical approach to amalgamating train-low paradigms for endurance athletes, Impey SG, Hammond KM, Shepherd SO, Sharples AP, Stewart C, Limb M, Smith K, Philp A, Jeromson S, Hamilton DL, Close GL, Morton JP, Physiol Rep. 2016 May;4(10). pii: e12803. doi: 10.14814/phy2.12803

Low carbohydrate, high fat diet impairs exercise economy and negates the performance benefit from intensified training in elite race walkers, Burke LM, Ross ML, Garvican-Lewis LA, Welvaert M, Heikura IA, Forbes SG, Mirtschin JG, Cato LE, Strobel N, Sharma AP, Hawley JA.  J Physiol. 2017;595:2785–807

Low energy availability assessed by a sport-specific questionnaire and clinical interview indicative of bone health, endocrine profile and cycling performance in competitive male cyclists, BMJ Open Sport & Exercise Medicine,https://doi.org/10.1136/bmjsem-2018-000424

Fuelling for Cycling Performance

CF
Chris Froome (LaPresse)

Some commentators were skeptical of Team Sky’s explanation for Chris Froome’s 80km tour-winning attack on stage 19 of the Giro. His success was put down to the detailed planning of nutrition throughout the ride, with staff positioned at strategic refuelling points along the entire route.  If you consider how skeletal the riders look after two and a half weeks of relentless competition, along with the limits on what can be physically absorbed between stages, the nutrition story makes a lot of sense. Did Yates, Pinot and Aru dramatically fall by the wayside simply because they ran out of energy?

The best performing cyclists have excellent balancing skills. This includes the ability to match energy intake with energy demand. The pros benefit from teams of support staff monitoring every aspect of their nutrition and performance. However, many serious club-level cyclists pick up fads and snippets of information from social media or the cycling press that lead them to try out all kinds ideas, in an unscientific manner, in the hope of achieving an improvement in performance. Some of these activities have potentially harmful effects on the body.

Competitive riders can become obsessed with losing weight and sticking to extremely tough training schedules, leading to both short-term and long-term energy deficits that are detrimental to both health and performance. One of the physiological consequences can be a reduction in bone density, which is particularly significant for cyclists, who do not benefit from gravitational stress on bones, due to the non-weight-bearing nature of the sport. In a recent paper, colleagues at Durham University and I describe an approach for identifying male cyclists at risk of Relative Energy Deficit in Sport (RED-S).

You need a certain amount of energy simply to maintain normal life processes, but an athlete can force the body into a deficit in two ways: by intentionally or unintentionally restricting energy intake below the level required to meet demand or by increasing training load without a corresponding increase in fuelling.

EnergyBalance

Our bodies have a range of  ways to deal with an energy deficit. For the average, slightly overweight casual cyclist, burning some fat is not a bad thing. However, most competitive cyclists are already very lean, making the physiological consequences of an energy deficit more serious. Changes arise in the endocrine system that controls the body’s hormones. Certain processes can shut down, such as female menstruation, and males can experience a reduction in testosterone. Sex steroids are important for maintaining healthy bones. In our study of 50 male competitive cyclists, the average bone density in the lumbar spine, measured by DXA scan, was significantly below normal. Some relatively young cyclists had the bones of a 70 year old man!

The key variable associated with poor bone health was low energy availability, i.e. male cyclists exhibiting  RED-S. These riders were identified using a questionnaire followed by an interview with a Sports Endocrinologist. The purpose of the interview was to go through the responses in more detail, as most people have a tendency to put a positive spin on their answers. There were two important warning signs.

  • Long-term energy deficit: a prolonged significant weight reduction to achieve “race weight”
  • Short-term energy deficit: one or more fasted rides per week

Among riders with low energy availability, bone density was not so bad for those who had previously engaged in a weight-bearing sport, such as running. For cyclists with adequate energy availability, those with vey low levels of vitamin D had weaker bones. Across the 50 cyclists, most had vitamin D levels below the level of 90 nmol/L recommended for athletes, including some who were taking vitamin D supplements, but clearly not enough. Studies have shown that the advantages of athletes taking vitamin D supplements include better bone health, improved immunity and stronger muscles, so why wouldn’t you?

In terms of performance, British Cycling race category was positively related with a rider’s power to weight ratio, evaluated by 60 minute FTP per kg (FTP60/kg). Out of all the measured variables, including questionnaire responses, blood tests, bone density and body composition, the strongest association with FTP60/kg was the number of weekly training hours. There was no significant relationship between percentage body fat and FTP60/kg. So if you want to improve performance, rather than starving yourself in the hope of losing body fat, you are better off getting on your bike and training with adequate fuelling.

Cyclists using power meters have the advantage of knowing exactly how many calories they have used on every ride. In addition to taking on fuel during the ride, especially when racing, the greatest benefits accrue from having a recovery drink and some food immediately after completing rides of more than one hour.

For those wishing to know more about RED-S, the British Association of Sports and Exercise Medicine has provided a web resource.

A related blog will explore the machine learning and statistical techniques used to analyse the data for this study.

References

Low energy availability assessed by a sport-specific questionnaire and clinical interview indicative of bone health, endocrine profile and cycling performance in competitive male cyclists, BMJ Open Sport & Exercise Medicine,https://doi.org/10.1136/bmjsem-2018-000424

Relative Energy Deficiency in Sport, British Association of Sports and Exercise Medicine

Synergistic interactions of steroid hormones, British Journal of Sports Medicine

Cyclists: Make No Bones About It, British Journal of Sports Medicine

Male Cyclists: bones, body composition, nutrition, performance, British Journal of Sports Medicine

 

What are you looking at?

Screen Shot 2018-05-06 at 18.48.40.png

In a recent blog, I described an experiment to train a deep neural network to distinguish between photographs of Vincenzo Nibali and Alejandro Valverde, using a very small data set of images. In the conclusion, I suggested that the network was probably basing its decisions more on the colours of the riders’ kit rather than on facial recognition. This article investigates what the network was actually “looking at”, in order to understand better how it was making decisions.

The issues of accountability and bias were among the topics discussed at the last NIPS conference. As machine learning algorithms are adopted across industry, it is important for companies to be able to explain how conclusions are reached. In many instances, it is not acceptable simply to rely on an impenetrable black box. AI researchers and developers need to be able to explain what is going on inside their models, in order to justify decisions taken. In doing so, some worrying instances of bias have been revealed in the selection of data used to train the algorithms.

I went back to my rider recognition model and used an approach called “Class Activation Maps” to identify which parts of the images accounted for the network’s choice of rider. Making use of the code provided in lesson 7 of the course offered by fast.ai, I took advantage of my existing small set of training, validation and test images of the two famous cyclists. Starting with a pre-trained version of ResNet34, the idea was to replace the last two layers with four new ones, the crucial one being a convolutional layer with two outputs, matching the number of cyclists in the classification task. The two outputs of this layer were 7×7 matrix representations of the relevant image.

The final predictions of the model came from a softmax of a flattened average pooling of these 7×7 representations. The softmax output gave the probabilities of Nibali and Valverde respectively. Since there was no learning beyond the final convolution, the activations of the two 7×7 matrices represented the “Nibali-ness” and “Valverde-ness” of the image. This could be displayed as a heat map on top of the image.

Examples are shown below for the validation set of 10 images of Nibali followed by 10 of Valverde. The yellow patch of the heat map highlights the part of the image that led to the prediction displayed above each image. Nine out of ten were correct for Nibali and six for Valverde.

Screen Shot 2018-05-06 at 18.10.00.png
Class Activation Maps applied to the validation set

The heat maps were very helpful in understanding the model’s decision making process. It seemed that for Nibali, his face and helmet were important, with some attention paid to the upper part of his blue Astana kit. In contrast, the network did a very good job at identifying the M on Valverde’s Moviestar kit. It was interesting to note that the network succeeded in spotting that Nibali was wearing a Specialized helmet whereas Valverde had a Catlike design. Three errors arose in the photos of his face, which was mistaken for Nibali’s. In fact, any picture of a face led to a prediction of Nibali, as demonstrated by the cropped image below that was used for training.

Screen Shot 2018-05-06 at 18.21.58

Why should that be? Looking back at the training set, it turned out that, by chance, there were far more mugshots of Nibali, while there were more photos of Valverde riding his bike, with his face obscured by sunglasses. This was an example of unintentional bias in the training data, providing a very useful lesson.

The final set of pictures shows the predictions made on the out-of-sample test set. All the predictions are correct, except the first one, where the model failed to spot the green M on Valverde’s chest and mistook the blurred background for Nibali. Otherwise the results confirmed that the network looked at Nibali’s face, the rider’s helmet or Valverde’s kit. It also remembered seeing an image of Nibali holding the Giro trophy in the training set.

Screen Shot 2018-05-06 at 18.34.38.png
Class Activation Maps applied to the test set

In conclusion, Class Activation Maps provide a useful way of visualising the activations of hidden laters in a deep neural network. This can go some way to accounting for the decisions that appear in the output. The approach can also help identify unintentional bias in the training set.

Which team is that?

Screen Shot 2018-04-11 at 11.18.09

My last blog explored the effectiveness of deep learning in spotting the difference between Vincenzo Nibali and Alejandro Valverde. Since the faces of the riders were obscured in many of the photos, it is likely that the neural network was basing its evaluations largely on the colours of their team kit. A natural next challenge is to identify a rider’s team from a photograph. This task parallels the approach to the kaggle dog breed competition used in lesson 2 of the fast.ai course on deep learning.

Eighteen World Tour teams are competing this year. So the first step was to trawl the Internet for images, ideally of riders in this year’s kit. As before, I used an automated downloader, but this posed a number of problems. For example, searching for “Astana” brings up photographs of the capital of Kazakhstan. So I narrowed things down by searching for  “Astana 2018 cycling team”. After eliminating very small images, I ended up with a total of about 9,700 images, but these still included a certain amount of junk that I did have the time to weed out, such as photos of footballers or motorcycles in the “Sky Racing Team”,.

The following small sample of training images is generally OK, though it includes images of Scott bikes rather than Mitchelton-Scott riders and  a picture of  Sunweb’s Wilco Kelderman labelled as FDJ. However, with around 500-700 images of each team, I pressed on, noting that, for some reason, there were only 166 of Moviestar and these included the old style kit.

Screen Shot 2018-04-11 at 10.18.54.png
Small sample of training images

For training on this multiple classification problem, I adopted a slightly more sophisticated approach than before. Taking a pre-trained Resnet50 model, I performed some initial fine-tuning, on images rescaled to 224×224. I settled on an optimal learning rate of 1e-3 for the final layer, while allowing some training of lower layers at much lower rates. With a view to improving generalisation, I opted to augment the training set with random changes, such as small shifts in four directions, zooming in up to 10%, adjusting lighting and left-right flips. After initial training, accuracy was 52.6% on the validation set. This was encouraging, given that random guesses would have achieved a rate of 1 in 18 or 5.6%.

Taking a pro tip from fast.ai, training proceeded with the images at a higher resolution of 299×299. The idea is to prevent overfitting during the early stages, but to improve the model later on by providing more data for each image. This raised the accuracy to 58.3% on the validation set. This figure was obtained using a trick called “test time augmentation”, where each final prediction is based on the average prediction of five different “augmented” versions of the image in question.

Given the noisy nature of some of the images used for training, I was pleased with this result, but the acid test was to evaluate performance on unseen images. So I created a test set of two images of a lead rider from each squad and asked the model to identify the team. These are the results.

75 percent right.png
75% accuracy on the test set

The trained Resnet50 correctly identified the teams of 27 out of 36 images. Interestingly, there were no predictions of MovieStar or Sky. This could be partly due to the underrepresentation of MovieStar in the training set. Froome was mistaken for AG2R and Astana, in column 7, rows 2 and 3. In the first image, his 2018 Sky kit was quite similar to Bardet’s to the left and in the second image the sky did appear to be Astana blue! It is not entirely obvious why Nibali was mistaken for Sunweb and Astana, in the top and bottom rows. However, the huge majority of predictions were correct. An overall  success rate of 75% based on an afternoon’s work was pretty amazing.

The results could certainly be improved by cleaning up the training data, but this raises an intriguing question about the efficacy of artificial intelligence. Taking a step back, I used Bing’s algorithms to find images of cycling teams in order to train an algorithm to identify cycling teams. In effect, I was training my network to reverse-engineer Bing’s search algorithm, rather than my actual objective of identifying cycling teams. If an Internet search for FDJ pulls up an image of Wilco Kelderman, my network would be inclined to suggest that he rides for the French team.

In conclusion, for this particular approach to reach or exceed human performance, expert human input is required to provide a reliable training set. This is why this experiment achieved 75%, whereas the top submissions on the dog breeds leaderboard show near perfect performance.