Which Lord of the Rings characters do they look like? Ask an AI.
After building an app that uses deep learning to recognise Lord of the Rings characters, I had a bit of fun feeding in pictures of professional cyclists. This blog explains how the app works. If you just want to try it out yourself, you can find it here, but note that may need to be fairly patient, because it can take up to 5 minutes to fire up for the first time… it does start eventually.
Identifying wizards, hobbits and elves
The code that performs this task was based on the latest version of the excellent fast.ai course Practical Deep Learning for Coders. If you have done bit of programming in Python, you can build something like this yourself after just a few lessons.
The course sets out to defy some myths about deep learning. You don’t need to have a PhD in computer science – the fastai library is brilliantly designed and easy to use. Python is the language of choice for much of data science and the course runs in Jupyter notebooks.
You don’t need petabytes of data – I used fewer than 150 sample images of each character, downloaded using the Bing Image Search API. It is also straightforward to download publicly available neural networks within the fastai framework. These have been pre-trained to recognise a broad range of objects. Then it is relatively quick to fine-tune the parameters to achieve a specific task, such as recognising about 20 different Tolkien characters.
You don’t need expensive resources to build your models – I trained my neural network in just a few minutes, using a free GPU available on Google’s Colaboratory platform. After transferring the essential files to a github repository, I deployed the app at no cost, using Binder.
Thanks to the guidance provided by fastai, the whole process was quick and straightforward to do. In fact, by far the most time consuming task was cleaning up the data set of downloaded images. But there was a trick for doing this. First you train your network on whatever images come up in an initial search, until it achieves a reasonable degree of accuracy. Then take a look at the images that the model finds the most difficult to classify. I found that these tended to be pictures of lego figures or cartoon images. With the help of a fastai tool, it was simple to remove irrelevant images from the training and validation sets.
After a couple of iterations, I had a clean dataset and a great model, giving about 70% accuracy, which as good enough my purposes. Some examples are shown in the left column at the top of this blog.
The model’s performance was remarkably similar to my own. While Gollum is easy to identify, the wizard Saruman can be mistaken for Gandalf, Boromir looks a bit like Faramir and the hobbits Pippin and Merry can be confused.
Applications outside Middle Earth
One of the important limits of these types of image recognition models is that even if they work well in the domain in which they have been trained, they cannot be expected do a good job on totally different images. Nevertheless, I thought it would be amusing to supply the pictures of professional cyclists, particularly given the current vogue for growing facial hair.
My model was 87% sure that Peter Sagan was Boromir, but only 81.5% confident in the picture of Sean Bean. It was even more certain that Daniel Oss played the role of Faramir. Geraint Thomas was predicted to be Frodo Baggins, but with much lower confidence. I wondered for a while with Tadej Pogacar should be Legolas, but perhaps the model interpreted his outstretched arms as those of an archer.
I hoped that a heavily bearded Bradley Wiggins might come out as Gimli, but that did not not seem to work. Nevertheless it was entertaining to upload photographs of friends and family. With apologies for any waiting times to get to it running, you can try it here.
A report in VeloNews on the eve of the Tour de France stated that the French government had insisted that the “two strikes and you are out” policy must be enforced by the ASO. This means that if two positive COVID-19 test arise within a team or its support staff, the team will be removed from the race. This raises the possibility of the yellow jersey rider being ejected from the race if, for example, two mechanics record positive tests. This would be particularly unjust if it turns out that a test result was a false positive. So what are the chances that this might happen?
One of the great frustrations of the reporting on COVID testing has been the lack of clarity about what type of testing is being discussed. Tests fall in to two categories. Antigen tests use a sample from a nasal or pharyngeal swab to detect patients who currently have the disease, whereas antibody tests use a blood sample to identify patients who have developed antibodies as a result of exposure to the disease in the past – more than 28 days earlier.
There are two general types of antigen test. Real time polymerase chain reaction (RT-PCR) tests looks for specific viral fragments and need to be conducted in a laboratory, typically requiring at least 24 hours for a result. Less reliable rapid tests look for proteins associated with the COVID-19 virus, producing results in as little as 15 minutes.
The UCI requires riders and staff to be tested using RT-PCR, which is a very reliable method, having both high sensitivity (ability to detect those with the disease) and high specificity (ability to clear those without the disease). The relevant question for the Tour de France is the probability of a false positive RT-PRC test. Indeed Larry Warbass recently said he thought his result was a false positive, as he had experienced no symptoms and had maintained strict self isolation during training.
The evidence indicates that the machines performing the RT-PRC test are extremely unlikely to generate a false positive, because the test needs to find significant levels of three different targets to confirm the presence of COVID-19. In FDA experiments, 100% of negatives where correctly identified – there were no false positives. However, it remains possible that, in the moving circus of the Tour de France, a sample could become contaminated before it is tested or that samples might somehow be mislabelled. A high level of responsibility falls on the shoulders of team doctors to minimise these risks, but we can never be sure that it is zero.
One in a thousand
As a thought experiment, suppose that a negative RT-PCR test is 99.9% reliable, i.e. that one COVID-free person in a thousand somehow produces a false positive result. What is the chance that a team is unjustly sent home from the Tour?
Each team has eight riders plus support staff. Although teams might want to reduce the number of staff in the team bubble, it may be necessary to have extra catering staff in order to remain self sufficient. Let us assume an average of 17 staff on each of the 21 teams and that everybody has passed the required two negative tests prior to the start of the race. Assume further that nobody contracts COVID-19 throughout the race.
It has been indicated that everyone will be tested on the two rest days. Reassuringly, the probability of two or more false positives in a single team bubble of 25 people would be 0.03% (1-0.999^(25*2)). However, the probability that every team rider receives a negative result would be only 85% (0.999^168), meaning that there would be a 15% chance that at least one rider is unjustly ejected from the race. In fact, since at total of 1,050 tests would be taken by everyone in a team bubble, the chance of at least one person receiving a false positive would be surprisingly high: 65% (1- 0.999^1050).
Perhaps the assumption of 1 in a thousand false positives was a bit alarmist. Reducing it to 1 in thousand still produces a probability of 10% that somebody would be sent home during the Tour.
In some situations, draconian sanctions might deter team members or staff from reporting symptoms. One could imagine a soigneur or mechanic having to go home quietly after mysteriously spraining a wrist. However, this could create very negative press coverage if word got out that this person was infected.
Furthermore, the UCI rules place responsibility on the teams and specifically the team doctors to apply strict daily monitoring and controls to detect suspected COVID-19 cases.
While in the above scenarios no one actually contracted COVID-19, there is, of course a not inconsiderable chance that one of the 525 people in the team bubbles does actually become infected. If the virus spreads to more than one team, the whole race could become a fiasco.
But let’s keep our fingers crossed and hope Tour makes it to the Champs Elysées.
It was shocking to see footage of Remco Evenepoel’s horrific crash in Il Lombardia. Reports indicate that he broke his pelvis after falling from a bridge into a ravine. This follows the injuries sustained by his Deceuninck-QuickStep team mate Fabio Jakobsen in the Tour of Poland.
The video above shows the repairs to my pelvis carried out by the specialist team at St George’s Hospital. My accident was less spectacular than Remco’s, I just hit a large pothole, while riding in the Kent lanes last March. It took the ambulance two and a half hours to arrive, as this was just at the beginning of the COVID-19 crisis. In fact, lock-down was announced on the evening of my crash. There was a lot of uncertainty about the virus back then, so it was a pretty scary time to be in hospital. Nevertheless I have immense respect and gratitude for the NHS staff who looked after me.
I was given crutches the day after the operation and returned home the day after that, with strict instructions to remain non-weight-bearing on the injured leg for six weeks and then only partial weight-bearing for the next six weeks. An NHS physiotherapist contacted me and regularly provided a progression of exercises. I set myself additional challenges, like doing extra press-ups.
After six weeks of doing no proper exercise, I had lost 4kg. The circumference of my left thigh was 5cm less than the right. However, following a review at the hospital, I was given permission to start gentle exercise on my static turbo trainer. I began by removing the left pedal and performing single leg drills, but after a couple of days it was easier to put my injured leg on the pedal as a passenger. This also gave the hip some mobility.
After a week on the turbo, I was up to one hour a day at about 160 watts. It took a long time to increase this above 200 watts. I watched a lot of old cycling films, without any particular urge to go on Zwift. I started riding outside in mid-June, 12 weeks post op. My Garmin pedals allowed me to monitor the left-right balance as well as average power.The following chart shows that 21 weeks after my accident, balance is hovering around 48:52 and five minute power is back over 300 watts.
The psychological aspect of rehabilitation has been very important. I have focussed on targets and deadlines, marking each little achievement as a milestone. I am now walking without a limp, though running is still off limits. I even went kitesurfing a couple of weeks ago (don’t tell my surgeon about that one). I have been busy learning Italian, composing music and programming in Python.
Since heading back out on the roads, I have been riding cautiously, as my hip will not regain full strength until next spring. I plan to enter a couple of time trials to rekindle a sense of competition, without the danger of riding in a peloton. Racing again next season remains a goal.
Probably the most important mental aspect has been to stay positive at all times and never to spend time feeling sorry for oneself. This has been difficult as, inevitably, there have been a couple of set-backs when progress has seemed to reverse. But on the whole, my recovery has been astounding and, like Chris Froome, I remain optimistic about regaining my peak.
Remco will be back on the road next season, with the potential to pick up some results later in the year.
As a growing number of people seek to educate themselves on coronavirus COVID-19, while confined to their homes, a better understanding can be gained by taking a look at how to model an epidemic.
Researchers have created highly complex models of the spread of infections. For example, BlueDot’s disease-tracking model, described in this podcast, monitors the Internet with AI language translators and evaluates the network effects of transmission based on air travel itineraries. However, a surprising amount of insight can be gained from a very simple approach called the SIR model.
The SIR model divides the population into three classes. The susceptible class (S) includes everyone who can catch the infection. In the case of a novel virus like corona, it seems that the entire global population was initially susceptible. The infected class (I) includes all those currently infected and able to transmit the virus to susceptible people. The removed class (R) includes everyone who has recovered from the virus or, unfortunately, died. In the model, these people no longer transmit the disease nor are they susceptible. The idea is that people move from the susceptible class to the infected class to the removed class.
Although there is much focus in the media on the exponential rise of the total number of cases of coronavirus, this figure includes recoveries and deaths. In one sense this is a huge underestimate, because the figures only includes people who have taken a test and returned a positive result. As explained by Tomas Pueyo, many people do not display symptoms until around 5 days after infection and for over 90% these symptoms are mild, so there could be ten times more people infected than the official figures suggest. In another sense, the figures are a huge exaggeration, because people who have recovered are unlikely to be infectious, because their immune systems have fought off the virus.
The SIR model measures the number of infectious people. On the worldometers site these are called “active cases”. The critical insight of the SIR model, shown in the diagram above, is that the class of infected people grows if the daily number of new cases exceeds the number of closed cases.
Closed cases – removal rate
In a real epidemic, experts don’t really know how many people are infected, but they can keep track of those who have died or recovered. So it is best to start by considering the rate of transfer from infected to removed. After some digging around, it appears that the average duration of an infection is about 2 weeks. So in a steady state situation, about on person in 14 or 7% of those infected would recover every day. This percentage would be a bit less than this if the epidemic is spreading fast, because there would be more people who have recently acquired the virus, so let’s call it 6%. For the death rate, we need the number of deaths divided by the (unknown) total number of people infected. This is likely to be lower than the “case fatality rate” reported on worldometers because that divides the number of deaths only be the number of positive tests. The death rate is estimated to be 2-3%. If we add 3% to the 6% of those recovering, the removal rate (call it “a”) is estimated to be 9%.
In the absence of a cure or treatment for the virus, it is unlikely that the duration of infectiousness can be reduced. As long as hospitals are not overwhelmed, those who might otherwise have died may be saved. However, there is not much that governments or populations can do to speed up the daily rate of “closed cases”. The only levers available are those to reduce the number of “new cases” below 9%. This appeared to occur as a result of the draconian actions taken in China in the second half of February, but the sharp increase in new cases that became apparent over the weekend of 6/7 March spooked the financial markets.
New cases – infection rate
In the SIR model, the number of new infections depends on three factors: the number of infectious people, the number of susceptible people and the something called the infection rate (r), which measures the probability that an infected person passes on the virus to a susceptible person either through direct contact or indirectly, for example, by contaminating a surface, such as a door handle.
Governments can attempt to reduce the number of new infections by controlling each of the three driving factors. Clearly, hand washing and avoiding physical contact can reduce the infection rate. Similarly infected people are encouraged to isolate themselves in order to reduce the proportion of susceptible people exposed to direct or indirect contact. Guidance to UK general practitioners is to advise patients suffering from mild symptoms to stay at home and call 111, while those with serious symptoms should call 999.
When will it end?
As more people become infected, the number of susceptible people naturally falls. Until there is a vaccine, there is nothing governments can do to speed up this decline. Eventually, when enough people will have caught the current strain of the virus, so-called “herd immunity” will prevail. It is not necessary for everyone to have come into contact with the virus, rather it is sufficient for the number of susceptible people to be smaller than a critical value. Beyond this point infected people, on average, recover before they encounter a susceptible person. This is how the epidemic will finally start to die down.
When people refer to the transmission rate or reproduction rate of a virus, they mean the number of secondary infections produced by a primary infection across a susceptible population. This is equal to the number of susceptible people times the infection rate divided by the removal rate. This determines the threshold number of susceptible people below which the number of infections falls. The critical value is equal to removal rate relative to the infection rate (a/r) . When the number of susceptible people falls to this critical value, the number of infected people will reach a peak and subsequently decline. More susceptible people will still continue to be infected, but at a decreasing rate, until the infection dies out completely, by which time a significant part of the population will have been infected.
This looks very scary indeed
Running the figures through the SIR model produces some extremely scary predictions. At the time of writing, new cases of infection were rising at a rate of about 15%. At this rate, the virus could spread to 2/3 of the population before it dies out. If three percent of those infected die, the virus would kill 2 percent of the population. Based on results so far these would be largely elderly people or those suffering from complications, so it is extremely important that they are protected from infection. If the virus continues to run out of control, the number of deaths could run into the millions before the epidemic ends.
It is absolutely essential to reduce the infection rate and keep it low, particularly among the elderly and vulnerable groups
We should watch China carefully for new cases. If none arise, it suggests that a large proportion of the population gained immunity through infection, even though lower numbers of infections were reported. However, if the imposition of constraints temporary reduced the infection rate, leaving a large susceptible pool still vulnerable, the epidemic could re-emerge once the constraints are relaxed.
What to do?
The imposition of governments restrictions on travel and large gatherings are forcing the people to rethink their options. Where possible, office workers, university students, schoolchildren and sportsmen may find themselves congregating online in virtual environments rather than in the messy and dangerous real world.
Among cyclists, this ought to be good news for Zwift and other online platforms. Zwift seems to be particularly well-positioned, due to its strong social aspect, which allows riders to meet and race against each other in virtual races. It also has the potential to allow world tour teams to compete in virtual races.
In fact the ability to meet friends and do things together virtually would have applications across all walks of life. Sports fans need something between the stadium and the television. Businesses need a medium that fills the gap between a physical office and a conference call. Schools and universities require better ways to ensure that students can learn, while classrooms and lecture theatres are closed. These innovations may turn out to be attractive long after the coronavirus scare has subsided.
While the prospect of going to the pub or a crowded nightclub loses its appeal, cycling offers the average person a very attractive alternative way to meet friends while avoiding close proximity with large groups of people. As the weather improves, the chance to enjoying some exercise in the fresh air looks ever more enticing.
A couple of years ago I built a model to evaluate how Froome and Dumoulin would have matched up, if they had not avoided racing against each other over the 2017 season. As we approach the 2019 World Championships Road Race in Yorkshire, I have adopted a more sophisticated approach to try to predict the winner of the men’s race. The smart money could be going on Sam Bennett.
With only two races outstanding, most of this year’s UCI world tour results are available. I decided to broaden the data set with 2.HC classification European Tour races, such as the OVO Energy Tour of Britain. In order to help with prediction, I included each rider’s weight and height, as well as some meta-data about each race, such as date, distance, average speed, parcours and type (stage, one-day, GC, etc.).
The key question was what exactly are you trying to predict? The UCI allocates points for race results, using a non-linear scale. For example, Mathieu Van Der Poel was awarded 500 points for winning Amstel Gold, while Simon Clarke won 400 for coming second and Jakob Fuglsang picked up 325 for third place, continuing down to 3 points for coming 60th. I created a target variable called PosX, defined as a negative exponential of the rider’s position in any race, equating to 1.000 for a win, 0.834 for second, 0.695 for third, decaying down to 0.032 for 20th. This has a similar profile to the points scheme, emphasising the top positions, and handles races with different numbers of riders.
A random forest would be a typical choice of model for this kind of data set, which included a mixture of continuous and categorical variables. However, I opted for a neural network, using embeddings to encode the categorical variables, with two hidden layers of 200 and 100 activations. This was very straightforward using the fast.ai library. Training was completed in a handful of seconds on my MacBook Pro, without needing a GPU.
After some experimentation on a subset of the data, it was clear that the model was coming up with good predictions on the validation set and the out-of-sample test set. With a bit more coding, I set up a procedure to load a start list and the meta-data for a future race, in order to predict the result.
With the final start list for the World Championships Road Race looking reasonably complete, I was able to generate the predicted top 10. The parcours obviously has an important bearing on who wins a race. With around 3600m of climbing, the course was clearly hilly, though not mountainous. Although the finish was slightly uphill, it was not ridiculously steep, so I decided to classify the parcours as rolling with a flat finish
Mathieu Van Der Poel
Edvald Boasson Hagen
Greg Van Avermaet
It was encouraging to see that the model produced a highly credible list of potential top 10 riders, agreeing with the bookies in rating Mathieu Van Der Poel as the most likely winner. Sagan was ranked slightly below Kristoff and Bennett, who are seen as outsiders by the pundits. The popular choice of Philippe Gilbert did not appear in my top 10 and Alaphilippe was only 9th, in spite of their recent strong performances in the Vuelta and the Tour, respectively. Riders in positions 5 to 10 would all be expected to perform well in the cycling classics, which tend to be long and arduous, like the Yorkshire course.
For me, 25/1 odds on Sam Bennett are attractive. He has a strong group of teammates, in Dan Martin, Eddie Dunbar, Connor Dunne, Ryan Mullen and Rory Townsend, who will work hard to keep him with the lead group in the hillier early part of the race. Then he will then face an extremely strong Belgian team that is likely to play the same game that Deceuninck-QuickStep successfully pulled off in stage 17 of the Vuelta, won by Gilbert. But Bennett was born in Belgium and he was clearly the best sprinter out in Spain. He should be able to handle the rises near the finish.
A similar case can be made for Kristoff, while Matthews and Van Avermaet both had recent wins in Canada. Nevertheless it is hard to look past the three-times winner Peter Sagan, though if Van Der Poel launches one of his explosive finishes, there is no one to stop him pulling on the rainbow jersey.
After the race, I checked the predicted position of the eventual winner, Mads Pedersen. He was expected to come 74th. Clearly the bad weather played a role in the result, favouring the larger riders, who were able to keep warmer. The Dane clearly proved to be the strongest rider on the day.
It is easy to assume that successful professional cyclists are all skinny little guys, but if you look at the data, it turns out that they have an average height of 1.80m and an average weight of around 68kg. If we are to believe the figures posted on ProCyclingStats, hardly any professional cyclists would be considered underweight. In fact, they would struggle to perform at the required level if they did not maintain a healthy weight.
Taller than you might think
According to a study published in 2013 and updated in 2019, the global average height of adult males born in 1996 was 1.71m, but there is considerable regional variation. The vast majority of professional cyclists come from Europe, North America, Russia and the Antipodes where men tend to be taller than those from Asia, Africa and South America. For the 41 Colombians averaging 1.73m, there are 85 Dutch riders with a mean height of 1.84m. See chart below.
Furthermore, road cycling involves a range of disciplines, including sprinting and time trialling, where size and raw power provide an advantage. The peloton includes larger sprinters alongside smaller climbers.
Not as light as expected
While 68kg for a 1.80m male is certainly slim, it equates to a body mass index of 21 (BMI = weight / (height)²), which is towards the middle of the recommended healthy range. BMI is not a sophisticated measure, as it does not distinguish between fat and muscle. Since muscle is more dense than fat and cyclists tend to have it a higher percentage of lean body mass, they will look slimmer than a lay person of equivalent height and weight. Nevertheless doctors use BMI as a guide and become concerned when it falls below 18.5.
The chart includes over 1,100 professional cyclists, but very few pros would be considered underweight. The majority of riders have a BMI of between 20 and 22. Although Colombian riders (red) tend to be smaller, specialising in climbing, their average BMI of 20.8 is not that different from larger Dutch riders (orange) with a mean BMI of 21.2. The taller Colombians include the sprinters Hodeg, Gaviria and Molano.
Types of rider
This chart shows the names of a sample of top riders. All-out sprinters tend to have a BMI of around 24, even if they are small like Caleb Ewan. Sprints at the end of more rolling courses are likely to be won by riders with a BMI of 22, such as Greipel, van Avermaet, Sagan, Gaviria, Groenewegen, Bennet and Kwiatkowski. Time trial specialists like Dennis and Thomas have similar physiques, though Dumoulin and Froome are significantly lighter and remarkably similar to each other.
GC contenders Roglic, Kruiswijk and Gorka Izagirre are near the centre of the distribution with a BMI around 21, close to Viviani, who is unusually light for a sprinter. Pinot, Valverde, Dan Martin, the Yates brothers and Pozzovivo appear to be light for their heights. Interestingly climbers such as Quintana, Uran, Alaphilippe, Carapaz and Richie Porte all have a BMI of around 21, whereas Lopez is a bit heavier.
If the figures reported on ProCyclingStats are accurate, George Bennet and Emanuel Buchmann are significantly underweight. Weighting 58kg for a height of 1.80m does not seem to be conducive to strong performance, unless they are extraordinary physical specimens.
Professional cyclists are lean, but they would not be able to achieve the performance required if they were underweight. It is possible that the weights of individual riders might vary over time by a couple of kilos, moving them a small amount vertically on the chart, but scientific approaches are increasingly employed by expert nutritionists to avoid significant weight loss over longer stage races. The Jumbo Foodcoach app was developed alongside the Jumbo-Visma team and, working with Team Sky, James Morton strove to ensure that athletes fuel for the work required. Excessive weight loss can lead to a range of problems for health and performance.
On the eve of the Tour de France, the pundits have made their predictions, but when the race is over, they will be long forgotten. One way of checking your own forecasts is to take a look at the odds offered on the betting markets. These are interesting, because they reflect the actions of people who have actually put money behind their views. In an efficient and liquid market, the latest prices ought to reflect all information available. This blog takes a look at the current odds, without wishing to encourage gambling in any way.
The website oddchecker.com collates the odds from a number of bookmakers across a large range of bets. It is helpful to convert the odds into predicted probabilities. Focussing on the overall winner, Egan Bernal is the favourite at 5/2 (equating to a 29% probability taking the yellow jersey), followed by Geraint Thomas at 7/2 (22%) and Jakob Fuglsang at 6/1 (14%). This gives a 51% chance of a winner being one of the two Team Ineos riders. The three three leading contenders are some distance ahead of Adam Yates, Richie Porte, Thibaut Pinot and Nairo Quintana. Less fancied riders include Roman Bardet, Steven Kruijswijk, Rigoberto Uran, Mikel Landa, Enric Mas and Vincenzo Nibali. Anyone else is seen as an outsider.
Ups and downs
The odds change over time, as the markets evaluate the performance and changing fortunes of the riders. In the following chart shows the fluctuations in the average daily implied winning chances of the three current favourites since the beginning of the year, according to betfair.com.
The implied probability that Geraint Thomas would repeat last year’s win has hovered between 20% and 30%, spiking up a bit during the Tour of Romandie. Unfortunately, Chris Froome’s odds are no longer available, as he was most likely the favourite earlier this year. However, his crash on 11 June instantaneously improved the odds for other riders, particularly Thomas and Bernal, though expectations for the Welshman declined after he crashed out of the Tour de Suisse on 18 June.
The betting on Fuglsang spiked up sharply during the Tirreno Adriatico, where he won a stage and came 3rd on GC, and the Tour of the Basque country, where he finished strongly. Apparently, his three podium results in the the Ardenne Classics had no effect on his chances of a yellow jersey, whereas his victory in the Critérium Dauphiné had a significant positive impact.
Egan Bernal, appeared from the shadows. At the beginning of the year, he was seen as a third string in Team Ineos. His victory in Paris Nice hardly registered on his odds for the Tour. But since Froome’s crash and Thomas’s departure from the Tour de Suisse, he became the bookies’ favourite.
With 65% of the money on the three main contenders, there are some pretty good odds available on other riders. A couple of crashes, an off day or a bit of bad luck could turn the race on its head. Clearly the Ineos and Astana teams are capable of protecting their GC contenders, but so too are Movistar, EF Education First, Michelton Scott, Groupama-FDJ, Bahrain Merida and others.
Last week I attended an event announcing the forthcoming launch of a new fitness app called Pillar. It offers combined training and nutrition advice to help athletes achieve their goals. Pillar is backed by a strong scientific team including Professor James Morton, Team Sky Head of Performance Nutrition, and Professor Graeme Close, England Rugby Head of Performance Nutrition.
James Morton gave a fascinating presentation about the periodisation of carbohydrate (CHO) fuelling, including a detailed description of the nutrition strategy he created to support Chris Froome’s famous 80km attack on stage 19 of the 2018 Giro d’Italia. His recent paper explains the underlying science. These are some of the key points.
Always go into competition fully fuelled with carbohydrate
Well-fuelled athletes perform for longer at higher intensities than those with depleted reserves
Basic biochemistry: fat burning is too slow and supplies of the phosphocreatine are too small to sustain intensities over 85% of VO2max
doing two sessions in one day with minimal refuelling
low carb evening meal and breakfast: sleep low, train low the next morning
high fat/low carb diet
Is there a structured method of training that provides the benefits without the negatives?
The authors propose a glycogen threshold hypothesis
Positive effects seem to be dependent on commencing with muscle glycogen levels within a specific range
Levels have to be low enough to promote positive effects
But when too low, protein synthesis may be impaired and the ability to complete sessions is compromised
This leads to the idea of periodising carbohydrate consumption, meal by meal, around planned training sessions
“Fuelling for the work required”
low carbs before and during lighter training sessions
high carbs in preparation for and during rides with greater intensities
always refuel after training
The diagram above provides an example for an elite endurance cyclist
The red, amber, green colour coding indicates low, medium or high carbohydrate consumption
On day 1, the athlete aims to “train high” for a hard session
A lighter evening meal on day 1 prepares to “sleep low, train low” ahead of a lower intensity session on day 2
Carbohydrate intake rises after exercise on day 2 in anticipation of a high intensity session on day 3
Fuelling is moderated on the evening of day 3 as day 4 is assigned as a recovery day
Carbohydrate rises later on day 4 to prepare for the next block of training
The Pillar app aims to provide these leading edge scientific principles to amateur cyclists and other athletes
In order to put this into action, you need to know how much carbohydrate you are consuming. My assumption has been that my diet is reasonably healthy, but I have never actually measured it. So I have been experimenting with free app MyFitnessPal that can be downloaded onto your phone. This provides a simple and convenient way to track the nutritional composition of your diet, including a barcode scanner that recognises most foods. You can link it to other apps such as Training Peaks to take account of energy expended. However, neither of these tools plans nutrition aheadof training sessions. Pillar aims to fill this gap. It will be interesting to see whether this turns out to be successful.
Some commentators were skeptical of Team Sky’s explanation for Chris Froome’s 80km tour-winning attack on stage 19 of the Giro. His success was put down to the detailed planning of nutrition throughout the ride, with staff positioned at strategic refuelling points along the entire route. If you consider how skeletal the riders look after two and a half weeks of relentless competition, along with the limits on what can be physically absorbed between stages, the nutrition story makes a lot of sense. Did Yates, Pinot and Aru dramatically fall by the wayside simply because they ran out of energy?
The best performing cyclists have excellent balancing skills. This includes the ability to match energy intake with energy demand. The pros benefit from teams of support staff monitoring every aspect of their nutrition and performance. However, many serious club-level cyclists pick up fads and snippets of information from social media or the cycling press that lead them to try out all kinds ideas, in an unscientific manner, in the hope of achieving an improvement in performance. Some of these activities have potentially harmful effects on the body.
Competitive riders can become obsessed with losing weight and sticking to extremely tough training schedules, leading to both short-term and long-term energy deficits that are detrimental to both health and performance. One of the physiological consequences can be a reduction in bone density, which is particularly significant for cyclists, who do not benefit from gravitational stress on bones, due to the non-weight-bearing nature of the sport. In a recent paper, colleagues at Durham University and I describe an approach for identifying male cyclists at risk of Relative Energy Deficit in Sport (RED-S).
You need a certain amount of energy simply to maintain normal life processes, but an athlete can force the body into a deficit in two ways: by intentionally or unintentionally restricting energy intake below the level required to meet demand or by increasing training load without a corresponding increase in fuelling.
Our bodies have a range of ways to deal with an energy deficit. For the average, slightly overweight casual cyclist, burning some fat is not a bad thing. However, most competitive cyclists are already very lean, making the physiological consequences of an energy deficit more serious. Changes arise in the endocrine system that controls the body’s hormones. Certain processes can shut down, such as female menstruation, and males can experience a reduction in testosterone. Sex steroids are important for maintaining healthy bones. In our study of 50 male competitive cyclists, the average bone density in the lumbar spine, measured by DXA scan, was significantly below normal. Some relatively young cyclists had the bones of a 70 year old man!
The key variable associated with poor bone health was low energy availability, i.e. male cyclists exhibiting RED-S. These riders were identified using a questionnaire followed by an interview with a Sports Endocrinologist. The purpose of the interview was to go through the responses in more detail, as most people have a tendency to put a positive spin on their answers. There were two important warning signs.
Long-term energy deficit: a prolonged significant weight reduction to achieve “race weight”
Short-term energy deficit: one or more fasted rides per week
Among riders with low energy availability, bone density was not so bad for those who had previously engaged in a weight-bearing sport, such as running. For cyclists with adequate energy availability, those with vey low levels of vitamin D had weaker bones. Across the 50 cyclists, most had vitamin D levels below the level of 90 nmol/L recommended for athletes, including some who were taking vitamin D supplements, but clearly not enough. Studies have shown that the advantages of athletes taking vitamin D supplements include better bone health, improved immunity and stronger muscles, so why wouldn’t you?
In terms of performance, British Cycling race category was positively related with a rider’s power to weight ratio, evaluated by 60 minute FTP per kg (FTP60/kg). Out of all the measured variables, including questionnaire responses, blood tests, bone density and body composition, the strongest association with FTP60/kg was the number of weekly training hours. There was no significant relationship between percentage body fat and FTP60/kg. So if you want to improve performance, rather than starving yourself in the hope of losing body fat, you are better off getting on your bike and training with adequate fuelling.
Cyclists using power meters have the advantage of knowing exactly how many calories they have used on every ride. In addition to taking on fuel during the ride, especially when racing, the greatest benefits accrue from having a recovery drink and some food immediately after completing rides of more than one hour.
For those wishing to know more about RED-S, the British Association of Sports and Exercise Medicine has provided a web resource.
A related blog will explore the machine learning and statistical techniques used to analyse the data for this study.
In a recent blog, I described an experiment to train a deep neural network to distinguish between photographs of Vincenzo Nibali and Alejandro Valverde, using a very small data set of images. In the conclusion, I suggested that the network was probably basing its decisions more on the colours of the riders’ kit rather than on facial recognition. This article investigates what the network was actually “looking at”, in order to understand better how it was making decisions.
The issues of accountability and bias were among the topics discussed at the last NIPS conference. As machine learning algorithms are adopted across industry, it is important for companies to be able to explain how conclusions are reached. In many instances, it is not acceptable simply to rely on an impenetrable black box. AI researchers and developers need to be able to explain what is going on inside their models, in order to justify decisions taken. In doing so, some worrying instances of bias have been revealed in the selection of data used to train the algorithms.
I went back to my rider recognition model and used an approach called “Class Activation Maps” to identify which parts of the images accounted for the network’s choice of rider. Making use of the code provided in lesson 7 of the course offered by fast.ai, I took advantage of my existing small set of training, validation and test images of the two famous cyclists. Starting with a pre-trained version of ResNet34, the idea was to replace the last two layers with four new ones, the crucial one being a convolutional layer with two outputs, matching the number of cyclists in the classification task. The two outputs of this layer were 7×7 matrix representations of the relevant image.
The final predictions of the model came from a softmax of a flattened average pooling of these 7×7 representations. The softmax output gave the probabilities of Nibali and Valverde respectively. Since there was no learning beyond the final convolution, the activations of the two 7×7 matrices represented the “Nibali-ness” and “Valverde-ness” of the image. This could be displayed as a heat map on top of the image.
Examples are shown below for the validation set of 10 images of Nibali followed by 10 of Valverde. The yellow patch of the heat map highlights the part of the image that led to the prediction displayed above each image. Nine out of ten were correct for Nibali and six for Valverde.
The heat maps were very helpful in understanding the model’s decision making process. It seemed that for Nibali, his face and helmet were important, with some attention paid to the upper part of his blue Astana kit. In contrast, the network did a very good job at identifying the M on Valverde’s Moviestar kit. It was interesting to note that the network succeeded in spotting that Nibali was wearing a Specialized helmet whereas Valverde had a Catlike design. Three errors arose in the photos of his face, which was mistaken for Nibali’s. In fact, any picture of a face led to a prediction of Nibali, as demonstrated by the cropped image below that was used for training.
Why should that be? Looking back at the training set, it turned out that, by chance, there were far more mugshots of Nibali, while there were more photos of Valverde riding his bike, with his face obscured by sunglasses. This was an example of unintentional bias in the training data, providing a very useful lesson.
The final set of pictures shows the predictions made on the out-of-sample test set. All the predictions are correct, except the first one, where the model failed to spot the green M on Valverde’s chest and mistook the blurred background for Nibali. Otherwise the results confirmed that the network looked at Nibali’s face, the rider’s helmet or Valverde’s kit. It also remembered seeing an image of Nibali holding the Giro trophy in the training set.
In conclusion, Class Activation Maps provide a useful way of visualising the activations of hidden laters in a deep neural network. This can go some way to accounting for the decisions that appear in the output. The approach can also help identify unintentional bias in the training set.