For those of us who cannot imagine going cycling without recording every ride on a GPS device, there is no reason not to do the same when we go skiing. Although there are specific skiing apps that you can download onto your phone, they increase the drain on the battery and use up your roaming data allowance. A simple alternative is to put your Garmin in your pocket for the day.
It is worthwhile creating a specific activity profile for skiing. On my Garmin 520, this is an option on the settings menu. The interesting data fields are elevation, total ascent, total descent, distance and max speed. When you upload the file onto Strava, you have a nice map of the day’s mountainous activities. Make sure you set the sport as “Alpine Ski”, otherwise your rides up the chairlifts end up smashing loads of KOMs set by summer mountain bikers.
When I went heliskiing in Canada, I was charged according to the vertical distance covered. One might hope to do at least 100k feet or 30 vertical kilometres in a week. It turns out that, with the multiple, fast-moving lifts and prepared pistes of a modern ski resort, you can expect to cover far higher total ascents/descents or “dénivelé” as they say in French.
If you keep moving, you can also cover remarkably long distances in a day of alpine skiing. The large interconnected European resorts like Val d’Isere/Tignes, Les Trois Vallées, Portes du Soleil and Zermatt/Cervinia provide extensive ski areas, where it can be hard to reach both edges of the piste map in a day.
Skiing speed is an interesting statistic. Having a background in slalom racing and freestyle skiing, I don’t hang around. It turns out that my descending speeds tend to be remarkably similar to cycling, averaging 35-40kph on a typical long run, but top speeds are definitely higher on skis.
A decent day’s skiing
A few years ago, I came up with three criteria for a decent day’s skiing, based on total vertical metres, distance covered and maximum speed. A reasonable total descent is 10,000m. You also need to cover a total distance of 100km. But the most challenging and dangerous part is to include a maximum speed of 100kph, which should only be attempted by expert skiers, at their own risk and without endangering other people.
The image at the top of this page shows a day when I comfortably achieved all three targets. It was surprising to note that, over an 8 1/2 hour day, I was only moving for 4 3/4 hours and that included riding up the lifts.
Of course, when there is a fresh fall of snow like we had at Christmas in the Alps, you can forget all that and just head off piste.
Since my blog about Strava Fitness and Freshness has been very popular, I thought it would be interesting to demonstrate a simple model that can help you use these metrics to improve your cycling performance.
As a quick reminder, Strava’s Fitness measure is an exponentially weighted average of your daily Training Load, over the last six weeks or so. Assuming you are using a power meter, it is important to use a correctly calibrated estimate of your Functional Threshold Power (FTP) to obtain an accurate value for the Training Load of each ride. This ensures that a maximal-effort one hour ride gives a value of 100. The exponential weighting means that the benefit of a training ride decays over time, so a hard ride last week has less impact on today’s Fitness than a hard ride yesterday. In fact, if you do nothing, Fitness decays at a rate of about 2.5% per day.
Although Fitness is a time-weighted average, a simple rule of thumb is that your Fitness Score equates to your average daily Training Load over the last month or so. For example, a Fitness level of 50 is consistent with an average daily Training Load (including rest days) of 50. It may be easier to think of this in terms of a total Training Load of 350 per week, which might include a longer ride of 150, a medium ride of 100 and a couple of shorter rides with a Training Load of 50.
How to get fitter
The way to get fitter is to increase your Training Load. This can be achieved by riding at a higher intensity, increasing the duration of rides or including extra rides. But this needs to be done in a structured way in order to be effective. Periodisation is an approach that has been tried and tested over the years. A four-week cycle would typically include three weekly blocks of higher training load, followed by an easier week of recovery. Strava’s Fitness score provides a measure of your progress.
Modelling Fitness and Fatigue
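An exponentially weighted moving average is very easy to model, because it evolves like a Markov process: today’s value depends only on yesterday’s value and today’s Training Load.
$$F_t = \lambda F_{t-1} + (1 - \lambda)\,\mathrm{TrainingLoad}_t$$
where $F_t$ is Fitness or Fatigue on day $t$, and $\lambda = e^{-1/42} \approx 0.976$ for Fitness or $\lambda = e^{-1/7} \approx 0.867$ for Fatigue.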
This is why your Fitness falls by about 2.5% and your Fatigue eases by about 13.5% after a rest day. The formula makes it straightforward to predict the impact of a training plan stretching out into the future. It is also possible to determine what Training Load is required to achieve a target level of Fitness improvement over a specific time period.
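As a rough illustration, the daily update can be sketched in a few lines of Python. The 42-day and 7-day time constants are assumptions, chosen because they are consistent with the decay rates quoted above; the function names are hypothetical and this is not Strava’s own code.

```python
import math

# Daily decay factors. The 42-day and 7-day time constants are assumptions
# that match the rest-day decay rates quoted in the text.
LAMBDA_FITNESS = math.exp(-1 / 42)
LAMBDA_FATIGUE = math.exp(-1 / 7)


def update(previous, training_load, lam):
    """One day's step: blend yesterday's value with today's Training Load."""
    return lam * previous + (1 - lam) * training_load


def project(start, daily_loads, lam):
    """Apply update() for each planned daily Training Load, returning every day's value."""
    values = []
    current = start
    for load in daily_loads:
        current = update(current, load, lam)
        values.append(current)
    return values


# A rest day (Training Load of zero), starting from Fitness and Fatigue of 50.
fitness_after_rest = update(50, 0, LAMBDA_FITNESS)
fatigue_after_rest = update(50, 0, LAMBDA_FATIGUE)
print(round(fitness_after_rest, 1), round(fatigue_after_rest, 1))

# Holding Training Load equal to current Fitness leaves Fitness unchanged.
steady = project(50, [50] * 14, LAMBDA_FITNESS)
print(round(steady[-1], 1))
```

Because each day depends only on the previous value, projecting a plan forward is just a loop over the planned daily loads.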
Ramping up your Fitness
The change in Fitness over the next seven days is called a weekly “ramp”. Aiming for a weekly ramp of 5 would be very ambitious. It turns out that you would need to increase your daily Training Load by 33. That is a substantial extra Training Load of 231 over the next week, particularly because Training Load automatically takes account of a rider’s FTP.
Interestingly, this increase in Training Load is the same, regardless of your starting Fitness. However, stepping up an average Training Load from 30 to 63 per day would require a doubling of work done over the next week, whereas for someone starting at 60, moving up to 93 per day would require a 54% increase in effort for the week.
In both cases, a cyclist would typically require two additional hard training rides, resulting in an accumulation of fatigue, which is picked up by Strava’s Fatigue score. This is a much shorter term moving average of your recent Training Load, over the last week or so. If we assume that you start with a Fatigue score equal to your Fitness score, an increase of 33 in daily Training Load would cause your Fatigue to rise by 21 over the week. If you managed to sustain this over the week, your Form (Fitness minus Fatigue) would fall from zero to -16. Here’s a summary of all the numbers mentioned so far.
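The figures in this section can be reproduced with a short sketch. It assumes the same Training Load every day of the week, a starting Training Load and Fatigue both equal to Fitness, and time constants of 42 and 7 days (assumptions consistent with the decay rates quoted earlier). Under those assumptions, a sustained step of d in daily Training Load moves a score by d × (1 − λ⁷) over seven days.

```python
import math

LAMBDA_FITNESS = math.exp(-1 / 42)  # assumed 42-day time constant
LAMBDA_FATIGUE = math.exp(-1 / 7)   # assumed 7-day time constant


def weekly_response(lam, days=7):
    """Fraction of a sustained step in daily Training Load absorbed after `days`."""
    return 1 - lam ** days


def load_increase_for_ramp(ramp, days=7):
    """Extra daily Training Load needed to lift Fitness by `ramp` over `days`."""
    return ramp / weekly_response(LAMBDA_FITNESS, days)


ramp = 5
extra_daily = load_increase_for_ramp(ramp)                    # extra load per day
extra_weekly = 7 * round(extra_daily)                         # extra load per week
fatigue_rise = extra_daily * weekly_response(LAMBDA_FATIGUE)  # rise in Fatigue
form_change = ramp - fatigue_rise                             # change in Form

print(round(extra_daily), extra_weekly, round(fatigue_rise), round(form_change))

# Relative increase in effort for riders starting at Fitness 30 and 60.
for start in (30, 60):
    print(start, f"{extra_daily / start:.0%}")
```

The extra load does not depend on the starting Fitness, which is why only the percentage increase differs between the two riders.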
Whilst it might be possible to do this for a week, the regime would be very hard to sustain over a three-week block, particularly because you would be going into the second week with significant accumulated fatigue. Training sessions and race performance tend to be compromised when Form drops below -20. Furthermore, if you have increased your Fitness by 5 over a week, you will need to increase Training Load by another 231 for the following week to continue the same upward trajectory, then increase again for the third week. So we conclude that a weekly ramp of 5 is not sustainable over three weeks. Something of the order of 2 or 3 may be more reasonable.
A steady increase in Fitness
Consider a rider with a Fitness level of 30, who would have a weekly Training Load of around 210 (7 times 30). This might be five weekly commutes and a longer ride on the weekend. A periodised monthly plan could include a ramp of 2, steadily increasing Training Load for three weeks followed by a recovery week of -1, as follows.
This gives a net increase in Fitness of 5 over the month. Fatigue has also risen by 5, but since the rider is fitter, Form ends the month at zero, ready to start the next block of training.
To simplify the calculations, I assumed the same Training Load every day in each week. This is unrealistic in practice, because all athletes need a rest day and training needs to mix up the duration and intensity of individual rides. The fine tuning of weekly rides is a subject for another blog.
A tougher training block
A rider engaging in a higher level of training, with a Fitness score of 60, may be able to manage weekly ramps of 3, before the recovery week. The following Training Plan would raise Fitness to 67, with sufficient recovery to bring Form back to positive at the end of the month.
A general plan
The interesting thing about this analysis is that the outcomes of the plans are independent of a rider’s starting Fitness. This is a consequence of the Markov property. So if we describe the ambitious plan as [3,3,3,-2], a rider will see a Fitness improvement of 7, from whatever initial value prevailed: starting at 30, Fitness would go to 37, while the rider starting at 60 would rise to 67.
Similarly, if Form begins at zero, i.e. the starting values of Fitness and Fatigue are equal, then the [3,3,3,-2] plan will always result in a net change of 6 in Fatigue over the four weeks.
In the same way, assuming initial Form of zero, the moderate plan of [2,2,2,-1] would give any rider a net increase of 5 in both Fitness and Fatigue.
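A minimal sketch of how such plans can be evaluated is shown below. It assumes Form starts at zero, the same Training Load on every day of a week, and time constants of 42 and 7 days; `run_plan` is a hypothetical helper, not part of any Strava tool.

```python
import math

LAMBDA_FITNESS = math.exp(-1 / 42)  # assumed 42-day time constant
LAMBDA_FATIGUE = math.exp(-1 / 7)   # assumed 7-day time constant


def run_plan(start_fitness, ramps, days=7):
    """Return (fitness, fatigue, form) after applying a list of weekly Fitness ramps.

    Assumes Form starts at zero and a constant daily Training Load within each week.
    """
    fitness = fatigue = float(start_fitness)
    fit_keep = LAMBDA_FITNESS ** days  # share of Fitness retained after a week
    fat_keep = LAMBDA_FATIGUE ** days  # share of Fatigue retained after a week
    for ramp in ramps:
        # Constant daily load that moves Fitness by exactly `ramp` this week.
        load = fitness + ramp / (1 - fit_keep)
        fitness = fit_keep * fitness + (1 - fit_keep) * load
        fatigue = fat_keep * fatigue + (1 - fat_keep) * load
    return fitness, fatigue, fitness - fatigue


for start in (30, 60):
    for plan in ([2, 2, 2, -1], [3, 3, 3, -2]):
        fitness, fatigue, form = run_plan(start, plan)
        print(start, plan, round(fitness), round(fatigue), round(form, 1))
```

Running the same plan from different starting values shifts every output by the same amount, which is the independence property described above.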
A couple of years ago I built a model to evaluate how Froome and Dumoulin would have matched up, if they had not avoided racing against each other over the 2017 season. As we approach the 2019 World Championships Road Race in Yorkshire, I have adopted a more sophisticated approach to try to predict the winner of the men’s race. The smart money could be going on Sam Bennett.
With only two races outstanding, most of this year’s UCI world tour results are available. I decided to broaden the data set with 2.HC classification European Tour races, such as the OVO Energy Tour of Britain. In order to help with prediction, I included each rider’s weight and height, as well as some meta-data about each race, such as date, distance, average speed, parcours and type (stage, one-day, GC, etc.).
The key question was what exactly are you trying to predict? The UCI allocates points for race results, using a non-linear scale. For example, Mathieu Van Der Poel was awarded 500 points for winning Amstel Gold, while Simon Clarke won 400 for coming second and Jakob Fuglsang picked up 325 for third place, continuing down to 3 points for coming 60th. I created a target variable called PosX, defined as a negative exponential of the rider’s position in any race, equating to 1.000 for a win, 0.834 for second, 0.695 for third, decaying down to 0.032 for 20th. This has a similar profile to the points scheme, emphasising the top positions, and handles races with different numbers of riders.
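To make the target variable concrete, here is a small sketch of a negative exponential of finishing position. The decay constant of 2/11 is an assumption: it is not stated above, but it reproduces the quoted values for 1st, 2nd, 3rd and 20th place when rounded to three decimals.

```python
import math

# Assumed decay constant; chosen only because it matches the values quoted in the text.
DECAY = 2 / 11


def pos_x(position):
    """Negative exponential of finishing position, equal to 1.0 for a win."""
    return math.exp(-DECAY * (position - 1))


for position in (1, 2, 3, 20):
    print(position, round(pos_x(position), 3))
```

Because the value depends only on finishing position, it can be compared across races with different field sizes.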
A random forest would be a typical choice of model for this kind of data set, which included a mixture of continuous and categorical variables. However, I opted for a neural network, using embeddings to encode the categorical variables, with two hidden layers of 200 and 100 activations. This was very straightforward using the fast.ai library. Training was completed in a handful of seconds on my MacBook Pro, without needing a GPU.
After some experimentation on a subset of the data, it was clear that the model was coming up with good predictions on the validation set and the out-of-sample test set. With a bit more coding, I set up a procedure to load a start list and the meta-data for a future race, in order to predict the result.
With the final start list for the World Championships Road Race looking reasonably complete, I was able to generate the predicted top 10. The parcours obviously has an important bearing on who wins a race. With around 3600m of climbing, the course was clearly hilly, though not mountainous. Although the finish was slightly uphill, it was not ridiculously steep, so I decided to classify the parcours as rolling with a flat finish.
Mathieu Van Der Poel
Edvald Boasson Hagen
Greg Van Avermaet
It was encouraging to see that the model produced a highly credible list of potential top 10 riders, agreeing with the bookies in rating Mathieu Van Der Poel as the most likely winner. Sagan was ranked slightly below Kristoff and Bennett, who are seen as outsiders by the pundits. The popular choice of Philippe Gilbert did not appear in my top 10 and Alaphilippe was only 9th, in spite of their recent strong performances in the Vuelta and the Tour, respectively. Riders in positions 5 to 10 would all be expected to perform well in the cycling classics, which tend to be long and arduous, like the Yorkshire course.
For me, 25/1 odds on Sam Bennett are attractive. He has a strong group of teammates, in Dan Martin, Eddie Dunbar, Conor Dunne, Ryan Mullen and Rory Townsend, who will work hard to keep him with the lead group in the hillier early part of the race. He will then face an extremely strong Belgian team that is likely to play the same game that Deceuninck-QuickStep successfully pulled off in stage 17 of the Vuelta, won by Gilbert. But Bennett was born in Belgium and he was clearly the best sprinter out in Spain. He should be able to handle the rises near the finish.
A similar case can be made for Kristoff, while Matthews and Van Avermaet both had recent wins in Canada. Nevertheless it is hard to look past the three-times winner Peter Sagan, though if Van Der Poel launches one of his explosive finishes, there is no one to stop him pulling on the rainbow jersey.
After the race, I checked the predicted position of the eventual winner, Mads Pedersen. He was expected to come 74th. Clearly the bad weather played a role in the result, favouring the larger riders, who were able to keep warmer. The Dane clearly proved to be the strongest rider on the day.
It is easy to assume that successful professional cyclists are all skinny little guys, but if you look at the data, it turns out that they have an average height of 1.80m and an average weight of around 68kg. If we are to believe the figures posted on ProCyclingStats, hardly any professional cyclists would be considered underweight. In fact, they would struggle to perform at the required level if they did not maintain a healthy weight.
Taller than you might think
According to a study published in 2013 and updated in 2019, the global average height of adult males born in 1996 was 1.71m, but there is considerable regional variation. The vast majority of professional cyclists come from Europe, North America, Russia and the Antipodes where men tend to be taller than those from Asia, Africa and South America. For the 41 Colombians averaging 1.73m, there are 85 Dutch riders with a mean height of 1.84m. See chart below.
Furthermore, road cycling involves a range of disciplines, including sprinting and time trialling, where size and raw power provide an advantage. The peloton includes larger sprinters alongside smaller climbers.
Not as light as expected
While 68kg for a 1.80m male is certainly slim, it equates to a body mass index of 21 (BMI = weight / (height)²), which is towards the middle of the recommended healthy range. BMI is not a sophisticated measure, as it does not distinguish between fat and muscle. Since muscle is more dense than fat and cyclists tend to have a higher percentage of lean body mass, they will look slimmer than a lay person of equivalent height and weight. Nevertheless doctors use BMI as a guide and become concerned when it falls below 18.5.
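The arithmetic is easy to check with a few lines of Python. This is just a worked example of the formula above, using the 18.5 threshold mentioned in the text; the function names are illustrative.

```python
UNDERWEIGHT_THRESHOLD = 18.5  # BMI below which doctors become concerned (from the text)


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def weight_at_bmi(target_bmi, height_m):
    """Weight in kilograms that corresponds to a given BMI at a given height."""
    return target_bmi * height_m ** 2


average_pro = bmi(68, 1.80)
threshold_weight = weight_at_bmi(UNDERWEIGHT_THRESHOLD, 1.80)

print(round(average_pro, 1))       # BMI of the average professional
print(round(threshold_weight, 1))  # weight at which a 1.80m rider reaches the threshold
```

On these figures, a 1.80m rider would have to drop to roughly 60kg before reaching the underweight threshold.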
The chart includes over 1,100 professional cyclists, but very few pros would be considered underweight. The majority of riders have a BMI of between 20 and 22. Although Colombian riders (red) tend to be smaller, specialising in climbing, their average BMI of 20.8 is not that different from larger Dutch riders (orange) with a mean BMI of 21.2. The taller Colombians include the sprinters Hodeg, Gaviria and Molano.
Types of rider
This chart shows the names of a sample of top riders. All-out sprinters tend to have a BMI of around 24, even if they are small like Caleb Ewan. Sprints at the end of more rolling courses are likely to be won by riders with a BMI of 22, such as Greipel, Van Avermaet, Sagan, Gaviria, Groenewegen, Bennett and Kwiatkowski. Time trial specialists like Dennis and Thomas have similar physiques, though Dumoulin and Froome are significantly lighter and remarkably similar to each other.
GC contenders Roglic, Kruijswijk and Gorka Izagirre are near the centre of the distribution with a BMI around 21, close to Viviani, who is unusually light for a sprinter. Pinot, Valverde, Dan Martin, the Yates brothers and Pozzovivo appear to be light for their heights. Interestingly, climbers such as Quintana, Uran, Alaphilippe, Carapaz and Richie Porte all have a BMI of around 21, whereas Lopez is a bit heavier.
If the figures reported on ProCyclingStats are accurate, George Bennett and Emanuel Buchmann are significantly underweight. Weighing 58kg for a height of 1.80m does not seem to be conducive to strong performance, unless they are extraordinary physical specimens.
Professional cyclists are lean, but they would not be able to achieve the performance required if they were underweight. It is possible that the weights of individual riders might vary over time by a couple of kilos, moving them a small amount vertically on the chart, but scientific approaches are increasingly employed by expert nutritionists to avoid significant weight loss over longer stage races. The Jumbo Foodcoach app was developed alongside the Jumbo-Visma team and, working with Team Sky, James Morton strove to ensure that athletes fuel for the work required. Excessive weight loss can lead to a range of problems for health and performance.
Last year, I experimented with using style transfer to automatically generate images in the style of @grandtourart. More recently I developed a more ambitious version of my rather simple bike identifier. The connection between these two projects is sunflowers. This blog describes how I built a flower identification app.
In the brilliant fast.ai Practical Deep Learning for Coders course, Jeremy Howard recommends downloading a publicly available dataset to improve one’s image categorisation skills. I decided to experiment with the 102 Category Flower Dataset, kindly made available by the Visual Geometry Group at Oxford University. In the original 2008 paper, the researchers used a combination of techniques to segment each image and characterise its features. Taking these as inputs to a Support Vector Machine classifier, their best model achieved an accuracy of 72.8%.
Annoyingly, I could not find a list linking the category numbers to the names of the flowers, so I scraped the page showing sample images and found the images in the labelled data.
Using exactly the same training, validation and test sets, my ResNet34 model quickly achieved an accuracy of 80.0%. I created a new branch of the GitHub repository established for the Bike Image model and linked this to a new web service on my Render account. The huge outperformance of the paper was satisfying, but I was sure that a better result was possible.
The Oxford researchers had divided their set of 8,189 labelled images into a training set and a validation set, each containing 10 examples of the 102 flowers. The remaining 6,149 images were reserved for testing. Why allocate less than a quarter of the data to training/validation? Perhaps this was due to limits on computational resources available at the time. In fact, the training and validation sets were so small that I was able to train the ResNet34 on my MacBook Pro’s CPU, within an acceptable time.
My plan to improve accuracy was to merge the test set into the training set, keeping aside the original validation set of 1,020 images for testing. This expanded training set of 7,261 images immediately failed on my MacBook, so I uploaded my existing model onto my PaperSpace GPU, with amazing results. Within 45 minutes, I had a model with 97.0% accuracy on the held-out test set. I quickly exported the learner and switched the link in the flowers branch of my GitHub repository. The committed changes automatically fed straight through to the web service on Render.
I discovered, when visiting the app on my phone, that selecting an image offers the option to take a photo and upload it directly for identification. Having exhausted the flowers in my garden, I have risked being spotted by neighbours as I furtively lean over their front walls to photograph the plants in their gardens.
It is very efficient to use smaller datasets and low resolution images for initial training. Save the model and then increase resolution. Often you can do this on a local CPU without even paying for access to a GPU. When you have a half decent model, upload it onto a GPU and continue training with the full dataset. Deploying the model as a web service on Render makes the model available to any device, including a mobile phone.
My final model is amazing… and it works for sunflowers.
One of the first skills acquired in the latest version of the fast.ai course on deep learning is how to create a production version of an image classifier that runs as a web application. I decided to test this out on a set of images of road bikes, TT bikes and mountain bikes. To try it out, click on the image above or go to this website https://bike-identifier.onrender.com/ and select an image from your device. If you are using a phone, you can try taking photos of different bikes, then click on Analyse to see if they are correctly identified. Side-on images work best.
How does it work?
The fast.ai library provides a range of convenient ways to access images for the purpose of training a neural network. In this instance, I used the default option of applying transfer learning to a pre-trained ResNet34 model, scaling the images to 224 pixel squares, with data augmentation. After doing some initial training, it was useful to look at the images that had been misclassified, as many of these were incorrect images of motorbikes or cartoons or bike frames without wheels or TT bars. Taking advantage of a useful fast.ai widget, I removed unhelpful training images and trained the model further.
The confusion matrix showed that the final version of my model was running at about 90% accuracy on the validation set, which was hardly world-beating, but not too bad. The main problem was a tendency to mistake certain road bikes for TT bikes. This was understandable, given the tendency for road bikes to become more aero, though it was disappointing when drop handlebars were clearly visible.
The next step was to make my trained network available as a web application. First I exported the model’s parameter settings to Dropbox. Then I forked a fast.ai repository into my GitHub account and edited the files to link to my Dropbox, switching the documentation appropriately for bicycle identification. In the final step, I set up a free account on Render to host a web service linked to my GitHub repository. This automatically updates for any changes pushed to the repository.
On the eve of the Tour de France, the pundits have made their predictions, but when the race is over, they will be long forgotten. One way of checking your own forecasts is to take a look at the odds offered on the betting markets. These are interesting, because they reflect the actions of people who have actually put money behind their views. In an efficient and liquid market, the latest prices ought to reflect all information available. This blog takes a look at the current odds, without wishing to encourage gambling in any way.
The website oddschecker.com collates the odds from a number of bookmakers across a large range of bets. It is helpful to convert the odds into predicted probabilities. Focussing on the overall winner, Egan Bernal is the favourite at 5/2 (equating to a 29% probability of taking the yellow jersey), followed by Geraint Thomas at 7/2 (22%) and Jakob Fuglsang at 6/1 (14%). This gives a 51% chance of the winner being one of the two Team Ineos riders. The three leading contenders are some distance ahead of Adam Yates, Richie Porte, Thibaut Pinot and Nairo Quintana. Less fancied riders include Romain Bardet, Steven Kruijswijk, Rigoberto Uran, Mikel Landa, Enric Mas and Vincenzo Nibali. Anyone else is seen as an outsider.
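The conversion from fractional odds to an implied probability is simple: odds of a/b imply a probability of b / (a + b). The sketch below applies this to the three prices quoted above. It ignores the bookmaker’s margin, so it is a simplification rather than a precise estimate, and the function name is illustrative.

```python
from fractions import Fraction


def implied_probability(fractional_odds):
    """Convert fractional odds such as '5/2' into an implied probability.

    Ignores the bookmaker's margin, so probabilities across a full market
    will typically add up to more than 100%.
    """
    numerator, denominator = (int(part) for part in fractional_odds.split("/"))
    return Fraction(denominator, numerator + denominator)


odds = {"Egan Bernal": "5/2", "Geraint Thomas": "7/2", "Jakob Fuglsang": "6/1"}
probabilities = {rider: float(implied_probability(price)) for rider, price in odds.items()}

for rider, probability in probabilities.items():
    print(f"{rider}: {probability:.0%}")

ineos = probabilities["Egan Bernal"] + probabilities["Geraint Thomas"]
print(f"Either Ineos rider: {ineos:.0%}")
```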
Ups and downs
The odds change over time, as the markets evaluate the performance and changing fortunes of the riders. The following chart shows the fluctuations in the average daily implied winning chances of the three current favourites since the beginning of the year, according to betfair.com.
The implied probability that Geraint Thomas would repeat last year’s win has hovered between 20% and 30%, spiking up a bit during the Tour de Romandie. Unfortunately, Chris Froome’s odds are no longer available; he was most likely the favourite earlier this year. However, his crash on 11 June instantaneously improved the odds for other riders, particularly Thomas and Bernal, though expectations for the Welshman declined after he crashed out of the Tour de Suisse on 18 June.
The betting on Fuglsang spiked up sharply during the Tirreno Adriatico, where he won a stage and came 3rd on GC, and the Tour of the Basque Country, where he finished strongly. Apparently, his three podium results in the Ardennes Classics had no effect on his chances of a yellow jersey, whereas his victory in the Critérium du Dauphiné had a significant positive impact.
Egan Bernal appeared from the shadows. At the beginning of the year, he was seen as a third string in Team Ineos. His victory in Paris-Nice hardly registered on his odds for the Tour. But since Froome’s crash and Thomas’s departure from the Tour de Suisse, he has become the bookies’ favourite.
With 65% of the money on the three main contenders, there are some pretty good odds available on other riders. A couple of crashes, an off day or a bit of bad luck could turn the race on its head. Clearly the Ineos and Astana teams are capable of protecting their GC contenders, but so too are Movistar, EF Education First, Mitchelton-Scott, Groupama-FDJ, Bahrain Merida and others.