Kings and Queens of the Mountains


I guess that most male cyclists don’t pay much attention to the women’s leaderboards on Strava. And if they do it might just be to make some puerile remark about boys being better than girls. From a scientific perspective the comparison of male and female times leads to some interesting analysis.

Assuming both men and women have read my previous blogs on choosing the best time, weather conditions and wind directions for the segment that suits their particular strengths, we come back to basic physics.

KOM or QOM time = Work done / Power = (Work against gravity + Drag x Distance + Rolling resistance x Distance) / (Mass x Watt/kg)

Of the three components of work done, rolling resistance tends to be relatively insignificant. On a very steep hill, most of the work is done against gravity, whereas on a flat course, aerodynamic drag dominates.

The two key factors that vary between men and women are mass and power to weight ratio (watts per kilo). A survey published by the ONS in 2010 rather shockingly reported that the average British man weighed 83.6kg, with women coming in at 70.2kg. This gives a male/female ratio of 1.19. KOM/QOM cyclists would tend to be lighter than this, but if we take 72kg and 60kg, the ratio is similar, at 1.20.

Males generate more watts per kilogram due to having a higher proportion of lean muscle mass. Although power depends on many factors, including lungs, heart and efficiency of circulation, we can estimate the relative power to weight ratio by comparing the typical body composition of males and females. Feeding the ONS statistics into the Boer formula gives a lean body mass of 74% for men and 65% for women, resulting in a ratio of 1.13. This can be compared against the useful table on Training Peaks showing maximal power output in Watts/kg, for men and women, over different time periods and a range of athletic abilities. The table is based on the rows showing world record performances and average untrained efforts. For world champion five minute efforts and functional threshold powers, the ratios are consistent with the lean mass ratio. It makes sense that the ratio should be higher for shorter efforts, where the male champions are likely to be highly muscular. Apparently the relative performance is precisely 1.21 for all durations in untrained people.
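The lean mass estimate can be reproduced with a few lines of Python. This is only a sketch: the weights are the ONS figures quoted above, but the heights (175.3cm and 161.6cm) are assumptions that do not appear in the text, and the function name is purely illustrative.

```python
# Boer formula for lean body mass (kg), with weight in kg and height in cm.
def boer_lean_mass(weight_kg, height_cm, male):
    if male:
        return 0.407 * weight_kg + 0.267 * height_cm - 19.2
    return 0.252 * weight_kg + 0.473 * height_cm - 48.3

# Weights are the ONS averages quoted in the text.
# Heights are ASSUMED values, not taken from the text.
MEN = {"weight_kg": 83.6, "height_cm": 175.3}
WOMEN = {"weight_kg": 70.2, "height_cm": 161.6}

men_lean_fraction = boer_lean_mass(MEN["weight_kg"], MEN["height_cm"], male=True) / MEN["weight_kg"]
women_lean_fraction = boer_lean_mass(WOMEN["weight_kg"], WOMEN["height_cm"], male=False) / WOMEN["weight_kg"]
lean_ratio = men_lean_fraction / women_lean_fraction

print(f"men {men_lean_fraction:.0%}, women {women_lean_fraction:.0%}, ratio {lean_ratio:.2f}")
```

With these assumed heights the fractions round to the 74% and 65% quoted above, giving a ratio of about 1.13.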

[Table: maximal power output in Watts/kg for men and women, from Training Peaks]

On a steep climb, where the work done against gravity dominates, the benefit of additional male muscle mass is cancelled by the fact that this mass must be lifted, so the difference in time between the KOM and the QOM is primarily due to relative power to weight ratio. However, being smaller, women suffer from the disadvantage that the inert mass of the bike represents a larger proportion of the total mass that must be raised against gravity. This effect increases with gradient, accounting for a time difference of up to 16% on the steepest of hills.

In contrast, on a flat segment, it comes down to raw power output, so men benefit from advantages in both mass and power to weight ratio. But the power needed to overcome aerodynamic drag rises with the cube of the velocity, so the elapsed time scales inversely with the cube root of power. Furthermore, with smaller frames, women present a lower frontal area, providing a small additional advantage. So men can be expected to have a smaller time advantage of around 9%. In theory the advantage should continue to narrow as the gradient shifts downhill.
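The reasoning above can be turned into a rough numerical sketch. The rider masses (72kg and 60kg) and the power to weight ratio of 1.13 come from the text; everything else (bike mass, drag areas, rolling resistance coefficient, air density and the 5 W/kg male effort) is an assumed illustrative value, so the output indicates the shape of the effect rather than reproducing the model behind the chart.

```python
import math

G = 9.81        # gravitational acceleration, m/s^2
RHO = 1.225     # air density, kg/m^3 (assumed)
CRR = 0.005     # rolling resistance coefficient (assumed)
BIKE_KG = 8.0   # bike mass, kg (assumed)

def steady_speed(power_w, rider_kg, cda, gradient):
    """Speed (m/s) at which power balances gravity, rolling resistance and drag."""
    theta = math.atan(gradient)
    total_kg = rider_kg + BIKE_KG
    slope_force = total_kg * G * (math.sin(theta) + CRR * math.cos(theta))
    lo, hi = 0.0, 100.0
    for _ in range(100):  # bisection on the power balance
        v = (lo + hi) / 2
        required = slope_force * v + 0.5 * RHO * cda * v ** 3
        if required < power_w:
            lo = v
        else:
            hi = v
    return (lo + hi) / 2

def qom_kom_ratio(gradient, men_w_per_kg=5.0, pw_ratio=1.13):
    """QOM time / KOM time over the same distance (equals KOM speed / QOM speed)."""
    # Drag areas of 0.32 and 0.30 m^2 are assumed, with the smaller rider lower.
    men_speed = steady_speed(men_w_per_kg * 72.0, 72.0, 0.32, gradient)
    women_speed = steady_speed(men_w_per_kg / pw_ratio * 60.0, 60.0, 0.30, gradient)
    return men_speed / women_speed

for gradient in (0.0, 0.05, 0.10, 0.20):
    print(f"gradient {gradient:.0%}: QOM/KOM time ratio {qom_kom_ratio(gradient):.3f}")
```

With these assumptions the ratio comes out at roughly 1.09 on the flat and roughly 1.15 on a 20% gradient, rising steadily in between as the bike's mass and the power to weight ratio take over from aerodynamics.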

Theory versus practice

Strava publishes the KOM and QOM leaderboards for all segments, so it was relatively straightforward to check the basic model against a random selection of 1,000 segments across the UK. All leaderboards included at least 1,666 riders, with an overall average of 637 women and 5,030 men. One of the problems with the leaderboards is that they can be contaminated by spurious data, including unrealistic speeds or times set by groups riding together. To combat this, the average was taken of the top five times set on different dates, rather than simply the top KOM or QOM time.
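One way to implement the averaging described above is sketched below. It assumes that "different dates" means keeping only the fastest effort recorded on each date before averaging the best five; the function name and the sample leaderboard are made up for illustration.

```python
from statistics import mean

def robust_best_time(efforts, top_n=5):
    """Average of the fastest `top_n` times, using at most one effort per date."""
    best_by_date = {}
    for date, seconds in efforts:
        if date not in best_by_date or seconds < best_by_date[date]:
            best_by_date[date] = seconds
    fastest = sorted(best_by_date.values())[:top_n]
    return mean(fastest)

# Hypothetical leaderboard: three riders set near-identical times in one group ride.
efforts = [
    ("2017-06-03", 240), ("2017-06-03", 241), ("2017-06-03", 242),
    ("2017-05-01", 250), ("2017-07-12", 255), ("2017-08-20", 258),
    ("2017-09-02", 262), ("2017-04-15", 270),
]
benchmark = robust_best_time(efforts)
print(benchmark)
```

The group ride contributes a single time, so the benchmark is pulled towards efforts made on separate occasions rather than being dominated by one fast bunch.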

The average segment length was just under 2km, up a gradient of 3%. The following chart plots the ratio of the QOM time to the KOM time versus gradient compared with the model described above. The red line is based on the lean body mass/world record holders estimate of 1.13, whereas the average QOM/KOM ratio was 1.32. Although there is a perceptible upward slope in the data for positive gradients, the model clearly does not fit the data.

[Chart: ratio of QOM time to KOM time versus gradient, with the model shown as a red line]

Firstly, the points on the left hand side indicate that men go downhill much more fearlessly than women, suggesting a psychological explanation for the observations deviating from the model. Turning to positive gradients, there is no obvious reason to expect the weight ratio of male to female Strava riders to deviate from that of the general population, so the only parameter left to adjust is the relative power to weight ratio. According to the model, the QOM/KOM ratio should level off at the power to weight ratio for steep gradients. This seems to occur at a value of around 1.40, which is much higher than the previous estimates of 1.13 or the 1.21 for untrained people. How can we explain this?

A notable feature of the data set was that the sample of 1,000 Strava segments was completed by nearly eight times as many men as women. This, in turn, reflects the facts that there are more male than female cyclists in the UK and that men are more likely to upload, analyse, publicise and gloat over their performances than women.

Having more men than women inevitably means that the sample includes more high level male cyclists than equivalent female cyclists. So we are not comparing like with like. Referring back to the Training Peaks table of expected power to weight ratios, a figure of 1.40 suggests we are comparing women of a certain level against men of a higher category, for example, “very good” women against “excellent” men.

A further consequence of having far more men than women is that it is much more likely that the fastest men's times were recorded in the ideal conditions described in my previous blogs.


There is room for more women to enjoy cycling and this will push up the standard of performance of the average amateur rider. This would enhance the sport in the same way that the industry has benefited as more women have joined the workforce.

Froome versus Dumoulin

Many commentators have been licking their lips at the prospect of head-to-head combat between Chris Froome and Tom Dumoulin at next year’s Tour de France. It is hard to make a comparison based on their results in 2017, because they managed to avoid racing each other over the entire season of UCI World Tour races, meeting only in the World Championship Individual Time Trial, where the Dutchman was victorious. But it is intriguing to ask how Dumoulin might have done in the Tour de France and the Vuelta or, indeed, how Froome might have fared in the Giro.

Inspiration for addressing these hypothetical questions comes from an unexpected source. In 2009 Netflix awarded a $1 million prize to a team that improved the company’s technique for making film recommendations to its users, based on the star ratings assigned by viewers. The successful algorithm exploited the fact that viewers may enjoy the films that are highly rated by other users who have generally agreed on the ratings of the films they have seen in common. Initial approaches sought to classify films by genre or by the actors starring in them, in the hope of grouping viewers into similar categories. However, it turned out to be very difficult to identify which features of a film are important. An alternative is simply to let the computer crunch the data and identify the key features for itself. A method called Collaborative Filtering became one of the most popular techniques for recommender systems.

Our cycling problem shares certain characteristics with the Netflix challenge: instead of users, films and ratings, we have riders, races and results. Riders enter a selection of races over the season, preferring those where they hope to do well. Similar riders, for example sprinters, tend to finish high in the results of races where other sprinters also do well. Collaborative filtering should be able to exploit the fact that climbers, sprinters or TTers tend to finish close to each other, across a range of races.

This year’s UCI World Tour concluded with the Tour of Guangxi, completing the data set of results for 2017. After excluding team time trials, 883 riders entered 174 races, resulting in 26,966 finishers. Most races have up to 200 participants, so if you imagine a huge table with all the riders down the rows and all the races across the columns, the resulting matrix is “sparse” in the sense that there are lots of missing values for the riders who were not in a particular race. Collaborative Filtering aims to fill in the spaces, i.e. to estimate the position of a rider who did not enter a specific race. This is exactly what we would like to do for the Grand Tours.

It took a couple of minutes to fit a matrix factorisation Collaborative Filtering model, using Keras, on my MacBook Pro. Some experimenting suggested that I needed about 50 hidden factors plus a bias to come up with a reasonable fit for this data set. Taking Milan San Remo, a one-day race, as a random example, the model did a fairly good job of predicting the top ten riders for this long, hilly race with a flat finish.

Model fit (prediction)   Rider                   Actual result
1                        Peter_Sagan             2
2                        Alexander_Kristoff      4
3                        Michael_Matthews        12
4                        Edvald_Boasson_Hagen    19
5                        Sonny_Colbrelli         13
6                        Michal_Kwiatkowski      1
7                        John_Degenkolb          7
8                        Nacer_Bouhanni          8
9                        Julian_Alaphilippe      3
10                       Diego_Ulissi            40
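For readers who want to see the mechanics, the idea of matrix factorisation with a bias can be sketched in plain Python. This is not the Keras model described above: it uses hand-written stochastic gradient descent, two hidden factors rather than 50, and an entirely made-up toy data set in which the "score" is a hypothetical performance value between 0 and 1. All function and variable names are illustrative.

```python
import math
import random

def fit_matrix_factorisation(samples, n_riders, n_races, n_factors=2,
                             epochs=500, lr=0.05, reg=0.02, seed=0):
    """Fit score ~ mean + rider bias + race bias + rider_vec . race_vec by SGD."""
    rng = random.Random(seed)
    rider_vecs = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_riders)]
    race_vecs = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_races)]
    rider_bias = [0.0] * n_riders
    race_bias = [0.0] * n_races
    mean_score = sum(score for _, _, score in samples) / len(samples)

    def predict(rider, race):
        dot = sum(p * q for p, q in zip(rider_vecs[rider], race_vecs[race]))
        return mean_score + rider_bias[rider] + race_bias[race] + dot

    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for rider, race, score in order:
            err = score - predict(rider, race)
            rider_bias[rider] += lr * (err - reg * rider_bias[rider])
            race_bias[race] += lr * (err - reg * race_bias[race])
            for k in range(n_factors):
                p, q = rider_vecs[rider][k], race_vecs[race][k]
                rider_vecs[rider][k] += lr * (err * q - reg * p)
                race_vecs[race][k] += lr * (err * p - reg * q)
    return predict

# Toy data (made up): riders 0-2 are "sprinters", riders 3-5 are "climbers";
# races 0-2 are "flat", races 3-5 are "mountain". Score is 0.8 on a match, else 0.2.
def true_score(rider, race):
    return 0.8 if (rider < 3) == (race < 3) else 0.2

held_out = {(0, 0), (3, 1)}  # pretend these riders did not enter these races
samples = [(r, c, true_score(r, c))
           for r in range(6) for c in range(6) if (r, c) not in held_out]

predict = fit_matrix_factorisation(samples, n_riders=6, n_races=6)
train_rmse = math.sqrt(sum((s - predict(r, c)) ** 2 for r, c, s in samples) / len(samples))
print(f"training RMSE {train_rmse:.3f}")
print(f"sprinter in unseen flat race: {predict(0, 0):.2f}")
print(f"climber in unseen flat race:  {predict(3, 1):.2f}")
```

The two held-out entries play the role of a Grand Tour that a rider skipped: the model has never seen them, yet it should rate the sprinter well above the climber in a flat race, because each resembles other riders whose results in that race are known.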

The following figure visualises the primary factors the model derived for classifying the best riders. Sprinters are in the lower part of the chart, with climbers towards the top and all-rounders in the middle. Those with a lot of wins are towards the left.

[Figure: primary factors derived by the model for the best riders]

Now we come to the interesting part: how would Tom Dumoulin and Chris Froome have compared in the other’s Grand Tours? Note that this model takes account of the results of all the riders in all the races, so it should be capable of detecting the benefit of being part of a strong team.

Tour de France

The model suggested that Tom Dumoulin would have beaten Chris Froome in stages 1(TT), 2, 5, 6, 10 and 21, but the yellow jersey winner would have been stronger in the mountains and won overall.

Giro d’Italia

The model suggested that Chris Froome would have been ahead in the majority of stages, leaving stages 4, 5, 6, 9, 10(TT), 14 and 21(TT) to Dumoulin. The Brit would have most likely claimed the pink jersey.

Vuelta a España

The model suggested that Tom Dumoulin would have beaten Chris Froome in stages 2, 4, 12, 18, 19 and 21. In spite of a surge by the Dutchman towards the end of the race, the red jersey would have remained with Froome.


Based on a Collaborative Filtering approach, the results of 2017 suggest that Chris Froome would have beaten Tom Dumoulin in any of the Grand Tours.

Ranking Top Pro Cyclists for 2017


Following Il Lombardia last weekend, the World Tour has only two more events this year. It is time to ask who were the best sprinters of 2017? Who was the best climber or puncheur? The simplest approach is to count up the number of wins, but this ignores the achievement of finishing consistently among the top riders on different types of parcours. This article explores ways of creating rankings for different types of riders.

The current UCI points system, introduced in 2016, is fiendishly complicated, with points awarded for winning races and bonuses given to those wearing certain jerseys in stage races. The approach applies different scales according to the type of event, but each of these scales puts a premium on winning the race, with points awarded for first place being just over double the reward of the fifth-placed rider. In fact, taking the top 20 places in the four main world tour categories of event, the curve of best fit is exponential with a coefficient of approximately -1/6. In other words, there’s a linear relationship between a rider’s finishing position and the logarithm of the UCI points awarded.
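The exponential relationship is easy to check numerically. The sketch below uses the approximate coefficient of -1/6 quoted above and normalises the winner's points to 1; the function name is illustrative and no actual UCI points tables are used.

```python
import math

COEFFICIENT = -1 / 6  # approximate fitted value quoted in the text

def relative_points(position):
    """Points relative to the winner under the exponential fit (winner = 1.0)."""
    return math.exp(COEFFICIENT * (position - 1))

# Ratio of the winner's points to those of the fifth-placed rider.
first_to_fifth = relative_points(1) / relative_points(5)

# Step in log(points) from each position to the next, for the top 20 places.
log_steps = [math.log(relative_points(n + 1)) - math.log(relative_points(n))
             for n in range(1, 20)]

print(f"first/fifth ratio: {first_to_fifth:.2f}")
print(f"log step per place: {log_steps[0]:.4f}")
```

Every step in the logarithm is the same, which is the linear relationship described above, and the first-to-fifth ratio of about 1.95 is close to the factor of roughly two seen in the UCI scales.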

UCI Points

This observation is really useful, because it provides a straightforward way of assessing riders' performance in different types of races, based on their finishing positions. The PCS web site is a great source of professional cycling statistics. One nice feature is that most of the races/stages have an associated profile indicated by a little logo, see Tour de France. These classify races into the following categories:

  • Flat e.g. TdF stage 2 from Düsseldorf to Liège
  • Hills with a flat finish e.g. Milan San Remo
  • Hills with an uphill finish e.g. Fleche Wallonne
  • Mountains with a flat finish e.g. TdF stage 8 Station des Rousses
  • Mountains with an uphill finish e.g. TdF stage 5 La Planche des Belles Filles
  • It is also reasonable to assume that any stage of less than 80km was a TT

We would expect outright sprinters to top the rankings in flat races, whereas the puncheurs come to the fore when it becomes hilly, with certain riders doing particularly well on steep uphill finishes. The climbers come into their own in the mountains, with some being especially strong on summit finishes.

Taking the results of all the World Tour races in 2017 completed up to Il Lombardia and applying the simple -1/6 exponential formula equally to all categories of event, we obtain the following “derived ranking”, arranged by the profile of event.
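A derived ranking of this kind could be built along the following lines. The sketch assumes that each result contributes exp(-position/6) and that a rider's contributions are summed within each profile; the text does not spell out the aggregation, and the riders and results shown are invented placeholders.

```python
import math
from collections import defaultdict

def derived_ranking(results):
    """Rank riders within each race profile by summed exp(-position / 6)."""
    scores = defaultdict(lambda: defaultdict(float))
    for rider, profile, position in results:
        scores[profile][rider] += math.exp(-position / 6)
    return {
        profile: sorted(riders.items(), key=lambda item: item[1], reverse=True)
        for profile, riders in scores.items()
    }

# Invented results: (rider, profile, finishing position).
results = [
    ("Rider A", "flat", 1), ("Rider A", "flat", 3),
    ("Rider B", "flat", 2), ("Rider B", "flat", 2),
    ("Rider C", "flat", 1),
    ("Rider C", "mountain", 1), ("Rider B", "mountain", 40),
]

ranking = derived_ranking(results)
for profile, riders in ranking.items():
    print(profile, [(name, round(score, 3)) for name, score in riders])
```

In this toy example a win plus a third place narrowly outscores two second places, while a 40th place adds almost nothing, which is how the exponential weighting rewards consistent finishes near the front.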

Derived ranking for 2017 World Tour events, according to parcours


Marcel Kittel rightly tops the sprinters on flat courses (while Cavendish was 11th), but the Katusha Alpecin rider and several others have tended to be dropped on hilly courses, where Sagan, Ewan and Kristoff were joined by Trentin, Gaviria and some classic puncheurs. Sagan managed to win some notable uphill finishes, such as Tirreno-Adriatico and Grand Prix Cycliste de Quebec, alongside riders noted for being strong in the hills. The aggression of Valverde and Contador put them ahead of Froome on mountain stages that finished on the flat, but the TdF winner, Zakarin and Bardet topped the rankings of pure climbers for consistency on summit finishes. Finally we see the usual suspects topping the TT rankings.

It should be noted that ranking performances based simply on positions, without some form of scaling, gave very unintuitive results. While simpler than the UCI points system, this analysis supports the idea of awarding points in a way that scales exponentially with the finishing position of a rider.



Applying science for performance

The principal objective of this site is to apply scientific methods that improve performance in sport. The increasing use of wearable sensors provides a growing source of data that is ripe for the application of machine learning algorithms and model-based statistical analysis. The aim is to provide new insights into the performance of individuals and teams.