## Cycling Data Science – building models

In the previous blog, I explored the structure of a data set of summary statistics from over 800 rides recorded on my Garmin device. The K-means algorithm was an example of unsupervised learning that identified clusters of similar observations without using any identifying labels. The Orange software, used previously, makes it extremely easy to compare a number of simple models that map a ride’s statistics to its type: race, turbo trainer or just a training ride. Here we consider Decision Trees, Random Forests and Support Vector Machines.

### Decision Trees

Perhaps the most basic approach is to build a Decision Tree. The algorithm finds an efficient way to make a series of binary splits of the data set, in order to arrive at a set of criteria that separates the classes, as illustrated below.

The first split separates the majority of training rides from races and turbo trainer sessions, based on an average speed of 35.8km/h. Then Average Power Variance helped identify races, as observed in the previous blog. After this, turbo trainer sessions seemed to have a high level of TISS Aerobicity, which relates to the percentage of effort done aerobically. Pedalling balance, fastest 500m and duration separated the remaining rides. An attractive way to display these decisions is to create a Pythagorean Tree, where the sides of each triangle relate to the number of observations split by each decision.

### Random Forests

Many alternative sets of decisions could separate the data, where any particular tree can be quite sensitive to specific observations. A Random Forest addresses this issue by creating a collection of different decision trees and choosing the class by majority vote. This is the Pythagorean Forest representation of 16 trees, each with six branches.

### Support Vector Machines

A Support Vector Machine (SVM) is a widely used model for solving this kind of categorisation problem. The training algorithm finds an efficient way to slice the data, that largely separates the categories, while allowing for some overlap. The points that are closest to the slices are called support vectors. It is tricky to display the results in such a high dimensional space, but the following scatter plot displays Average Power Variance versus Average Speed, where the support vectors are shown as filled circles.

### Comparison of results

A Confusion Matrix provides a convenient way to compare the accuracy of the models. This correlates the predictions versus the actual category labels. Out of the 809 rides, only 684 were labelled. The Decision Tree incorrectly labelled 20 races and 7 turbos as training rides. The Random Forest did the best job, with only six misclassifications, while the SVM made 11 errors.

Looking at the classification errors can be very informative. It turns out that the two training rides classified as races by the SVM had been accidentally mislabelled – they were in fact races! Furthermore, looking at the five races the that SVM classified as training rides, I punctured in one, I crashed in another and in a third race, I was dropped from the lead group, but eventually rolled in a long way behind with a grupetto. The Random Forest also found an alpine race where my Garmin battery failed and classified it as a training ride. So the misclassifications were largely understandable.

After correcting the data set for mislabelled rides, the Random Forest improved to just two errors and the SVM dropped to just eight errors. The Decision Tree deteriorated to 37 errors, though it did recognise that the climbing rate tends to be zero on a turbo training session.

### Prediction

Having trained three models, we can take a look at the sample of 125 unlabelled rides. The following chart shows the predictions of the Random Forest model. It correctly identified one race and suggested several turbo trainer sessions. The SVM also found another race.

### Conclusions

Several lessons can be learned from these experiments. Firstly, it is very helpful to start with a clean data set. But if this is not the case, looking at the misclassified results of a decent model can be useful in catching mislabelled data. The SVM seemed to be good for this task, as it had more flexibility to fit the data than the Decision Tree, but it was less prone to overfit the data than the Random Forest.

The Decision Tree was helpful in quickly identifying average speed and power variance (chart below) as the two key variables. The SVM and Random Forest were both pretty good, but less transparent. One might improve on the results by combining these two models.

The next blog will explore this topic further.

## Cycling Data Science – clusters

Data Science is a hot topic that is impacting a range of diverse areas from business to sport. With so many cyclists collecting and uploading their data, there is plenty of raw material from which to draw interesting insights. This is the first in a series of articles exploring applications of data science in the field of cycling, beginning with the concept of clustering.

As a data set, I took all my Garmin files covering 2014-2017. Having previously uploaded them onto Golden Cheetah (GC), I took advantage of the API that allows external programmes, such as Python, to retrieve data. I also used a Python library to download the same rides from Strava, where I had recorded additional information about the rides.
After a certain amount of (rather time-consuming) tidying up, I ended up with over 800 rides. Each ride had over 200 summary statistics calculated by GC, as well as other meta-data, such as whether the ride was a race or turbo session. The metrics included all the standard items, such as time, distance, speed, heart rate, power, elevation gain, TSS, normalised power, as well as more esoteric metrics like “Time expended when Power is above CP and W’ bal is between 50% and 75% of W'”. When each ride is represented by a point in 200-dimensional space, it is easy to be overwhelmed. As a coach or an informed rider, which metrics are the most meaningful? This is precisely where data science steps in.
I decided to use some open source machine learning and data visualisation software called Orange. This makes it very straightforward to set up simple pipelines using a toolbox of standard approaches, as illustrated above.
One of the first things to do was to ask the computer to look for clusters of rides with similar characteristics. Orange has a useful feature that finds informative projections of the data that can be displayed on a scatter plot. As a first cut, the K-means algorithm categorised the data into four clusters that were largely explained by the time of day and the duration of the ride.

Although this makes a pretty graph, it simply tells us that I start a lot of rides in the morning, but do quite a few in the afternoon and evening. The green cluster includes my longer rides that rather obviously have to start earlier in the day. The scale is annoyingly shown in seconds, so a duration of 1800 would be a five hour ride. The blue band runs from about 1:30pm to about 6:30pm.

Grouping rides by time of day was not very helpful, so I filtered out that variable and searched again for rides that were similar in terms of effort. This made the results much more interesting. Distance and Average Power Variance (APV) were among the most informative metrics. The following scatter plot does a very good job of separating out races (shown in green), from normal rides and turbo trainer sessions (red). The points I did not have time to label are shown in grey.
Average Power Variance measures the mean power deviation with respect to its 30 second moving average. This will be high when power output is continually changing sharply, as it does on very short town centre courses or the Crystal Palace loop, where you are repeatedly sprinting out of corners. When racing on the Hillingdon and Dunsfold circuits or longer Surrey League routes, power is still much more variable than on a club ride. The band of Saturday club riders is very obvious at 53km: four laps of Richmond Park, with varying levels of APV depending on how aggressively the group was riding. You can also see that I quite often do only one or two laps, at about 19km and 30km. Short TTs and hill climb races tend to have less power variability. This was also the case on the endlessly long climbs encountered on the Haute Route. Lastly, turbo sessions have much lower APV because, even if target power levels vary, they tend to be sustained at the same level for each segment.
It is worth noting that APV is not correlated with the Variability Index, which is the ratio of normalised power to average power. APV is affected by continual changes in power output, whereas the Variability Index is strongly affected by power peaks, even if they a relatively few. The two power files below illustrate the difference.

## Conclusions

This analysis draws attention to Average Power Variance as a useful metric that is high for circuit and road races, but lower for TTs and long hilly races. The key observation for me is that relatively little of my training has a high APV.

The next part in this series zooms in on the races, to identify metrics associated with good and bad results.

## Kings and Queens of the Mountains

I guess that most male cyclists don’t pay much attention to the women’s leaderboards on Strava. And if they do it might just be to make some puerile remark about boys being better than girls. From a scientific perspective the comparison of male and female times leads to some interesting analysis.

Assuming both men and women have read my previous blogs on choosing the best time, weather conditions and wind directions for the segment that suits their particular strengths, we come back to basic physics.

KOM or QOM time = Work done / Power = (Work against gravity + Drag x Distance + Rolling resistance x Distance) / (Mass x Watt/kg)

Of the three components of work done, rolling resistance tends to be relatively insignificant. On a very steep hill, most of the work is done against gravity, whereas on a flat course, aerodynamic drag dominates.

The two key factors that vary between men and women are mass and power to weight ratio (watts per kilo).  A survey published by the ONS in 2010, rather shockingly reported that the average British man weighed 83.6kg, with women coming in at 70.2kg. This gives a male/female ratio of 1.19. KOM/QOM cyclists would tend to be lighter than this, but if we take 72kg and 60kg, the ratio is still 1.20.

Males generate more watts per kilogram due to having a higher proportion of lean muscle mass. Although power depends on many factors, including lungs, heart and efficiency of circulation, we can estimate the relative power to weight ratio by comparing the typical body composition of males and females. Feeding the ONS statistics into the Boer formula gives a lean body mass of 74% for men and 65% for women, resulting in a ratio of 1.13. This can be compared against the the useful table on Training Peaks showing maximal power output in Watts/kg, for men and women, over different time periods and a range of athletic abilities. The table is based on the rows showing world record performances and average untrained efforts.  For world champion five minute efforts and functional threshold powers, the ratios are consistent with the lean mass ratio. It makes sense that the ratio should be higher for shorter efforts, where the male champions are likely to be highly muscular. Apparently the relative performance is precisely 1.21 for all durations in untrained people.

On a steep climb, where the work done against gravity dominates, the benefit of additional male muscle mass is cancelled by the fact that this mass must be lifted, so the difference in time between the KOM and the QOM is primarily due to relative power to weight ratio. However, being smaller, women suffer from the disadvantage that the inert mass of bike represents a larger proportion of the total mass that must be raised against gravity. This effect increases with gradient. Accounting for a time difference of up to 16% on the steepest of hills.

In contrast, on a flat segment, it comes down to raw power output, so men benefit from advantages in both mass and power to weight ratio. But power relates to the cube of the velocity, so the elapsed time scales inversely with the cube root of power. Furthermore, with smaller frames, women present a lower frontal area, providing a small additional advantage. So men can be expected to have a smaller time advantage of around 9%. In theory the advantage should continue to narrow as the gradient shifts downhill.

## Theory versus practice

Strava publishes the KOM and QOM leaderboards for all segments, so it was relatively straightforward to check the basic model against a random selection of 1,000 segments across the UK. All  leaderboards included at least 1,666 riders, with an overall average of 637 women and 5,030 men. One of the problems with the leaderboards is that they can be contaminated by spurious data, including unrealistic speeds or times set by groups riding together. To combat this, the average was taken of the top five times set on different dates, rather than simply to top KOM or QOM time.

The average segment length was just under 2km, up a gradient of 3%. The following chart plots the ratio of the QOM time to the KOM time versus gradient compared with the model described above. The red line is based on the lean body mass/world record holders estimate of 1.13, whereas the average QOM/KOM ratio was 1.32. Although there is a perceivable upward slope in the data for positive gradients, clearly this does not fit the data.

Firstly, the points on the left hand side indicate that men go downhill much more fearlessly than women, suggesting a psychological explanation for the observations deviating from the model. To make the model fit better for positive gradients, there is no obvious reason to expect the weight ratio of male to female Strava riders to deviate from the general population, so this leaves only the relative power to weight ratio. According to the model the QOM/KOM ratio should level off to the power to weight ratio for steep gradients. This seems to occur for a value of around 1.40, which is much higher than the previous estimates of 1.13 or the 1.21 for untrained people. How can we explain this?

A notable feature of the data set was that sample of 1,000 Strava segments was completed by nearly eight times as many men as women. This, in turn reflects the facts that there are more male than female cyclists in the UK and that men are more likely to upload, analyse, publicise and gloat over their performances than women.

Having more men than women, inevitably means that the sample includes more high level male cyclists than equivalent female cyclists. So we are not comparing like with like. Referring back to the Training Peaks table of expected power to weight ratios, a figure of 1.40 suggests we are comparing women of a certain level against men of a higher category, for example, “very good” women against “excellent” men.

A further consequence of having far more men than women is that is much more likely that the fastest times were recorded in the ideal conditions described in my previous blogs listed earlier.

## Conclusions

There is room for more women to enjoy cycling and this will push up the standard of performance of the average amateur rider. This would enhance the sport in the same way that the industry has benefited as more women have joined the workforce.

## Froome versus Dumoulin

Many commentators have been licking their lips at the prospect of head-to-head combat between Chris Froome and Tom Dumoulin at next year’s Tour de France. It is hard to make a comparison based on their results in 2017, because they managed to avoid racing each other over the entire season of UCI World Tour races, meeting only in the World Championship Individual Time Trial, where the Dutchman was victorious. But it is intriguing to ask how Dumoulin might have done in the Tour de France and the Vuelta or, indeed, how Froome might have fared in the Giro.

Inspiration for addressing these hypothetical questions comes from an unexpected source. In 2009 Netflix awarded a \$1million prize to a team that improved the company’s technique for making film recommendations to its users, based on the star ratings assigned by viewers. The successful algorithm exploited the fact that viewers may enjoy the films that are highly rated by other users who have generally agreed on the ratings of the films they have seen in common. Initial approaches sought to classify films into genres or those starring particular actors, in the hope of grouping together viewers into similar categories. However, it turned out to be very difficult to identify which features of a film are important. An alternative is simply to let the computer crunch the data and identify  the key features for itself. A method called Collaborative Filtering became one of the most popular employed for recommender systems.

Our cycling problem shares certain characteristics with the Netflix challenge: instead of users, films and ratings, we have riders, races and results. Riders enter a selection of races over the season, preferring those where they hope to do well. Similar riders, for example sprinters, tend to finish high in the results of races where other sprinters also do well. Collaborative filtering should be able to exploit the fact that climbers, sprinters or TTers tend to finish close to each other, across a range of races.

This year’s UCI World Tour concluded with the Tour of Guangxi, completing the data set of results for 2017. After excluding team time trials, 883 riders entered 174 races, resulting in 26,966 finishers. Most races have up to 200 participants , so if you imagine a huge table with all the racers down the rows and all the races across the columns, the resulting matrix is “sparse” in the sense that there are lots of missing values for the riders who were not in a particular race. Collaborative Filtering aims to fill in the spaces, i.e. to estimate the position of a rider who did not enter a specific race. This is exactly what we would like to do for the Grand Tours.

It took a couple of minutes to fit a matrix factorisation Collaborative Filtering model, using keras, on my MacBook Pro. Some experimenting suggested that I needed about 50 hidden factors plus a bias to come up with a reasonable fit for this data set. Taking at random the Milan San Remo one day stage race, it did a fairly good job of predicting the top ten riders for this long, hilly race with a flat finish.

Model fit (prediction) Rider Actual result
1 Peter_Sagan 2
2 Alexander_Kristoff 4
3 Michael_Matthews 12
4 Edvald_Boasson_Hagen 19
5 Sonny_Colbrelli 13
6 Michal_Kwiatkowski 1
7 John_Degenkolb 7
8 nacer_Bouhanni 8
9 Julian_Alaphilippe 3
10 Diego_Ulissi 40

The following figure visualises the primary factors the model derived for classifying the best riders. Sprinters are in the lower part of chart, with climbers towards the top and allrounders in the middle. Those with a lot of wins are towards the left.

Now we come to the interesting part: how would Tom Dumoulin and Chris Froome have compared in the other’s Grand Tours? Note that this model takes account of the results of all the riders in all the races, so it should be capable of detecting the benefit of being part of a strong team.

## Tour de France

The model suggested that Tom Dumoulin would have beaten Chris Froome in stages 1(TT), 2, 5, 6, 10 and 21, but the yellow jersey winner would have been stronger in the mountains and won overall.

## Giro d’Italia

The model suggested that Chris Froome would have been ahead in the majority of stages, leaving stages 4, 5, 6, 9,  10(TT), 14 and 21(TT) to Dumoulin. The Brit would have most likely claimed the pink jersey.

## Vuelta a España

The model suggested that Tom Dumoulin would have beaten Chris Froome in stages 2, 4, 12, 18, 19 and 21. In spite of a surge by the Dutchman towards the end of the race, the red jersey would have remained with Froome.

## Conclusions

Based on a Collaborative Filtering approach, the results of 2017 suggest that Chris Froome would have beaten Tom Dumoulin in any of the Grand Tours.

## Ranking Top Pro Cyclists for 2017

Following Il Lombardia last weekend, the World Tour has only two more events this year. It is time to ask who were the best sprinters of 2017? Who was the best climber or puncheur? The simplest approach is to count up the number of wins, but this ignores the achievement of finishing consistently among the top riders on different types of parcours. This article explores ways of creating rankings for different types of riders.

The current UCI points system, introduced in 2016, is fiendishly complicated, with points awarded for winning races and bonuses given to those wearing certain jerseys in stage races. The approach applies different scales according to the type of event, but each of these scales puts a premium on winning the race, with points awarded for first place being just over double the reward of the fifth-placed rider. In fact, taking the top 20 places in the four main world tour categories of event, the curve of best fit is exponential with a coefficient of approximately -1/6. In other words, there’s a linear relationship between a rider’s finishing position and the logarithm of the UCI points awarded.

This observation is really useful, because it provides a straightforward way of assessing the performance in different types of races, based on their finishing positions. The  PCS web site is great source of  professional cycling statistics. One nice feature is that most of the races/stages have an associated profile indicated by a little logo, see Tour de France. These classify races into the following categories:

• Flat e.g. TdF stage 2 from Düsseldorf to Liège
• Hills with a flat finish e.g. Milan San Remo
• Hills with an uphill finish e.g. Fleche Wallonne
• Mountains with a flat finish e.g. TdF stage 8 Station des Rousses
• Mountains with an uphill finish e.g. TdF stage 5 La Planche des Belles Filles
• It is also reasonable to assume that any stage of less than 80km was a TT

We would expect outright sprinters to top the rankings in flat races, whereas the puncheurs come to the fore when it becomes hilly, with certain riders doing particularly well on steep uphill finishes. The climbers come into their own in the mountains, with some being especially strong on summit finishes.

Taking the results of all the World Tour races in 2017 completed up to Il Lobardia and applying the simple -1/6 exponential formula equally to all categories of event,  we obtain the following “derived ranking”,  arranged by the profile of event.

### Derived ranking for 2017 World Tour events, according to parcours

Marcel Kittel rightly tops the sprinters on flat courses (while Cavendish was 11th), but the Katusha Alpecin rider and several others have tended to be dropped on hilly courses, where Sagan, Ewan and Kristoff were joined by Trentin, Gaviria and some classic puncheurs. Sagan managed to win some notable uphill finishes, such as Tirreno-Adriatico and Grand Prix Cycliste de Quebec, alongside riders noted for being strong in the hills. The aggression of Valverde and Contador put them ahead of Froome on mountain stages that finished on the flat, but the TdF winner, Zakarin and Bardet topped the rankings of pure climbers for consistency on summit finishes. Finally we see the usual suspects topping the TT rankings.

It should be noted that ranking performances based simply on positions, without some form of scaling, gave very unintuitive results. While simpler than the UCI points system, this analysis supports the idea of awarding points in a way that scales exponentially with the finishing position of a rider.

## Deep Learning – Faking It

My last blog showed the results of using a deep convolutional neural network to apply different artistic styles to a photograph of cyclist.  This article looks at the trendy topic of Generative Adversarial Networks (GANs). Specifically, I investigate the application of a Wasserstein GAN to generate thumbnail images of bicycles.

In the field of machine learning, a generative model is a model designed to produce examples from a particular target distribution. In statistics, the output might be samples from a Gaussian distribution, but we can extend the idea to create a model that produces examples of sonnets in the style of Shakespeare or pictures of cats… or bicycles.

The adversarial framework introduces an attractive idea from game theory: to create a competitive form of learning. While a generator learns from a corpus of real examples how to create realistic “fakes”, a discriminator (or critic) learns to distinguish been fakes and authentic examples. In fact, the generator is given the objective of trying to fool the discriminator. As the discriminator improves, the generator is driven to enhance the authenticity of its output. This creates a virtuous cycle.

When originally proposed in 2014, Generative Adversarial Networks stimulated much interest, but it proved hard to make them work reliably in practice. One problem was “mode collapse”, where the generator becomes stuck, producing the same output all the time. However, this changed with the publication of a recent paper, explaining how earlier problems could be overcome by using a so-called Wasserstein loss function.

As an experiment, I downloaded a batch of images of bicycles from the Internet. After manually removing pictures with riders and close-ups of components, there were about 1,200 side views of road bikes (mostly with handlebars to the right, so you can see the chainset). After a few experiments, I reduced the dataset to the 862 images, by automatically selecting bikes against a white background.

As a participant of part 2 of the excellent fast.ai deep learning course, I made use of WGAN code that runs using Pytorch. I loaded the bike images at thumbnail size of 64×64 (training with larger images exceeded the memory constraints of the p2.large GPU I’m running on AWS). It was initially disappointing to experience the mode collapse problem, especially because the authors of the WGAN paper claimed never to have encountered it. However, speeding up the learning rate of the generator seemed to solve the problem.

Although each fake was created from a completely random starting point, the generator learned to produce images against a white background, with two circles joined by lines. After a couple of hundred iterations the WGAN began to generate some recognisably bicycle-like images. Notice the huge variety. Some of the best ones are shown at the top of this post.

I tried to improve the WGAN’s images, using another deep learning tool: super resolution. This amazing technique is used to solve the seemingly impossible task of converting images from low resolution to high resolution. It is achieved by taking downgraded versions of a large dataset of high resolution images, then training a neural network to reproduce a high-res version from the corresponding low-res input. A super resolution network is able to learn about certain properties of the world, for example, it converts jagged curves into smooth ones – a feature I’d hoped might be useful for making wheels look rounder.

#### Example of a super resolution network on real photographs

Unfortunately, my super resolution experiments did not lead to the improvement I’d hoped for. Two possible explanations are that a) the fake images were not low-res photos and b) the network had been trained on many types of images other than bicycles with white backgrounds.

#### Example of super resolution network on a fake bicycle image

In the end I was pretty happy with the best of the 64×64 images shown above. They are at least as good as something I could draw by hand. This is an impressive example of unsupervised learning. The trained network is able to use some learned notion of what a bicycle looks like in order to produce new images that possess similar properties. With more time and training, I’m sure the WGAN could be improved, perhaps to the point where the images might provide creative inspiration for new bike designs.

### References

Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., … Bengio, Y. (2014). Generative Adversarial Networks.

Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein GAN.

Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual Losses for Real-Time Style Transfer and Super-Resolution.

## Deep Learning – Cycling Art

I’ve always be fascinated by the field of artificial intelligence, but it is only recently that significant and rapid advances have been made, particularly in the area of deep learning, where artificial neural networks are able to learn complex relationships. Back in the early 1990s, I experimented with forecasting share prices using neural networks. Performance was not much better than the linear models we were using at the time, so we never managed money this way, though I did publish a paper on the topic.

I am currently following an amazing course offered by fast.ai that explains how to programme and implement state of the art techniques in deep learning. Image recognition is one of the most interesting applications. Convolutional neural networks are able to recognise the content and style of images. It is possible to explore what the network has “learnt” by examining the content of the intermediate layers, between the input and the output.

Over the last week I have been playing around with some Python code, provided for the course, that uses a package called keras to build and run networks on a GPU using Google’s TensorFlow infrastructure. Starting with a modified version of the publicly available network called VGG16, which has been trained to recognise images, the idea is to combine the content a photograph with the style of an artist.

An image is presented to the network as an array of pixel values. These are passed through successive layers, where a series of transformations is performed. These allow the network to recognise increasingly complex features of the original image. The content of the image is captured by refining an initially random set of pixels, until it generates similar higher level features.

The style of an artist is represented in a slightly different way. This time an initially random set of pixels is modified until it matches the overall mixture of colours and textures, in the absence of positional information.

Finally, a new image is created, again initially from random, but this time matching both the content of the photograph and the style of the artist. The whole process takes about half an hour on my MacBook Pro, though I also have access to a high-spec GPU on Amazon Web Services to run things faster.

Here are some examples of a cyclist in the styles of Cézanne, Braque, Monet and Dali. The Cézanne image worked pretty well. I scaled up the content versus style for Braque. The Monet picture confuses the sky and trees. And the Dali result is just weird.

### References

Trained to Forecast – Risk Magazine, January 1993

Deep Learning for Coders

A Neural Algorithm of Artistic Style, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge

## Chain reactions

At this year’s Royal Society Summer Exhibition, scientists and engineers from Bristol University presented some interesting work on improvements to the drive chains used by Team GB in the Rio Olympics. They reached clear conclusions about the design of the chain and sprockets, taken up by Renold. Current research is exploring the the problem of chain resonance.

Bicycle chains and sprockets and sprockets tend to receive less attention than aerodynamics, for several reasons. As noted in previous blogs, the power required to overcome aerodynamic drag scales with the cube of velocity, whereas frictional effects scale simply in proportion to velocity. Furthermore, a good well-lubricated drive chain typically has an efficiency of around 95% or more, so it is hard to make further improvements. Note that a dirty chain has significantly lower efficiency, so you should certainly keep your bike clean.

The loss of power comes from the friction between links as they bend around the chainring and the rear sprocket. Using a high precision rig, the researchers demonstrated that larger sprockets are more efficient than smaller ones. For example, with a gear ratio of 4:1, it is more efficient to use a 64/16 than a more conventional 52/13.

In fact, one of the experts told me that the efficiency of the drive chain falls off sharply as the sprocket size is reduced from 13 to 12 to 11. This is because the chain has to bend around a much sharper angle for a smaller sprocket. If you think about it, the straight chain has to bend to a certain angle that depends on the number of teeth on the sprocket. Recalling some school maths about the interior angles of polygons, for 16 teeth, the angle is 157.5º, whereas for 11 teeth, the angle is 147.3º. For the larger sprocket, each pair of links overcomes less friction bending through 22.5º and back, compared with a more dramatic 32.7º and back for the smaller one.

Note that this analysis of the rear sprocket applies to single speed track bikes. On a road bike the chain has to pass the two derailleur cogs, which typically have 13 teeth, whatever gear you choose. However, the argument still applies to the chainring  at the front, where the gains of going larger were shown to exceed the additional aerodynamic drag.

The Bristol team also explored the effect of a number of other factors on performance. Using different length links obviously requires customised sprockets and chainrings. This would be a major upheaval for the industry, but it is possible for purpose-built track bikes. Certain molybdenum-based lubricating powders used in the space industry may be better than traditional oils. Other materials could replace traditional steel.

A different kind of power loss can occur when the chain resonates vertically. A specially designed test rig showed that this can occur at frequencies, which could be triggered at certain pedalling cadences. Current research is investigating how the tension of the chain and its design can help mitigate this problem (which is also an issue for motor cycles).

In conclusion, when we see Tony Martin pushing a 58+ chainring, it may not be simply an act of machismo – he is actually be benefitting from efficiency gains.

## Update on cycling aerodynamics

A recently published paper provides a useful review of competition cycling aerodynamics. It looks at the results of a wide range of academic studies, highlighting the significant advances made in the last 5 to 10 years.

The power required to overcome aerodynamic drag rises with the cube of velocity, so riding at 50km/h takes almost twice as much power as riding at 40km/h. At racing speed, around 80% of a cyclist’s power goes into overcoming aerodynamic drag. This is largely because a bike and rider are not very streamlined, resulting in a turbulent wake.

The authors quote drag coefficients, Cd, of 0.8 for upright and 0.6 for TT positions. These compare with 0.07 for a recumbent bike with fairing, indicating that there is huge room for improvement.

Wind tunnels, originally used in the aerospace and automotive industries, are now being designed specifically for cycling, though no specific standards have been adopted. These provide a simplification of environmental conditions, but they can be used to study air flow for different body positions and equipment. Mannequins are often used in research, as one of the difficulties for riders is the ability to repeat and maintain exactly the same position. Some tunnels employ cameras to track movements. Usually a drag area measurement, CdA, is reported, rather than Cd, thereby avoiding uncertainty due to measurement of frontal area, though this can be estimated by counting pixels in a image.

One thing that makes cycling particularly complex is the action of pedalling. This creates asymmetric high drag forces as one leg goes up and the other goes down, resulting in variations of up to 20% relative to a horizontal crank position.

Cycling has been studied using computational fluid dynamics, helping to save on wind tunnel costs. These use fine mesh models to calculate details of flow separation and pressure variations across the cyclist’s body. The better models are in good agreement with wind tunnel experiments.

Cycling speed is a maximum optimisation problem between aerodynamic and biomechanical efficiency

Ultimately, scientists need to do field tests. The extensive use of power meters allows cyclists to experiment for themselves. The authors provide two practical ways to separate the coefficient of rolling resistance, Crr,  from CdA. One based on rolling to a halt and the other using a series of short rides at constant speed.

Minimising aerodynamic resistance through rider position is one of the most effective ways to improve performance among well-trained athletes

Compared with riding upright on the hoods, moving to the drops saves 15% to 20% while adopting a TT position saves 30% to 35%. Studies show quite a lot of variance in these figures, as the results depend on whether the rider is pedalling, as well as body size. The following quote suggests that when freewheeling downhill in an aero tuck, your crank should be horizontal (unless you are cornering).

Current research suggests that the drag coefficient of a pedalling cyclist is ≈6% higher than that of a static cyclist holding a horizontal crank position

The authors quote the figures for CdA of 0.30-0.50 for an upright position, 0.25 to 0.30 on the drops and 0.20-0.25 for a TT position. Variation is largely, but not only, due to changes in frontal area, A. Unfortunately, relatively minor changes in position can have large effects on drag, but the following effects were noted.

Broker and Kyle note that rider positions that result in a flat back, a low tucked head and forearms positioned parallel to the bicycle frame generally have low aerodynamic drag. Wind tunnel investigations into a wide range of modifications to standard road cycling positions by Barry et al. showed that that lowering the head and torso and bringing the arms inside the silhouette of the hips reduced the aerodynamic drag.

Bike frames, wheels, helmets and skin suits are all designed with aerodynamics in mind, while remaining compliant with UCI rules. Skin suits are important, due to their large surface areas. By delaying airflow separation, textured fabrics reduce wake turbulence, resulting in as much as a 4% reduction in drag.

In race situations, drafting skills are beneficial, particularly behind a larger rider. While following riders gain a significant benefit, it has been shown that the lead rider also accrues a small advantage of around 3%. It is best to overtake very closely in order to take maximal advantage of lateral drafting effects.

For a trailing cyclist positioned immediately behind the leader, drag reduction has been reported in the range of 15–50 % and reduces to 10–30 % as the gap extends to approximately a bike length… The drafting effect is greater for the third rider than the second rider in a pace-line, but often remains nearly constant for subsequent riders

For those interested in greater detail, it is well worth looking at the full text of the paper, which is freely available.

### Reference

Riding against the wind: a review of competition cycling aerodynamics, Timothy N. CrouchEmail authorDavid BurtonZach A. LaBryKim B. Blair, Sports Engineering, June 2017, Volume 20, Issue 2, pp 81–110

## The fractal nature of GPS routes

The mathematician, Benoît Mandelbrot, once asked “How long is the coast of Britain?“. Paradoxically, the answer depends on the length of your measuring stick. Using a shorter ruler results in a longer total distance, because you take account of more minor details of the shape of the coastline. Extrapolating this idea, reducing the measurement scale down to take account of every grain of sand, the total length of the coast increases without limit.

This has an unexpected connection with the data recorded on a GPS unit. Cycle computers typically record position every second. When riding at 36km/h, a record is stored every 10 metres, but at a speed of 18k/h, a recording is made every 5 metres. So riding as a lower speed equates to measuring distances with a shorter ruler. When distance is calculated by triangulating between GPS locations, your riding speed affects the result, particularly when you are going around a sharp corner.

Consider two cyclists riding round a sharp 90-degree bend with a radius of 13m. The arc has a length of 20m, so the GPS has time to make four recordings for the a rider doing 18km/h, but only two recordings for the rider doing 36km/h. The diagram below shows that the faster rider will have a record of position at each red dot, while the slower rider also has a reading for each green dot.  Although the red and green distances match on the straight section, when it comes to the corner the total length of the red line segments is less than the total of the green segments. You can see this jagged effect if you zoom into a corner on the Strava map of your course. Both triangulated distances are shorter than the actual arc ridden.

It is relatively straightforward to show that the triangulation method will underestimate both distance and speed by a factor of 2r/s*sin(s/2r), where r is the radius of the corner in metres and s is speed in m/s. So the estimated length of the 20m arc for the fast rider is 19.4m ridden at a speed of 35.1km/h (2.5% underestimate), while the corresponding figures for the slower rider would be 19.8m at 17.9km/h (0.6% underestimate).

We might ask whether these underestimates are significant, given the error in locating real-time positions using GPS. Over the length of a ride, we should expect GPS errors to average out to approximately zero in all directions. However, triangulation underestimates distance on every corner, so these negative errors accumulate over the ride. Note that when the bike is stationary, any noise in the GPS position adds to the total distance calculated by triangulation. But guess what? This can only happen when you are not moving fast. The case remains that slower riders will show a longer total distance than faster riders.

The simple triangulation method described above does not take account of changes of elevation. This has a relatively small effect, except on the steepest gradients, thus a 10% climb increases in distance by only 0.5%.  In fact, the only reliable way to measure distance that accounts for corners and changes in altitude is to use a correctly-calibrated wheel-based device. Garmin’s GSC-10 speed and cadence monitor tracks the passage of magnets on the wheel and cranks, transmitting to the head unit via ANT+. This gives an accurate measure of ground speed, as long as the correct wheel size is used (and, of course, that changes with the type of tyre, air pressure, rider weight etc.).

According to Strava Support, Garmin uses a hierarchy for determining distance. If you have a PowerTap hub, its distance calculation takes precedence. Next, if you have a GSC-10, its figure is used. Otherwise the GPS positions are used for triangulation. This means that, if you don’t have a PowerTap or a GSC-10 speed/cadence meter, your distance (and speed) measurements will be subject to the distortions described above.

But does this really matter? Well it depends on how “wiggly” a route you are riding. This can be estimated using Richardson’s method. The idea is that you measure the route using different sized rulers and see how much the total distance changes. The rate of change determines the fractal dimension, which we can take as the “wiggliness” of the route.

One way of approximating this method from your GPS data is, firstly, to add up all the distances between consecutive GPS positions,  triangulating latitude and longitude. Then do the same using every other position. Then every fourth position, doubling the gap each time. If you happened to be riding at a constant 36km/h, this equates to measuring distance using a 10m ruler, then a 20m ruler, then a 40m ruler etc..

Using this approach, the fractal dimension of a simple loop around the Surrey countryside is about 1.01, which is not much higher than a straight line of dimension 1. So, with just a few corners, the GPS triangulation error will be low. The Sella Ronda has a fractal dimension of 1.11, reflecting the fact that alpine roads have to follow the naturally fractal-like mountain landscape. Totally contrived routes can be higher, such as this one, with a fractal dimension of 1.34, making GPS triangulation likely to be pretty inaccurate – if you zoom in, lots of corners are cut.

In conclusion, if you ride fast around a wiggly course, your Garmin will experience non-relativistic length contraction. Having GPS does not make your wheel-based speed/cadence monitor redundant.

If you are interested in the code used for this blog, you can find it here.