It was shocking to see footage of Remco Evenepoel’s horrific crash in Il Lombardia. Reports indicate that he broke his pelvis after falling from a bridge into a ravine. This follows the injuries sustained by his Deceuninck-QuickStep team mate Fabio Jakobsen in the Tour of Poland.
The video above shows the repairs to my pelvis carried out by the specialist team at St George’s Hospital. My accident was less spectacular than Remco’s: I just hit a large pothole while riding in the Kent lanes last March. It took the ambulance two and a half hours to arrive, as this was just at the beginning of the COVID-19 crisis. In fact, lockdown was announced on the evening of my crash. There was a lot of uncertainty about the virus back then, so it was a pretty scary time to be in hospital. Nevertheless, I have immense respect and gratitude for the NHS staff who looked after me.
I was given crutches the day after the operation and returned home the day after that, with strict instructions to remain non-weight-bearing on the injured leg for six weeks and then only partial weight-bearing for the next six weeks. An NHS physiotherapist contacted me and regularly provided a progression of exercises. I set myself additional challenges, like doing extra press-ups.
After six weeks of doing no proper exercise, I had lost 4kg. The circumference of my left thigh was 5cm less than the right. However, following a review at the hospital, I was given permission to start gentle exercise on my static turbo trainer. I began by removing the left pedal and performing single leg drills, but after a couple of days it was easier to put my injured leg on the pedal as a passenger. This also gave the hip some mobility.
After a week on the turbo, I was up to one hour a day at about 160 watts. It took a long time to increase this above 200 watts. I watched a lot of old cycling films, without any particular urge to go on Zwift. I started riding outside in mid-June, 12 weeks post op. My Garmin pedals allowed me to monitor the left-right balance as well as average power. The following chart shows that 21 weeks after my accident, balance is hovering around 48:52 and five minute power is back over 300 watts.
The psychological aspect of rehabilitation has been very important. I have focussed on targets and deadlines, marking each little achievement as a milestone. I am now walking without a limp, though running is still off limits. I even went kitesurfing a couple of weeks ago (don’t tell my surgeon about that one). I have been busy learning Italian, composing music and programming in Python.
Since heading back out on the roads, I have been riding cautiously, as my hip will not regain full strength until next spring. I plan to enter a couple of time trials to rekindle a sense of competition, without the danger of riding in a peloton. Racing again next season remains a goal.
Probably the most important mental aspect has been to stay positive at all times and never to spend time feeling sorry for oneself. This has been difficult as, inevitably, there have been a couple of set-backs when progress has seemed to reverse. But on the whole, my recovery has been astounding and, like Chris Froome, I remain optimistic about regaining my peak.
Remco will be back on the road next season, with the potential to pick up some results later in the year.
In previous blogs, I described how mathematical modelling can help understand the spread of the COVID-19 epidemic and provide privacy-preserving contact tracing. Looking forward at how the world will have to deal with COVID-19 in the coming months, it is likely that a significant percentage of the population will need to be tested multiple times. In a recent BBC science podcast, Neil Turok, Leon Mutesa and Wilfred Ndifo describe their highly efficient method of implementing large-scale testing that takes advantage of pooling samples. This is helping African governments save millions on the cost of testing. I offer an outline of their innovative approach, which is described in more detail in a paper published on arxiv.org.
The need for large-scale testing
The roll-out of antigen testing in some countries, like the US and the UK, has been painfully slow. Some suggest that the US may need to carry out between 400,000 and 900,000 tests a day in order to get a grip on the epidemic. When antigen tests cost 30-50 US dollars (or 24-40 UK pounds), this could be very expensive. However, as long as a relatively small percentage of the population is infected, running a separate test for everyone would be extremely inefficient compared with approaches that pool samples.
Pooling offers a huge advantage, because a negative test for a pooled sample of 100 swabs would clear 100 people with a single test. The optimal size of the pools depends on the level of incidence of the disease: larger pools can be used for lower incidence.
The concept of pooling dates back to the work of Dorfman in 1943. His method was to choose an optimal pool size and perform a test on each pooled sample. A negative result for a pool clears all the samples contained in it. Then the infected individuals are found by testing every sample in the positive pools. Mutesa and Ndifo’s hypercube method is more efficient, because, rather than testing everyone in an infected pool, you test carefully-selected sub-pools.
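As a rough illustration of how the optimal pool size depends on incidence, the sketch below evaluates the standard expected-tests formula for Dorfman's two-stage scheme. It assumes a perfectly accurate test and independent infections at a given prevalence; the function names are my own and do not come from the paper.

```python
def dorfman_tests_per_person(pool_size, prevalence):
    """Expected tests per person: one pooled test shared by the whole pool,
    plus an individual retest for everyone in a pool that tests positive."""
    p_pool_positive = 1 - (1 - prevalence) ** pool_size
    return 1 / pool_size + p_pool_positive


def best_pool_size(prevalence, max_size=100):
    """Pool size (2..max_size) that minimises the expected tests per person."""
    return min(range(2, max_size + 1),
               key=lambda n: dorfman_tests_per_person(n, prevalence))


# Illustrative prevalence of 1%, the same level used later in the post.
best_n = best_pool_size(0.01)
tests_per_person = dorfman_tests_per_person(best_n, 0.01)
print(best_n, round(tests_per_person, 3))
```

Under these assumptions, a 1% incidence favours pools of around 11 samples and needs roughly one test for every five people, while lower incidence pushes the optimal pool size up.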
The idea is to imagine that all the samples in a pool lie on a multidimensional lattice in the form of a hypercube. It turns out that the optimal number of points in each direction is 3. Obviously it is hard to visualise high dimensions, but in 3-D, you have 27 samples arranged on a 3x3x3 grid forming a cube. The trick to identifying individual infected samples is to create sub-pools by taking slices through the lattice. In the diagram above, there are 3 red slices, 3 green and 3 blue, each containing 9 samples.
Consider, for simplicity, only one infected person out of the 27. Testing the 9 pools represented by the coloured slices will result in exactly 3 positive results, representing the intersection of the three planes passing through the infected sample. This uniquely identifies the positive individual with just 9 tests, whereas Dorfman would have set out to test all 27, finding the positive, on average, after doing about half of these.
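The single-positive case can be sketched in a few lines. This is only a toy version of the idea, not the code from the paper: it assumes exactly one infected sample and a perfect test, and the names `slices` and `find_single_positive` are hypothetical.

```python
from itertools import product

SIDE = 3  # points in each direction
DIM = 3   # dimensions, giving 3**3 = 27 samples


def slices(dim=DIM, side=SIDE):
    """Yield (axis, value, members) for every slice through the lattice."""
    points = list(product(range(side), repeat=dim))
    for axis in range(dim):
        for value in range(side):
            yield axis, value, [p for p in points if p[axis] == value]


def find_single_positive(infected, dim=DIM, side=SIDE):
    """Locate one infected sample from the pooled slice tests alone."""
    coords = [None] * dim
    tests = 0
    for axis, value, members in slices(dim, side):
        tests += 1
        if infected in members:  # simulated pooled test on this slice
            coords[axis] = value
    return tuple(coords), tests


located, tests_used = find_single_positive((2, 0, 1))
print(located, tests_used)
```

Each axis contributes exactly one positive slice, so the three positive results read off the three coordinates of the infected sample.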
Slicing a hypercube
Although you can optimise the pool size to ensure that the expected number of positives in any pool is manageable, in practice you won’t know how many infected samples are contained in any particular pool. The hypercube method deals with this by noting that a slice through a D-dimensional hypercube is itself a hypercube of dimension D-1, so the method can be applied recursively.
The other big advantage is that the approach is massively parallel, allowing positives to be identified quickly, relative to the speed of spread of the pandemic. About 3 rounds of PCR tests can be completed in a day. Algorithms that further reduce the total number of tests towards the information theoretical limit, such as binary search, require tests to be performed sequentially, which takes longer than doing more tests in parallel.
In order to make sure I really understood what is going on, I wrote some Python code to implement and validate the hypercube algorithm. In principle, it was extremely simple, but dealing with low probability edge cases, where multiple positive samples happen to fall into the same slice, turned out to be a bit messy. However, in simulations, all infected samples were identified with no false positives or false negatives. The number of tests was very much in line with the theoretical value.
Huge cost savings
My Python program estimates the cost savings of implementing the hypercube algorithm versus testing every sample individually. The bottom line is that if the US government needed to test 900,000 people and the background level of infection is 1%, the algorithm would find all infected individuals with around 110,000 tests or 12% of the total samples. At $40 a test, this would be a cost saving of over $30 million per day versus testing everyone individually. Equivalent calculations for the UK government to test 200,000 people would offer savings of around £5 million a day.
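The US figure can be checked with a couple of lines of arithmetic. This is not the simulation itself, just the headline numbers quoted above plugged into a hypothetical helper.

```python
def daily_saving(people, tests_needed, cost_per_test):
    """Money saved per day by running tests_needed pooled tests
    instead of one individual test per person."""
    return (people - tests_needed) * cost_per_test


us_saving = daily_saving(people=900_000, tests_needed=110_000, cost_per_test=40)
us_fraction = 110_000 / 900_000
print(us_saving, round(us_fraction, 3))
```

This comes to $31.6 million a day, with the pooled tests amounting to about 12% of the number of samples.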
It is great to see leading edge science being developed in Africa. Cost conscious governments, for example in Rwanda, are implementing the strategy. Western governments lag behind, delayed by anecdotal comments from UK officials who worry that the approach is “too mathematical”, as if this is somehow a vice rather than a virtue.
Suppose you are in a Zwift race that comes down to a sprint finish. How long does it take for your avatar to respond to your heroic effort in the final dash for the line? Could a time lag cost you the race?
Consider the steps involved. First the ANT+ signal travels from your power meter to your device (i.e. computer or phone) then it goes to your router and on to Zwift’s server somewhere on the cloud. At some point your watts per kilo are converted into a velocity, taking account of your previous speed, the gradient, rolling resistance, drafting and any PowerUps in play. This calculation can be performed pretty much instantaneously compared with signal transmission time.
The ANT+ signal travels at the speed of light to your device, which is likely to be very close by, so there is little to be gained as long as there is a clear line of sight. The next step, to the router, can be slower, especially if you are relying on a wireless signal from your garage, while running a raft of other applications on your device (best to shut these down). Serious e-gamers often use a direct wired link to the router. It also helps if you have a super-fast high bandwidth internet connection. However, the time taken for the signal to travel from your router to Zwift’s gaming server, called latency, typically introduces the longest delay, especially if it has to go halfway around the world.
We don’t know the precise location of Zwift’s server, but let’s suppose it is in San Francisco. You can check the latency from your location to other parts of the world on web sites like this one. When I looked, the latency from London to San Francisco was 136ms (milliseconds) and from Cape Town it was 281ms.
In the past, banks have moved their trading desks as close as possible to exchanges, in order to obtain prices nanoseconds earlier than their rivals. As a general rule for interactive online gaming, you need a latency of less than 100ms for acceptable gameplay and over 150ms can become frustrating. But we are not talking about playing DOTA, so how do these figures apply to Zwift?
Zwift not DOTA
Let’s go back to our sprint finish, where the bunch is riding at 60kph. This equates to 16.7 metres per second, which is just a bit less than one bike length every 100ms. However, your ability to overtake your rival depends on your relative speed, not the absolute figure. Imagine a situation where you make a Herculean effort to increase your speed to 18 metres per second (64.8kph), drawing level with the leader’s rear wheel with 30 metres to go. To win the race, you have to make up a bike length, say 1.8m, travelling at a measly 1.3m/s faster than the leader. Who will cross the line first?
If you have 30m to go and the leader is a bike length ahead, he only has 28.2m left, taking 1.69 seconds. But at your higher speed you will cover 30m in 1.67 seconds, so you win by about half a wheel. However, if your avatar had responded to your acceleration with a 100ms lag, you would certainly have lost the race. If you experience this level of latency, a slower rider could beat you, just because he is located closer to the gaming server. The speed of your avatar really is limited by the speed of light.
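The sprint arithmetic is easy to reproduce. The sketch below uses the figures from the example; treating lag as a pure delay added to your finishing time is my simplifying assumption, and the function name is hypothetical.

```python
def finish_times(leader_speed=16.7, chaser_speed=18.0, distance=30.0, gap=1.8):
    """Time in seconds for each rider to reach the line.

    The leader starts one bike length (gap, in metres) ahead of the chaser,
    who has `distance` metres to go.
    """
    leader_time = (distance - gap) / leader_speed
    chaser_time = distance / chaser_speed
    return leader_time, chaser_time


leader_time, chaser_time = finish_times()
margin_seconds = leader_time - chaser_time       # how much earlier you arrive
margin_metres = margin_seconds * 18.0            # the same margin as a distance
print(round(leader_time, 2), round(chaser_time, 2), round(margin_seconds, 3))
```

The winning margin is only a few hundredths of a second, so under this assumption any delay longer than the margin is enough to hand the win to the other rider.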
However, sometimes it can feel like a zPower rider is overtaking you at an appreciable proportion of the speed of light. If this really were the case and Zwift wanted to represent the avatar correctly, what would it look like?
The physicist George Gamow posed this question back in 1938. He highlighted the effect of relativistic length contraction, predicted by Einstein’s theory of special relativity. In fact, the avatar would change colour, due to the Doppler shift, and light intensity would fluctuate. These effects would be further complicated by our binocular vision, causing an unnerving blurring effect. This is helpfully explained in detail by physicists in a recent scientific paper. Surprisingly, there are practical applications for this work that may help interpret data gathered by spacecraft passing objects at very high speeds.
The Covid-19 epidemic provided a huge boost to the Zwift streaming service. Confined by a global lockdown, cyclists freed themselves from the boredom of pedalling on a static turbo trainer by logging into one of a broadening range of online virtual worlds. Zwift racing has become particularly popular. While it is relatively straightforward to simulate variations in gradient and even the effects of drafting, it is not possible for riders to demonstrate superior bike handling skills. Nor can racers benefit from adopting a superior aerodynamic position on the bike, in fact this may prove to be a disadvantage.
Setting aside e-doping suspicions, such as riders understating their weights, in the artificial world of a Zwift race, the outcome largely comes down to the ability to sustain a high level of power (watts per kilo). The engagingly competitive nature of simulated races encourages everyone to push their limits. However, since Zwift offers no penalty against maintaining a non-aerodynamic body position on your trainer, it is quite possible that regular Zwifters might become habituated to riding in a position that is far from optimal for the road.
Once out in the fresh air again, many riders may have noticed improvements in the levels of power they are able to sustain, thanks to the high levels of exertion required to compete on Zwift. But in the real world, when it comes to beating other riders in a race or a time trial, the principal force a rider has to overcome is aerodynamic drag, not electromagnetic resistance.
Maximum speed is attained by adopting a riding position that provides the optimal tradeoff between the ability to generate power and a low level of aerodynamic drag. Drag depends on a rider’s CdA, which represents the drag coefficient multiplied by frontal area. Since power rises with the cube of velocity, there comes a point where it is better to compromise on power in order to reduce frontal area. This is the key to time trialing and successful breakaways.
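To make the trade-off concrete, here is a minimal sketch of the cubic relationship between power and speed. It considers aerodynamic drag only, ignoring rolling resistance, gradient and drivetrain losses, and the air density, CdA and power figures are illustrative assumptions rather than measurements.

```python
AIR_DENSITY = 1.225  # kg/m^3, assumed sea-level value


def drag_power(speed, cda, rho=AIR_DENSITY):
    """Power in watts needed to overcome aerodynamic drag at speed (m/s)."""
    return 0.5 * rho * cda * speed ** 3


def speed_at_power(power, cda, rho=AIR_DENSITY):
    """Speed in m/s sustained by a given power when drag is the only resistance."""
    return (2 * power / (rho * cda)) ** (1 / 3)


# Hypothetical rider: 300 W sitting up, or 285 W in a lower position.
upright = speed_at_power(300, cda=0.32)
tucked = speed_at_power(285, cda=0.28)
print(round(upright * 3.6, 1), round(tucked * 3.6, 1))  # km/h
```

With these made-up numbers, giving up 15 watts to lower CdA from 0.32 to 0.28 still produces the higher speed, which is the compromise described above.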
When the race season begins, skilful and more aerodynamic racers will be able to benefit from drafting in the huge wind shadow created by Zwift diesels, while offering back much less assistance when they pull through. So after prolonged training on Zwift, racers and time trialists really need to focus on improving their aerodynamics.
There are various ways to reduce drag, starting with some basics as described in an earlier blog. Post ride analysis can be performed using Golden Cheetah, BestBikeSplit or MyWindSock. There is also a range of devices that claim to offer real time measurement of CdA. These have been primarily targeted at the TT/triathlon market, but there’s no doubt that these could be incredibly useful for training and even, perhaps, a race breakaway. Cycling Weekly recently reviewed the Notio device, but, while useful, these tools remain expensive and a bit clunky.
Whatever you choose to do, stay safe and stay aero.
As the initial global wave of COVID-19 infections is brought under control, the world is moving into a phase of extensive testing, tracking and tracing, until a vaccine can be found. The preservation of personal privacy must be paramount in these initiatives.
The UK government’s target of performing 100,000 tests a day by the end of April 2020 provided a fine example of Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure”. One tragic consequence was the willingness, even encouragement, to define just about anything as a “completed test”, including the action of simply dispatching a kit by post. This has discouraged distinguishing between different types of test: antigen or antibody, nasal swab or blood test, pin-prick or venous sample, laboratory analysis or on-the-spot result.
For those who suspect they might have been exposed to COVID-19, an antibody test is the most useful. Although there has not been time to gather sufficient information to be absolutely sure, the presence of antibodies in the blood should indicate immunity from infection, at least in the short term, unless the virus mutates sufficiently to bypass the immune response. Private tests are available from providers, such as Forth, where reliable results of IgG antibodies are provided by laboratory tests performed using the Abbott Architect method.
A second area where the UK government seems to be going wrong is in hiring thousands of people to carry out intrusive tracking and tracing. Not only is this hugely inefficient, it is also a massive unnecessary invasion of personal privacy. That a data leak occurred before it even started hardly inspires confidence.
Privacy Preserving Contact Tracing
A team of epidemiologists and cryptographers called DP-3T has released open source software that makes use of Bluetooth messages exchanged between mobile phones to track and trace COVID-19 infections entirely anonymously. It does not require users to surrender any personal information or location data. The approach is the basis for the technology announced jointly by Apple and Google.
The method is explained very nicely in this video by 3Blue1Brown or in comic form by Nicky Case. This is a summary of how it works. Once you download a privacy preserving app onto your phone, it transmits random numbers over Bluetooth, at regular time intervals, and simultaneously listens for the random numbers of other users. Since the numbers are random, they contain no information about you. Your phone locally maintains a list of your transmitted random numbers. It also stores locally a list of all numbers received, possibly including a timestamp and the Bluetooth signal strength, which gives some information about the proximity of the other user. Items older than, say, 14 days can be deleted from both lists.
If a person falls ill and tests positive for COVID-19 antigens, that person can voluntarily, with the permission of a healthcare professional, anonymously upload the list of transmitted random numbers to a central database. The phone app of every user periodically checks this database against its local list of received messages. If a match is detected, the app can identify the date, time and duration of contact, along with an estimate of proximity. This allows the app to advise a user to “self-isolate” for an appropriate period. This matching can all be done locally on the phone.
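The exchange described above can be sketched as a toy simulation. This follows only the summary given here, with days as the unit of time; the real DP-3T design is more elaborate, and the `Phone` class and its methods are hypothetical names, not part of any released app.

```python
import secrets

RETENTION_DAYS = 14  # "say, 14 days" in the description above


class Phone:
    def __init__(self):
        self.sent = []   # (day, token) pairs this phone has broadcast
        self.heard = []  # (day, token) pairs received from nearby phones

    def broadcast(self, day):
        token = secrets.token_bytes(16)  # random, so it says nothing about the owner
        self.sent.append((day, token))
        return token

    def hear(self, day, token):
        self.heard.append((day, token))

    def purge(self, today):
        """Delete entries older than the retention period from both lists."""
        cutoff = today - RETENTION_DAYS
        self.sent = [(d, t) for d, t in self.sent if d >= cutoff]
        self.heard = [(d, t) for d, t in self.heard if d >= cutoff]

    def upload(self):
        """Tokens a user who tested positive would voluntarily publish."""
        return {t for _, t in self.sent}

    def exposure_days(self, published):
        """Match the published tokens against the local list, on the phone."""
        return sorted({d for d, t in self.heard if t in published})


alice, bob, carol = Phone(), Phone(), Phone()
# Day 3: Alice and Bob are near each other and swap tokens; Carol is elsewhere.
bob.hear(3, alice.broadcast(3))
alice.hear(3, bob.broadcast(3))
carol.broadcast(3)
# Later, Alice tests positive and uploads her transmitted tokens.
database = alice.upload()
print(bob.exposure_days(database), carol.exposure_days(database))
```

The central database only ever holds random tokens, and the matching happens on each phone, so nothing in this flow identifies Alice, Bob or where they met.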
If set up appropriately, neither Google nor Apple nor any government body would be able to identify any particular individual. Privacy is preserved. No human trackers or tracers are required. No ankle bracelets or police guards are necessary. The system is entirely voluntary, but if sufficient users join up, say, 60% of those susceptible, it can still have a significant impact in controlling the spread of the virus. This is the correct way forward for a free and democratic society.
In a fascinating white paper, Bert Blocken, Professor of Civil Engineering at Eindhoven University of Technology, comments on social distancing when applied to walking, running or cycling. His point is that the government recommendations to maintain a distance of 1.5 or 2 metres assume people are standing still indoors or outdoors in calm weather. However, when a person is moving, the majority of particulate droplets are swept along in a trailing slipstream.
Cyclists typically prefer to ride closely behind each other, in order to benefit from the aerodynamic drafting effect. Cycling is currently a permitted form of exercise in the UK, though only if riding alone or with members of your household. Nevertheless, there may be times when you find yourself catching up with a cyclist ahead. In this situation, you should avoid the habitual tendency to move up into the slipstream of the rider in front.
Professor Blocken’s team has performed computational fluid dynamics (CFD) simulations showing the likely spread of micro-droplets behind people moving at different speeds. As the cloud of particles, produced when someone coughs or sneezes, is swept into the slipstream, the heavier droplets, shown in red in the diagram above, fall faster. These are generally thought to be considerably more contagious. You can see that they can land on the hands and body of the following athlete.
Based on the results, Blocken advises keeping a distance of at least four to five metres behind the leading person while walking in the slipstream, ten metres when running or cycling slowly and at least twenty metres when cycling fast.
Social Distancing v2.0
The recommendation, for overtaking other cyclists, is to start moving into a staggered position some twenty metres behind the rider in front, consistently avoiding the slipstream as you pass.
The results will be reported in a forthcoming peer-reviewed publication. But given the importance of the topic, I recommend that you take a look at the highly accessible three page white paper available here.
As a growing number of people seek to educate themselves on coronavirus COVID-19, while confined to their homes, a better understanding can be gained by taking a look at how to model an epidemic.
Researchers have created highly complex models of the spread of infections. For example, BlueDot’s disease-tracking model, described in this podcast, monitors the Internet with AI language translators and evaluates the network effects of transmission based on air travel itineraries. However, a surprising amount of insight can be gained from a very simple approach called the SIR model.
The SIR model divides the population into three classes. The susceptible class (S) includes everyone who can catch the infection. In the case of a novel virus like corona, it seems that the entire global population was initially susceptible. The infected class (I) includes all those currently infected and able to transmit the virus to susceptible people. The removed class (R) includes everyone who has recovered from the virus or, unfortunately, died. In the model, these people no longer transmit the disease nor are they susceptible. The idea is that people move from the susceptible class to the infected class to the removed class.
Although there is much focus in the media on the exponential rise of the total number of cases of coronavirus, this figure includes recoveries and deaths. In one sense this is a huge underestimate, because the figures only include people who have taken a test and returned a positive result. As explained by Tomas Pueyo, many people do not display symptoms until around 5 days after infection and for over 90% these symptoms are mild, so there could be ten times more people infected than the official figures suggest. In another sense, the figures are a huge exaggeration, because people who have recovered are unlikely to be infectious, because their immune systems have fought off the virus.
The SIR model measures the number of infectious people. On the worldometers site these are called “active cases”. The critical insight of the SIR model, shown in the diagram above, is that the class of infected people grows if the daily number of new cases exceeds the number of closed cases.
Closed cases – removal rate
In a real epidemic, experts don’t really know how many people are infected, but they can keep track of those who have died or recovered. So it is best to start by considering the rate of transfer from infected to removed. After some digging around, it appears that the average duration of an infection is about 2 weeks. So in a steady state situation, about one person in 14, or 7% of those infected, would recover every day. The percentage would be a bit lower if the epidemic is spreading fast, because there would be more people who have recently acquired the virus, so let’s call it 6%. For the death rate, we need the number of deaths divided by the (unknown) total number of people infected. This is likely to be lower than the “case fatality rate” reported on worldometers because that divides the number of deaths only by the number of positive tests. The death rate is estimated to be 2-3%. If we add 3% to the 6% of those recovering, the removal rate (call it “a”) is estimated to be 9%.
In the absence of a cure or treatment for the virus, it is unlikely that the duration of infectiousness can be reduced. As long as hospitals are not overwhelmed, those who might otherwise have died may be saved. However, there is not much that governments or populations can do to speed up the daily rate of “closed cases”. The only levers available are those to reduce the number of “new cases” below 9%. This appeared to occur as a result of the draconian actions taken in China in the second half of February, but the sharp increase in new cases that became apparent over the weekend of 6/7 March spooked the financial markets.
New cases – infection rate
In the SIR model, the number of new infections depends on three factors: the number of infectious people, the number of susceptible people and something called the infection rate (r), which measures the probability that an infected person passes on the virus to a susceptible person either through direct contact or indirectly, for example, by contaminating a surface, such as a door handle.
Governments can attempt to reduce the number of new infections by controlling each of the three driving factors. Clearly, hand washing and avoiding physical contact can reduce the infection rate. Similarly infected people are encouraged to isolate themselves in order to reduce the proportion of susceptible people exposed to direct or indirect contact. Guidance to UK general practitioners is to advise patients suffering from mild symptoms to stay at home and call 111, while those with serious symptoms should call 999.
When will it end?
As more people become infected, the number of susceptible people naturally falls. Until there is a vaccine, there is nothing governments can do to speed up this decline. Eventually, when enough people have caught the current strain of the virus, so-called “herd immunity” will prevail. It is not necessary for everyone to have come into contact with the virus, rather it is sufficient for the number of susceptible people to be smaller than a critical value. Beyond this point infected people, on average, recover before they encounter a susceptible person. This is how the epidemic will finally start to die down.
When people refer to the transmission rate or reproduction rate of a virus, they mean the number of secondary infections produced by a primary infection across a susceptible population. This is equal to the number of susceptible people times the infection rate divided by the removal rate. This determines the threshold number of susceptible people below which the number of infections falls. The critical value is equal to the removal rate relative to the infection rate (a/r). When the number of susceptible people falls to this critical value, the number of infected people will reach a peak and subsequently decline. More susceptible people will still continue to be infected, but at a decreasing rate, until the infection dies out completely, by which time a significant part of the population will have been infected.
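A minimal daily-step simulation shows the threshold at work. The sketch measures S, I and R as fractions of the population and uses a = 0.09 from the removal-rate estimate; r = 0.15 and the initial infected fraction are assumptions chosen for illustration, and the function name is my own.

```python
def simulate_sir(r=0.15, a=0.09, i0=1e-4, days=1000):
    """Step the SIR model one day at a time; returns a list of (S, I, R)."""
    s, i, removed = 1.0 - i0, i0, 0.0
    history = [(s, i, removed)]
    for _ in range(days):
        new_cases = r * s * i   # susceptible -> infected
        closed_cases = a * i    # infected -> removed
        s = s - new_cases
        i = i + new_cases - closed_cases
        removed = removed + closed_cases
        history.append((s, i, removed))
    return history


history = simulate_sir()
infected = [i for _, i, _ in history]
peak_day = infected.index(max(infected))
s_at_peak = history[peak_day][0]
print(peak_day, round(s_at_peak, 3), round(0.09 / 0.15, 3))
```

The infected class stops growing on the first day that the susceptible fraction has dropped to a/r or below, which is 0.6 with these parameters.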
This looks very scary indeed
Running the figures through the SIR model produces some extremely scary predictions. At the time of writing, new cases of infection were rising at a rate of about 15%. At this rate, the virus could spread to 2/3 of the population before it dies out. If three percent of those infected die, the virus would kill 2 percent of the population. Based on results so far these would be largely elderly people or those suffering from complications, so it is extremely important that they are protected from infection. If the virus continues to run out of control, the number of deaths could run into the millions before the epidemic ends.
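These figures can be roughly reproduced from the classic final-size relation for the SIR model, z = 1 - exp(-R0 z), where z is the fraction of the population eventually infected. The post does not state this relation, so treat the sketch as my reading of the numbers: it assumes the 15% daily rate of new cases plays the role of the infection rate at the start of the epidemic, giving a reproduction number of 0.15/0.09.

```python
import math


def final_size(r0, tol=1e-12):
    """Solve z = 1 - exp(-r0 * z) by fixed-point iteration (intended for r0 > 1)."""
    z = 0.5
    while True:
        z_next = 1 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next


reproduction_number = 0.15 / 0.09
fraction_infected = final_size(reproduction_number)
fraction_killed = 0.03 * fraction_infected  # three percent of those infected
print(round(fraction_infected, 2), round(fraction_killed, 3))
```

Under these assumptions about two thirds of the population is eventually infected and about 2% dies, in line with the figures quoted above.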
It is absolutely essential to reduce the infection rate and keep it low, particularly among the elderly and vulnerable groups.
We should watch China carefully for new cases. If none arise, it suggests that a large proportion of the population gained immunity through infection, even though lower numbers of infections were reported. However, if the imposition of constraints temporarily reduced the infection rate, leaving a large susceptible pool still vulnerable, the epidemic could re-emerge once the constraints are relaxed.
What to do?
The imposition of government restrictions on travel and large gatherings is forcing people to rethink their options. Where possible, office workers, university students, schoolchildren and sportsmen may find themselves congregating online in virtual environments rather than in the messy and dangerous real world.
Among cyclists, this ought to be good news for Zwift and other online platforms. Zwift seems to be particularly well-positioned, due to its strong social aspect, which allows riders to meet and race against each other in virtual races. It also has the potential to allow world tour teams to compete in virtual races.
In fact the ability to meet friends and do things together virtually would have applications across all walks of life. Sports fans need something between the stadium and the television. Businesses need a medium that fills the gap between a physical office and a conference call. Schools and universities require better ways to ensure that students can learn, while classrooms and lecture theatres are closed. These innovations may turn out to be attractive long after the coronavirus scare has subsided.
While the prospect of going to the pub or a crowded nightclub loses its appeal, cycling offers the average person a very attractive alternative way to meet friends while avoiding close proximity with large groups of people. As the weather improves, the chance to enjoy some exercise in the fresh air looks ever more enticing.
For those of us who cannot imagine going cycling without recording every ride on a GPS device, there is no reason not to do the same when we go skiing. Although there are specific skiing apps that you can download onto your phone, they increase the drain on the battery and use up your roaming data allowance. A simple alternative is simply to put your Garmin in your pocket for the day.
It is worthwhile creating a specific activity profile for skiing. On my Garmin 520, this is an option on the setting menu. The interesting data fields are elevation, total ascent, total descent, distance and max speed. When you upload the file onto Strava, you have a nice map of the day’s mountainous activities. Make sure you set the sport as “Alpine Ski”, otherwise your rides up the chairlifts end up smashing loads of KOMs set by summer mountain bikers.
When I went heliskiing in Canada, I was charged according to the vertical distance covered. One might hope to do at least 100k feet or 30 vertical kilometres in a week. It turns out that, with the multiple, fast-moving lifts and prepared pistes of a modern ski resort, you can expect to cover far higher total ascents/descents or “dénivelé” as they say in French.
If you keep moving, you can also cover remarkably long distances in a day of alpine skiing. The large interconnected European resorts like Val d’Isere/Tignes, Les Trois Vallées, Portes du Soleil and Zermatt/Cervinia provide extensive ski areas, where it can be hard to reach both edges of the piste map in a day.
Skiing speed is an interesting statistic. Having a background in slalom racing and freestyle skiing, I don’t hang around. It turns out that my descending speeds tend to be remarkably similar to cycling, averaging 35-40kph on a typical long run, but top speeds are definitely higher on skis.
A decent day’s skiing
A few years ago, I came up with three criteria for a decent day’s skiing, based on total vertical metres, distance covered and maximum speed. A reasonable total descent is 10,000m. You also need to cover a total distance of 100km. But the most challenging and dangerous part is to include a maximum speed of 100kph, which should only be attempted by expert skiers, at their own risk and without endangering other people.
The image at the top of this page shows a day when I comfortably achieved all three targets. It was surprising to note that, over an 8 1/2 hour day, I was only moving for 4 3/4 hours and that included riding up the lifts.
Of course, when there is a fresh fall of snow like we had at Christmas in the Alps, you can forget all that and just head off piste.
Since my blog about Strava Fitness and Freshness has been very popular, I thought it would be interesting to demonstrate a simple model that can help you use these metrics to improve your cycling performance.
As a quick reminder, Strava’s Fitness measure is an exponentially weighted average of your daily Training Load, over the last six weeks or so. Assuming you are using a power meter, it is important to use a correctly calibrated estimate of your Functional Threshold Power (FTP) to obtain an accurate value for the Training Load of each ride. This ensures that a maximal-effort one hour ride gives a value of 100. The exponential weighting means that the benefit of a training ride decays over time, so a hard ride last week has less impact on today’s Fitness than a hard ride yesterday. In fact, if you do nothing, Fitness decays at a rate of about 2.5% per day.
Although Fitness is a time-weighted average, a simple rule of thumb is that your Fitness Score equates to your average daily Training Load over the last month or so. For example, a Fitness level of 50 is consistent with an average daily Training Load (including rest days) of 50. It may be easier to think of this in terms of a total Training Load of 350 per week, which might include a longer ride of 150, a medium ride of 100 and a couple of shorter rides with a Training Load of 50.
How to get fitter
The way to get fitter is to increase your Training Load. This can be achieved by riding at a higher intensity, increasing the duration of rides or including extra rides. But this needs to be done in a structured way in order to be effective. Periodisation is an approach that has been tried and tested over the years. A four-week cycle would typically include three weekly blocks of higher training load, followed by an easier week of recovery. Strava’s Fitness score provides a measure of your progress.
Modelling Fitness and Fatigue
An exponentially weighted moving average is very easy to model, because it evolves like a Markov Process, having the following property, relating to yesterday’s value and today’s Training Load.
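$$X_t = \lambda X_{t-1} + (1 - \lambda)\,\text{TrainingLoad}_t$$
where $X_t$ is Fitness or Fatigue on day $t$ and $\lambda = e^{-1/42}$ for Fitness or $\lambda = e^{-1/7}$ for Fatigue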
This is why your Fitness falls by about 2.4% and your Fatigue eases by about 13.3% after a rest day. The formula makes it straightforward to predict the impact of a training plan stretching out into the future. It is also possible to determine what Training Load is required to achieve a target level of Fitness improvement over a specific time period.
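The daily update can be sketched in a few lines of Python. This is only an illustration, not Strava's own code. The decay constants are assumptions, based on time scales of 42 and 7 days, chosen because they reproduce the rest-day decay rates of about 2.4% and 13.3% quoted above. The function names are mine.

```python
import math

# Assumed decay constants: 42-day and 7-day time scales, chosen because they
# reproduce the daily decay rates of about 2.4% and 13.3% quoted in the text.
FITNESS_DECAY = math.exp(-1 / 42)
FATIGUE_DECAY = math.exp(-1 / 7)


def next_value(previous, training_load, decay):
    """One daily step of the exponentially weighted moving average."""
    return decay * previous + (1 - decay) * training_load


def project(start, daily_loads, decay):
    """Apply the daily update for each Training Load in a plan."""
    values = []
    current = start
    for load in daily_loads:
        current = next_value(current, load, decay)
        values.append(current)
    return values


# A rest day (Training Load of zero), starting from Fitness and Fatigue of 50.
fitness_after_rest = next_value(50, 0, FITNESS_DECAY)
fatigue_after_rest = next_value(50, 0, FATIGUE_DECAY)
print(f"Fitness falls by {100 * (1 - fitness_after_rest / 50):.1f}%")
print(f"Fatigue eases by {100 * (1 - fatigue_after_rest / 50):.1f}%")
```

Passing a list of planned daily Training Loads to `project` gives the projected Fitness or Fatigue for each day of the plan.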
Ramping up your Fitness
The change in Fitness over the next seven days is called a weekly “ramp”. Aiming for a weekly ramp of 5 would be very ambitious. It turns out that you would need to increase your daily Training Load by 33. That is a substantial extra Training Load of 231 over the next week, particularly because Training Load automatically takes account of a rider’s FTP.
Interestingly, this increase in Training Load is the same, regardless of your starting Fitness. However, stepping up an average Training Load from 30 to 63 per day would require a doubling of work done over the next week, whereas for someone starting at 60, moving up to 93 per day would require a 54% increase in effort for the week.
In both cases, a cyclist would typically require two additional hard training rides, resulting in an accumulation of fatigue, which is picked up by Strava’s Fatigue score. This is a much shorter term moving average of your recent Training Load, over the last week or so. If we assume that you start with a Fatigue score equal to your Fitness score, an increase of 33 in daily Training Load would cause your Fatigue to rise by 21 over the week. If you managed to sustain this over the week, your Form (Fitness minus Fatigue) would fall from zero to -16. Here’s a summary of all the numbers mentioned so far.
Whilst it might be possible to do this for a week, the regime would be very hard to sustain over a three-week block, particularly because you would be going into the second week with significant accumulated fatigue. Training sessions and race performance tend to be compromised when Form drops below -20. Furthermore, if you have increased your Fitness by 5 over a week, you will need to increase Training Load by another 231 for the following week to continue the same upward trajectory, then increase again for the third week. So we conclude that a weekly ramp of 5 is not sustainable over three weeks. Something of the order of 2 or 3 may be more reasonable.
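For anyone who wants to check the figures for a weekly ramp of 5, here is a rough sketch. It assumes the same Training Load on every day of the week, a starting Form of zero and decay time scales of 42 days for Fitness and 7 days for Fatigue, which match the decay rates quoted earlier. The function names are purely illustrative.

```python
import math

# Assumed time scales: 42 days for Fitness and 7 days for Fatigue.
FITNESS_DECAY = math.exp(-1 / 42)
FATIGUE_DECAY = math.exp(-1 / 7)


def extra_daily_load(ramp, days=7):
    """Extra daily Training Load, above current Fitness, needed for a ramp.

    Holding a constant load of (Fitness + extra) for `days` days lifts Fitness
    by extra * (1 - FITNESS_DECAY ** days), whatever the starting Fitness.
    """
    return ramp / (1 - FITNESS_DECAY ** days)


def fatigue_rise(extra, days=7):
    """Rise in Fatigue after `days` at the extra load, from a Form of zero."""
    return extra * (1 - FATIGUE_DECAY ** days)


extra = extra_daily_load(5)
rise = fatigue_rise(extra)
print(f"Extra daily Training Load: {extra:.1f}")
print(f"Extra weekly Training Load: {7 * extra:.0f}")
print(f"Rise in Fatigue: {rise:.1f}")
print(f"Form after one week: {5 - rise:.1f}")
```

The unrounded extra load is a little under 33 per day, so the weekly total comes out slightly below the 231 obtained by multiplying the rounded figure by seven. Because the starting Fitness never appears in `extra_daily_load`, the required increase is the same for every rider.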
A steady increase in Fitness
Consider a rider with a Fitness level of 30, who would have a weekly Training Load of around 210 (7 times 30). This might be five weekly commutes and a longer ride on the weekend. A periodised monthly plan could include a ramp of 2, steadily increasing Training Load for three weeks followed by a recovery week of -1, as follows.
This gives a net increase in Fitness of 5 over the month. Fatigue has also risen by 5, but since the rider is fitter, Form ends the month at zero, ready to start the next block of training.
To simplify the calculations, I assumed the same Training Load every day in each week. This is unrealistic in practice, because all athletes need a rest day and training needs to mix up the duration and intensity of individual rides. The fine tuning of weekly rides is a subject for another blog.
A tougher training block
A rider engaging in a higher level of training, with a Fitness score of 60, may be able to manage weekly ramps of 3, before the recovery week. The following Training Plan would raise Fitness to 67, with sufficient recovery to bring Form back to positive at the end of the month.
A general plan
The interesting thing about this analysis is that the outcomes of the plans are independent of a rider’s starting Fitness. This is a consequence of the Markov property. So if we describe the ambitious plan as [3,3,3,-2], a rider will see a Fitness improvement of 7, from whatever initial value prevailed: starting at 30, Fitness would go to 37, while the rider starting at 60 would rise to 67.
Similarly, if Form begins at zero, i.e. the starting values of Fitness and Fatigue are equal, then the [3,3,3,-2] plan will always result in a net change of 6 in Fatigue over the four weeks.
In the same way, (assuming initial Form of zero) the moderate plan of [2,2,2,-1] would give any rider a net increase of Fitness and Fatigue of 5.
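The two plans can be reproduced with a short simulation. This is a sketch under the same simplifying assumption of an identical Training Load on every day of each week, with assumed decay time scales of 42 days for Fitness and 7 days for Fatigue. The name `run_plan` is just for illustration.

```python
import math

# Assumed time scales: 42 days for Fitness and 7 days for Fatigue.
FITNESS_DECAY = math.exp(-1 / 42)
FATIGUE_DECAY = math.exp(-1 / 7)


def run_plan(ramps, start_fitness, start_fatigue=None):
    """Simulate weekly ramps with the same Training Load every day of a week.

    Returns (fitness, fatigue, form) at the end of the plan. If no starting
    Fatigue is given, it is set equal to Fitness, so Form starts at zero.
    """
    fitness = start_fitness
    fatigue = start_fitness if start_fatigue is None else start_fatigue
    weekly_gain = 1 - FITNESS_DECAY ** 7
    for ramp in ramps:
        # Constant daily load that moves Fitness by exactly `ramp` in 7 days.
        load = fitness + ramp / weekly_gain
        for _ in range(7):
            fitness = FITNESS_DECAY * fitness + (1 - FITNESS_DECAY) * load
            fatigue = FATIGUE_DECAY * fatigue + (1 - FATIGUE_DECAY) * load
    return fitness, fatigue, fitness - fatigue


for plan in ([2, 2, 2, -1], [3, 3, 3, -2]):
    for start in (30, 60):
        fitness, fatigue, form = run_plan(plan, start)
        print(plan, start, round(fitness - start, 1),
              round(fatigue - start, 1), round(form, 1))
```

For each plan, the printed changes in Fitness and Fatigue are identical for starting values of 30 and 60, which is the independence described above.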
A couple of years ago I built a model to evaluate how Froome and Dumoulin would have matched up, if they had not avoided racing against each other over the 2017 season. As we approach the 2019 World Championships Road Race in Yorkshire, I have adopted a more sophisticated approach to try to predict the winner of the men’s race. The smart money could be going on Sam Bennett.
With only two races outstanding, most of this year’s UCI world tour results are available. I decided to broaden the data set with 2.HC classification European Tour races, such as the OVO Energy Tour of Britain. In order to help with prediction, I included each rider’s weight and height, as well as some meta-data about each race, such as date, distance, average speed, parcours and type (stage, one-day, GC, etc.).
The key question was what exactly to predict. The UCI allocates points for race results, using a non-linear scale. For example, Mathieu Van Der Poel was awarded 500 points for winning Amstel Gold, while Simon Clarke won 400 for coming second and Jakob Fuglsang picked up 325 for third place, continuing down to 3 points for coming 60th. I created a target variable called PosX, defined as a negative exponential of the rider’s position in any race, equating to 1.000 for a win, 0.834 for second, 0.695 for third, decaying down to 0.032 for 20th. This has a similar profile to the points scheme, emphasising the top positions, and handles races with different numbers of riders.
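As an illustration, a target of this kind can be written as a one-line function. The decay scale of 5.5 places is an assumption, inferred only from the values quoted above rather than taken from the model itself, and the names `pos_x` and `DECAY_SCALE` are hypothetical.

```python
import math

# Assumed decay scale: 5.5 places is inferred from the quoted values
# (0.834 for second, 0.695 for third, 0.032 for 20th); it is not stated
# in the text.
DECAY_SCALE = 5.5


def pos_x(position):
    """Negative exponential of finishing position, equal to 1.0 for a win."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return math.exp(-(position - 1) / DECAY_SCALE)


for place in (1, 2, 3, 20):
    print(place, f"{pos_x(place):.3f}")
```

Because the value depends only on finishing position, it can be compared across races with very different field sizes.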
A random forest would be a typical choice of model for this kind of data set, which included a mixture of continuous and categorical variables. However, I opted for a neural network, using embeddings to encode the categorical variables, with two hidden layers of 200 and 100 activations. This was very straightforward using the fast.ai library. Training was completed in a handful of seconds on my MacBook Pro, without needing a GPU.
After some experimentation on a subset of the data, it was clear that the model was coming up with good predictions on the validation set and the out-of-sample test set. With a bit more coding, I set up a procedure to load a start list and the meta-data for a future race, in order to predict the result.
With the final start list for the World Championships Road Race looking reasonably complete, I was able to generate the predicted top 10. The parcours obviously has an important bearing on who wins a race. With around 3600m of climbing, the course was clearly hilly, though not mountainous. Although the finish was slightly uphill, it was not ridiculously steep, so I decided to classify the parcours as rolling with a flat finish.
Mathieu Van Der Poel
Edvald Boasson Hagen
Greg Van Avermaet
It was encouraging to see that the model produced a highly credible list of potential top 10 riders, agreeing with the bookies in rating Mathieu Van Der Poel as the most likely winner. Sagan was ranked slightly below Kristoff and Bennett, who are seen as outsiders by the pundits. The popular choice of Philippe Gilbert did not appear in my top 10 and Alaphilippe was only 9th, in spite of their recent strong performances in the Vuelta and the Tour, respectively. Riders in positions 5 to 10 would all be expected to perform well in the cycling classics, which tend to be long and arduous, like the Yorkshire course.
For me, 25/1 odds on Sam Bennett are attractive. He has a strong group of teammates, in Dan Martin, Eddie Dunbar, Conor Dunne, Ryan Mullen and Rory Townsend, who will work hard to keep him with the lead group in the hillier early part of the race. He will then face an extremely strong Belgian team that is likely to play the same game that Deceuninck-QuickStep successfully pulled off in stage 17 of the Vuelta, won by Gilbert. But Bennett was born in Belgium and he was clearly the best sprinter out in Spain. He should be able to handle the rises near the finish.
A similar case can be made for Kristoff, while Matthews and Van Avermaet both had recent wins in Canada. Nevertheless it is hard to look past the three-times winner Peter Sagan, though if Van Der Poel launches one of his explosive finishes, there is no one to stop him pulling on the rainbow jersey.
After the race, I checked the predicted position of the eventual winner, Mads Pedersen. He was expected to come 74th. The bad weather clearly played a role in the result, favouring the larger riders, who were able to keep warmer. The Dane proved to be the strongest rider on the day.