Large Language Models for Cycling

Amid the fanfare about the extraordinary AI capabilities of large language models (LLMs), the race is on to produce LLM applications that actually do something useful. Several hurdles obstruct the way forward, including confabulation, privacy and limitations of training data. This article explains how Retrieval-Augmented Generation can address some of these issues and help find out some useful information about cycling.

Confabulation

Since they are trained only to predict the next word in a sentence, LLMs have no understanding of the text they generate. They tend to intersperse facts with inaccurate or untrue statements in a manner that seems entirely plausible and convincing. You should not rely on an LLM for advice on important decisions without validating its response. Note that if you ask an LLM for its sources, the citations may be imaginary or, if they exist, the references may not support the assertions.

One way to reduce the risk of confabulation is to ask the LLM to base its answer on selected relevant documents.

Privacy

If you use an online ChatBot, everything you type or upload is stored on the provider’s server. This can be used as further training data or for snooping on users. There is a significant risk that a lot of personal and proprietary company information has already been hoovered up in this way. Companies need ways to sandbox internally developed LLM tools that contain sensitive data and prevent leakage of confidential information.

Privacy can be addressed by running LLMs on local networks. This is possible because inference is a simple task relative to training, which requires banks of data servers. Many free open-source models are available. Running an open-source model locally also avoids paying a subscription to an online provider.

Limitations of training data

LLMs are trained on vast generic data sets. This is a very expensive task, so they are typically frozen at some point in time. If you ask an LLM about current news events, it will typically tell you that it was trained on data up to a certain date in the past and has no knowledge of the event. The models may also lack specialised data in narrow domains.

RAG provides a way to supplement the original training set with selected additional information relevant to the field of interest.

RAG

Retrieval-Augmented Generation is an approach that provides an LLM with additional information to help it answer a query. For example, when asked about current affairs, it would be very helpful to retrieve a copy of today’s newspaper. Companies may wish to develop LLM-based applications with specialist knowledge of internal documentation. RAG makes this possible.

When you interact with an LLM, your conversation is presented to the model in the form of a huge string of numbers called an embedding. LLMs are trained so that sentences with similar words and meanings have similar embeddings. The idea of RAG is to take one or more relevant supplementary documents, split them up into chunks and store the embeddings of these chunks in a vector database. When a user presents a query, the database retrieves the chunks whose embeddings most closely match the embedding of the query. In this way, the query can be augmented with relevant chunks of text from the source material. Note that this avoids the need to do additional expensive training or fine-tuning of the LLM on specialist topics.
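The retrieval step described above can be sketched in a few lines of Python. This is only a toy illustration: the `embed` function below is a simple bag-of-words count rather than a neural embedding, the in-memory list `store` stands in for a vector database, and all function names and the sample text are made up for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=9):
    """Split a document into chunks of roughly chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding: word counts (a real system would use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, store, k=3):
    """Return the k chunks whose embeddings are closest to the query embedding."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Augment the query with the retrieved chunks before sending it to the LLM."""
    context = "\n".join(f"Citation {i + 1}: {c}" for i, c in enumerate(chunks))
    return f"Answer using only these citations.\n{context}\n\nQuestion: {query}"


# Made-up source text standing in for the supplementary document.
document = (
    "Pop-up bike lanes appeared in many cities during the pandemic. "
    "Public transport ridership fell because people feared infection. "
    "Tactical urbanism uses temporary low-cost interventions to change streets quickly."
)
store = [(c, embed(c)) for c in chunk_text(document)]  # stand-in for a vector database
top = retrieve("What is tactical urbanism?", store, k=1)
print(build_prompt("What is tactical urbanism?", top))
```

The LLM itself is never retrained here; only the prompt changes, which is why RAG is so much cheaper than fine-tuning.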

One weakness of RAG is that most LLMs have not been trained specifically on the types of prompts used in RAG, though some companies, like Cohere, are working on this.

Cycling Through the Pandemic

I decided to try this out using a freely available book called Cycling Through the Pandemic, which has the slightly cryptic subtitle, Tactical Urbanism and the Implementation of Pop-Up Bike Lanes in the Time of COVID-19. This 250-page book was published in 2024, so it could not have been part of the training set for the LLM that I was using.

It is unbelievably easy to download and then run an open-source LLM on your local machine. All you need to do is download LM Studio. I decided to use an open-source LLM called Meta-Llama-3.1-8B-Instruct-GGUF. I dragged and dropped the PDF of the book into the chat-box and started asking questions.

A transcript of my conversation is recorded below. At each step, LM Studio retrieved three chunks of text from the book it considered most relevant to the query, allowing me to check the reliability of the model’s interpretation.

The model answered in good English, basing its responses on the content of the book. It corrected itself when challenged on the claim that Bogotá is in France. It even responded in French to a question posed in French. In my view, RAG can be used to produce genuinely useful applications for LLMs.

RAG dialogue based on a book about the impact of COVID 19 on cycling

You
CyclingCOVID.pdf

What is tactical urbanism

Assistant
Meta-Llama-3.1-8B-Instruct-GGUF

Based on the provided information, tactical urbanism refers to a type of urban planning that involves temporary and low-cost interventions aimed at introducing rapid changes to urban spaces with a broader purpose in mind (Lydon and Garcia 2015). It can be seen as a practical approach to urban change where many small actions implemented at the hyper-local level can achieve longer-term goals of creating a liveable, walkable, sustainable community.

Tactical urbanism often involves shortcuts or rapid and unbureaucratic actions taken by municipalities in response to crises (Citation 1). It can be associated with grassroots initiatives but can also be used by authorities. This approach has been implemented in various contexts, including during the COVID-19 pandemic.

Tactical urbanism is characterized by its focus on temporary and flexible material approaches to urban change, which can be both a practical tool for citizens and enterprising governments alike (Citation 3).

It’s worth noting that there are different perspectives on tactical urbanism, with some viewing it as a means of achieving substantive change without state action, while others see the increasing professionalization of these activities as borrowing their material design language (Citation 2).

You
Why was cycling boosted by COVID 19?

Assistant
Meta-Llama-3.1-8B-Instruct-GGUF
Retrieved 3 relevant citations for user query
Done predicting
Cycling saw a boost during the COVID-19 pandemic for several reasons (Citation 2). Firstly, people had less need to travel due to home working and distance learning. Secondly, the fear of infection led to a strong decline in ridership on public transport, which benefited cycling. Additionally, active mobilities such as cycling were promoted as a means of staying fit and getting exercise when other recreational facilities were closed.

The pandemic also led to an increase in cycling for leisure activities (Buehler and Pucher 2021b), with the strongest increase seen on weekends and in the afternoon. This shift towards cycling as a leisure activity is likely due to the rise of “undirected travel,” or trips without a specific destination, which was encouraged by social distancing measures.

In France, for instance, the first COVID-19 related lockdown led to a rapid increase in cycling use (Buehler and Pucher 2022) and implemented temporary cycle paths. Governments saw cycling as a solution to develop health-friendly mobility while responding to the possible desertion of public transport due to fear of catching the virus (Citation 3).

You
Has the COVID related boost in cycling persisted

Assistant
Meta-Llama-3.1-8B-Instruct-GGUF
Retrieved 3 relevant citations for user query
Done predicting
The survey conducted in Geneva and Lausanne (Rérat et al. 2022; Schmassman and Rérat 2022) found that the COVID-related boost in cycling has largely persisted. The majority of respondents reported cycling as much or more often than before the pandemic, with a significant number even starting to cycle for the first time.

In Geneva, 42% of respondents cycled as much as before, while 44% cycled more often. In Lausanne, 58% cycled as much as before, and 27% cycled more often. This suggests that many people have adopted cycling as a mode of transport during the pandemic and are likely to continue using it.

The fact that 81% of respondents in Geneva (73% in Lausanne) agree or strongly agree that COVID cycle lanes make them feel safer also suggests that these temporary infrastructure projects had a lasting impact on cyclist safety and behavior.

However, as mentioned by Citation 2, the long-term sustainability of this boost is still uncertain. The study notes that most cycling infrastructure implemented during the pandemic involved existing projects, and it remains to be seen whether this “leap forward” will make a lasting difference in the trajectory of local public action or simply represent a spike on an upward graph.

Overall, while there are indications that the COVID-related boost in cycling has largely persisted, further analysis is needed to determine its long-term impact.

You
What has been the impact specifically in London?

Assistant
Meta-Llama-3.1-8B-Instruct-GGUF
Retrieved 3 relevant citations for user query
Done predicting
The impact of temporary cycling infrastructure and “Tactical Urbanist” approaches in London is a complex issue with both positive and negative effects.

Positive impacts include:

The introduction of School Streets, which have shown to promote mode shift towards active modes of travel (Hopkinson et al. 2021) and improve air quality at schools (Air Quality Consultants 2021).
The implementation of temporary cycle lanes and other small-scale projects aimed at improving the smaller residential or local streets in London.
These efforts are part of the Healthy Streets approach, which aims to promote a more sustainable transportation system in London.
However, there have also been negative impacts, including:

Variable outcomes between sites, with some schemes showing little effect (Citation 2).
Criticisms that some projects were implemented too quickly or without adequate consultation, leading to concerns about the speed and process of decision-making.
Vandalism of planters, bollards, and cameras used to prevent through-traffic in some areas (Citation 3).
Removal of Low Traffic Neighbourhoods, especially in Outer London, due to opposition from local residents.
The Temporary City report by Asa Thomas highlights the importance of balancing big change with small impact, emphasizing the need for careful planning and consultation when implementing new transportation projects.

You
What research was done in French cities?

Assistant
Meta-Llama-3.1-8B-Instruct-GGUF
Retrieved 3 relevant citations for user query
Done predicting
The research on French cities during the COVID-19 pandemic focused on understanding how urban planners and policymakers implemented measures to promote cycling and improve public transportation.

A study examined five French cities: Grenoble, Lyon, Montpellier, Rennes, and Bogotá (Citation 1). The study found that these cities had varying levels of population size, density, and surface area. For example, Bogotá was the most populous city with over 7 million inhabitants, while Grenoble had a smaller population of around 160,000.

Another study looked at the implementation of measures in French cities during the pandemic (Citation 2). The research found that some cities focused on improving cycling infrastructure within the city center, such as Lyon and Rennes. In contrast, other cities prioritized links between periurban municipalities or outer suburbs.

A book titled “Changing Bikeability: Everyday Cycling in Developed Territories” explores changing bikeability in French cities during the pandemic (Citation 3). The book examines six French cities: Grenoble, Lyon, Montpellier, Paris, Rennes, and Saint-Étienne. The research found that each city had its unique characteristics and ways of implementing measures to promote cycling.

These studies provide insights into how different types of cities implemented measures to improve cycling infrastructure during the pandemic, and whether these measures were made permanent.

You
Is Bogotá in France?

Assistant
Meta-Llama-3.1-8B-Instruct-GGUF
Retrieved 3 relevant citations for user query
Done predicting
No, Bogotá is not in France. According to Citation 1 and Citation 2, Bogotá is located in Colombia, and it has been compared with four French cities: Grenoble, Lyon, Montpellier, and Rennes.

In fact, the study mentioned in Citation 3 compares the cycling infrastructure in these four French cities (Grenoble, Lyon, Montpellier, and Rennes) with that of Bogotá, a city located in Colombia.

You
Quels ont été les résultats à Grenoble ?

Assistant
Meta-Llama-3.1-8B-Instruct-GGUF
Retrieved 1 relevant citations for user query
Done predicting
Les résultats obtenus à Grenoble sont mentionnés dans la citation :

“Nous avons contacté le Syndicat Mixte des Mobilités de l’Aire Grenobloise et réussi à obtenir la base de données du réseau cyclable. De plus, un stagiaire a effectué une interview avec un technicien de cette organisation, et sur plusieurs occasions, il est allé visiter le site pour voir si les voies temporaires (pop-up tracks) existaient toujours.”

Cela indique que les auteurs ont réussi à collecter des données sur le réseau cyclable de Grenoble en contactant la Syndicat Mixte des Mobilités de l’Aire Grenobloise et en effectuant une interview avec un technicien. Cependant, il n’y a pas d’information sur les résultats concrets de ces efforts.

Cependant, on peut trouver plus d’informations dans le graphique (Figure 8.2) qui montre différents types de voies temporaires installés dans les cinq villes étudiées, dont Grenoble.

Cyclo-Social Networks

World Tour Rider Network

When Mexico’s Isaac del Toro and the Brit, Finlay Pickering, joined World Tour teams in 2024, they became part of an elite group. The structure of the professional cycling community differs from other types of social network, because the sport is organised around close-knit teams. You may have heard of the idea that everyone in the world is connected by six degrees of separation. Many social networks include key individuals who act as hubs linking disparate groups. How closely connected are professional cyclists? Which cyclists are the most connected?

Forming a link

An obvious way for cyclists to become acquainted is by being part of the same team. They spend long periods travelling, training, eating and relaxing together, often sharing rooms. Working as a team through the trials and tribulations of elite competition develops high degrees of camaraderie.

Each rider’s page on ProCyclingStats includes a team history. So I checked the past affiliations of all the current UCI world team riders. The idea was to build a graph where each node represented a rider, with edges connecting riders who had been in the same team. Each edge was weighted by the number of years a pair of riders had been in the same team, reflecting the strength of their relationship.
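As a rough sketch of how such a graph could be assembled, the snippet below groups riders by team and season and counts the shared seasons for each pair. The rider names, teams and the `histories` structure are hypothetical, and plain dictionaries are used here instead of the NetworkX library to keep the example self-contained.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical team histories: rider -> {year: team}.
histories = {
    "Rider A": {2021: "Team X", 2022: "Team X", 2023: "Team Y"},
    "Rider B": {2021: "Team X", 2022: "Team X", 2023: "Team X"},
    "Rider C": {2022: "Team Z", 2023: "Team Y"},
}


def build_teammate_edges(histories):
    """Return {(rider1, rider2): years_together} for riders who shared a team."""
    rosters = defaultdict(set)  # (team, year) -> set of riders on that roster
    for rider, seasons in histories.items():
        for year, team in seasons.items():
            rosters[(team, year)].add(rider)

    edges = defaultdict(int)
    for riders in rosters.values():
        # Every pair on the same roster gains one year of shared history.
        for pair in combinations(sorted(riders), 2):
            edges[pair] += 1
    return dict(edges)


edges = build_teammate_edges(histories)
print(edges)
```

With NetworkX, each entry could then be added to a graph using `G.add_edge(u, v, weight=years)`.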

Cyclo-social network

The resulting graphical network, displayed at the top of this blog, has some interesting properties that reveal the dynamics of professional cycling. The 18 world tour teams are displayed in different colours. The size of each rider node is scaled by length of career. An experienced rider, like Geraint Thomas, has a larger node and more connections, which are shown as light grey lines, where the thickness represents the number of years the pair of riders were in the same team. Newer riders, with fewer connections, tend to be on the periphery. For example, Isaac del Toro is the orange UAE Team Emirates rider at the top.

There are no fixed rules about how to represent the network, but the idea is that more closely related riders ought to congregate together. The network shows INEOS near UAE as shades of orange in the top right. Next we have Team Visma Lease a Bike in cyan, around two o’clock, close to Lidl – Trek in light green, Bahrain – Victorious in darker blue and Cofidis in lighter blue. Yellow Groupama – FDJ lies above the dark brown Jayco AlUla riders in the lower right. Red Movistar, light green Team dsm-firmenich PostNL and light blue BORA – Hansgrohe are near 6 o’clock, though Primož Roglič lies much closer to his old team. Decathlon AG2R La Mondiale is light blue in the lower left, near the darker blue of Alpecin – Deceuninck and pale green EF Education – EasyPost. Astana is orange, around nine o’clock. Dark blue Soudal Quick-Step, lighter blue Intermarché – Wanty and red Arkéa – B&B Hotels all hover around the top left. Teams that are more dispersed would be indicative of higher annual turnover.

3 degrees of separation

No rider is more than three steps from any other. In fact the average distance between riders is two steps, because the chances are that two riders have ridden with someone in common over their careers. Although the neo-pros are obviously more distantly connected, Isaac del Toro rides with Adam Yates, who spent six years alongside Jack Haig at Mitchelton-Scott; Haig now rides for Bahrain Victorious as a teammate of Finlay Pickering.
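The number of steps between riders can be found with a breadth-first search over the teammate graph. The sketch below uses a hypothetical four-rider chain that mirrors the del Toro to Pickering example; the rider labels are invented and the code assumes the graph is connected.

```python
from collections import deque

# Hypothetical teammate links, stored as an adjacency mapping.
teammates = {
    "Neo-pro 1": {"Veteran"},
    "Veteran": {"Neo-pro 1", "Journeyman"},
    "Journeyman": {"Veteran", "Neo-pro 2"},
    "Neo-pro 2": {"Journeyman"},
}


def steps_from(graph, source):
    """Breadth-first search: fewest teammate links from source to every rider."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        rider = queue.popleft()
        for mate in graph[rider]:
            if mate not in dist:
                dist[mate] = dist[rider] + 1
                queue.append(mate)
    return dist


def longest_and_average(graph):
    """Largest and mean separation over all pairs of distinct riders."""
    longest, total, pairs = 0, 0, 0
    for rider in graph:
        for other, d in steps_from(graph, rider).items():
            if other != rider:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, total / pairs


longest, average = longest_and_average(teammates)
print(longest, round(average, 2))
```

On a NetworkX graph, `nx.diameter` and `nx.average_shortest_path_length` give the same two quantities.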

Bob Jungels is the most connected rider, having been a teammate of 100 riders in the current peloton. Since 2012, he has ridden for five different teams. He and Florian Senechal are only two steps from all riders. The Polish rider Łukasz Wiśniowski has been with six teams and has 94 links. Then we have Mark Cavendish with 91 and Rui Costa with 89.

In contrast, taking account of spending multiple years as teammates, Geraint Thomas tops the list with 218 teammate years. Interestingly, he is followed by Jonathan Castroviejo, Salvatore Puccio, Michal Kwiatkowski, Luke Rowe, Ben Swift, all long-time colleagues at INEOS Grenadiers. This suggests they are very happy (or well paid) staying where they are.

If we estimate “long-term team loyalty” by number of teammate years divided by number of teammates, Geraint is top, followed by Salvatore Puccio, Michael Hepburn, Luke Durbridge, Luke Rowe, Simon Yates and Jasper Stuyven.
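The three rankings discussed above (number of teammates, teammate years and their ratio) all follow directly from the weighted edges. Here is a small sketch using invented riders and weights, where the most connected rider and the most loyal rider turn out to be different people.

```python
# Hypothetical weighted edges: (rider, rider) -> years spent as teammates.
edges = {
    ("Rider A", "Rider B"): 10,
    ("Rider A", "Rider C"): 8,
    ("Rider B", "Rider C"): 1,
    ("Rider C", "Rider D"): 1,
    ("Rider C", "Rider E"): 2,
}


def rider_stats(edges):
    """Per rider: number of teammates, total teammate years and their ratio."""
    stats = {}
    for (u, v), years in edges.items():
        for rider in (u, v):
            entry = stats.setdefault(rider, {"teammates": 0, "teammate_years": 0})
            entry["teammates"] += 1
            entry["teammate_years"] += years
    for entry in stats.values():
        # "Long-term team loyalty": average years shared per teammate.
        entry["loyalty"] = entry["teammate_years"] / entry["teammates"]
    return stats


stats = rider_stats(edges)
most_connected = max(stats, key=lambda r: stats[r]["teammates"])
most_years = max(stats, key=lambda r: stats[r]["teammate_years"])
most_loyal = max(stats, key=lambda r: stats[r]["loyalty"])
print(most_connected, most_years, most_loyal)
```

A rider who changes teams often collects many teammates but few years with each, so they score highly on the first measure and poorly on the third.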

Best buddies

The riders who have been teammates for the longest are Luke Durbridge/Michael Hepburn and Robert Gesink/Steven Kruijswijk, both pairs for 15 years. Geraint Thomas has ridden with Ben Swift and Salvatore Puccio for 14 years and Luke Rowe with Salvatore Puccio for 13.

Outliers

In a broader analysis that includes all the Pro Continental teams alongside the World Tour, the graph below shows a notable team of outliers. This is Team Novo Nordisk for athletes who compete with type 1 diabetes. Their Hungarian rider, Peter Kusztor, was the teammate of several current riders, such as Jan Tratnik, prior to joining Novo Nordisk, in a career that stretches back to 2006. The team is an inspiration to everyone affected by diabetes.

World Tour and Pro Continental Teams

Analysis

This analysis was performed in Python, using the NetworkX library.

Code can be found here.

Fuelling your rides on Strava

As we move into our 40s, 50s and beyond, we may become aware of changes in our bodies. Performance peaks level off or start to decline. Even if you don’t feel old, it becomes harder to keep up with younger sprinters. It takes longer to recover from a hard ride, injury or illness.

Muscle, Fat and Bone

The cause of these age-related changes is a decline in the production of specific hormones. Growth hormone falls insidiously from the time we reach adult height. From the age of 50, testosterone levels drop slightly in men, while oestradiol levels fall dramatically as women reach menopause. The key thing to note about growth hormone and testosterone is that they are anabolic agents, i.e. they build muscle. As they decline, there is a tendency to lose muscle and to increase fat deposition. Sex steroids also play a pivotal role in bone formation.

Protein, Carbohydrates and Vitamin D

Fortunately there are measures we can take to counter the effects of declining hormones. Nutrition plays an important role. Understanding the physiological effects of hormonal changes makes it easier to recognise beneficial adaptations in your diet.

Protein provides the building blocks required for muscle. Taking an adequate level of protein, spread out through the day, is beneficial.

Carbohydrates are the key fuel for moderate to high intensity exercise. Fasted training is not advisable. The body’s shock reaction to underfuelled training is to deposit fat.

The UK government advises everyone to take vitamin D supplements, especially over the winter. In addition to supporting bone health, studies have shown improved immunity and muscle recovery.

Nutrition as you get older

Nutrition, Exercise and Recovery

When combined with adequate nutrition, exercise, particularly strength training, stimulates the production of growth hormone and testosterone. It is important to ensure adequate recovery and to follow a regular routine of going to bed early, because these hormones are produced while you are asleep.

Everybody is unique, so you need to work out what works best for you. For further insights on this topic, Dr Nicky Keay has written a book full of top tips, called Hormones Health and Human Potential.

Dreaming of the Giro

fast.ai’s latest version of Practical Deep Learning for Coders Part 2 kicks off with a review of Stable Diffusion. This is a deep neural network architecture developed by Stability AI that is able to convert text into images. With a bit of tweaking it can do all sorts of other things. Inspired by the amazing videos created by Softology, I set out to generate a dreamlike video based on the idea of riding my bicycle around a stage of the Giro d’Italia.

Text to image

As mentioned in a previous post, Hugging Face is a fantastic resource for open source models. I worked with one of fast.ai’s notebooks using a free GPU on Google Colab. In the first step I set up a text-to-image pipeline using a pre-trained version of stable-diffusion-v1-4. The prompt “a treelined avenue, poplars, summer day, france” generated the following images, where the model was more strongly guided by the prompt in each row. I liked the first image in the second row, so I decided to make this the first frame in an initial test video.

Stable diffusion is trained in a multimodal fashion, by aligning text embeddings with the encoded versions of the corresponding images. Starting with random noise, the pixels are progressively modified in order to move the encoding of the noisy image closer to something that matches the embedding of the text prompt.

Zooming in

The next step was to simulate the idea of moving forward along the road. I did this by writing a simple two-line function, using fast.ai tools, that cropped a small border off the edge of the image and then scaled it back up to the original size. In order to generate my movie, rather than starting with random noise, I wanted to use my zoomed-in image as the starting point for generating the next image. For this I needed to load up an image-to-image pipeline.
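The crop-and-rescale idea can be illustrated without any imaging library. The sketch below is not the original fast.ai function: it treats an image as a plain list of rows and uses nearest-neighbour resampling as a stand-in for proper interpolation. The function name and the `border` parameter are chosen for this example.

```python
def zoom_in(image, border=1):
    """Crop `border` pixels from each edge, then scale back up to the original size."""
    height, width = len(image), len(image[0])
    cropped = [row[border:width - border] for row in image[border:height - border]]
    c_h, c_w = len(cropped), len(cropped[0])
    # Nearest-neighbour resampling: map each output pixel back to a cropped pixel.
    return [
        [cropped[y * c_h // height][x * c_w // width] for x in range(width)]
        for y in range(height)
    ]


# A 6x6 test "image" whose pixel values encode their row and column.
frame = [[10 * r + c for c in range(6)] for r in range(6)]
zoomed = zoom_in(frame, border=1)
for row in zoomed:
    print(row)
```

With the Pillow library, the same effect would come from `Image.crop` followed by `Image.resize`, which gives smoother results than nearest-neighbour sampling.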

I spent about an hour experimenting with four parameters. Zooming in by trimming only a couple of pixels around the edge created smoother transitions. Reducing the strength of additional noise enhanced the sense of continuity by ensuring that subsequent images did not change too dramatically. A guidance scale of 7 forced the model to keep following the prompt and not simply zoom into the middle of the image. The number of inference steps provided a trade-off between image quality and run time.

When I was happy, I generated a sequence of 256 images, which took about 20 minutes, and saved them as a GIF. This produced a pleasing, constantly changing effect with an impressionist style.

Back to where you started

In order to make the GIF loop smoothly, it was desirable to find a way to return to the starting image as part of the continuous zooming in process. At first it seemed that this might be possible by reversing the existing sequence of images and then generating a new sequence of images using each image in the reversed list as the next starting point. However, this did not work, because it gave the impression of moving backwards, rather than progressing forward along the road.

After thinking about the way stable diffusion works, it became apparent that I could return to the initial image by mixing it with the current image before taking the next step. By progressively increasing the mixing weight of the initial image, the generated images became closer to the target over a desired number of steps as shown below.
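A minimal sketch of the mixing step follows, with images represented as flat lists of pixel values. The linear ramp of weights is an assumption made for illustration, since the exact schedule is not stated, and the diffusion step itself is omitted.

```python
def blend(current, target, weight):
    """Weighted mix of two images: weight 0 keeps current, weight 1 gives target."""
    return [(1 - weight) * c + weight * t for c, t in zip(current, target)]


def mixing_weights(n_steps):
    """Weights rising linearly to 1.0 over n_steps (an assumed schedule)."""
    return [(i + 1) / n_steps for i in range(n_steps)]


initial = [0.0, 0.5, 1.0]  # toy stand-in for the first frame
current = [1.0, 1.0, 0.0]  # toy stand-in for the latest generated frame

for weight in mixing_weights(4):
    mixed = blend(current, initial, weight)
    # In the real pipeline, `mixed` would be zoomed and passed through the
    # image-to-image model here; this sketch carries it forward unchanged.
    current = mixed
    print(weight, current)
```

Because the final weight is 1.0, the last mixed frame equals the initial image, which is what lets the GIF loop.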

Putting it all together produced the following video, which successfully loops back to its starting point. It is not a perfect animation, because it zooms into the centre, whereas the vanishing point is below the centre of the image. This means we end up looking up at the trees at some points. But overall it had the effect I was after.

A stage of the Giro

Once all this was working, it was relatively straightforward to create a video that tells a story. I made a list of prompts describing the changing countryside of an imaginary stage of the Giro d’Italia, specifying the number of frames for each sequence. I chose the following.

['a wide street in a rural town in Tuscany, springtime', 25],
['a road in the countryside, in Tuscany, springtime', 25],
['a road by the sea, trees on the right, sunny day, Italy', 50],
['a road going up a mountain, Dolomites, sunny day', 50],
['a road descending a mountain, Dolomites, Italy', 25],
['a road in the countryside, cypress trees, Tuscany', 50],
['a narrow road through a medieval town in Tuscany, sunny day', 50]

These prompts produced the video shown at the top of this post. The springtime blossom in the starting town was very effective and the endless climb up into the sunlit Dolomites looked great. For some reason the seaside prompt did not work, so the sequence became temporarily stuck with red blobs. Running it again would make something different. Changing the prompts offered endless possibilities.
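One simple way to drive the generation loop from such a list is to expand it into one prompt per frame. The sketch below shows only that bookkeeping; the helper name `frame_prompts` is invented for the example, and the image generation itself is left out.

```python
# The prompts and frame counts chosen for the imaginary stage.
stage = [
    ["a wide street in a rural town in Tuscany, springtime", 25],
    ["a road in the countryside, in Tuscany, springtime", 25],
    ["a road by the sea, trees on the right, sunny day, Italy", 50],
    ["a road going up a mountain, Dolomites, sunny day", 50],
    ["a road descending a mountain, Dolomites, Italy", 25],
    ["a road in the countryside, cypress trees, Tuscany", 50],
    ["a narrow road through a medieval town in Tuscany, sunny day", 50],
]


def frame_prompts(stage):
    """Expand [prompt, n_frames] pairs into a list with one prompt per frame."""
    return [prompt for prompt, n_frames in stage for _ in range(n_frames)]


schedule = frame_prompts(stage)
print(len(schedule), "frames in total")
print(schedule[0])
print(schedule[-1])
```

Each frame would then be produced by zooming the previous image and running image-to-image with the prompt scheduled for that frame.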

The code to run this appears on my GitHub page. If you have a Google account, you can open it directly in Colab and set the runtime type to GPU. You also need a free Hugging Face account to load the stable diffusion pipelines.

Milan Sanremo in a Random Forest

Last time I tried to predict a race, I trained up a neural network on past race results, ahead of the World Championships in Harrogate. The model backed Sam Bennett, but it did not take account of the weather conditions, which turned out to be terrible. Fortunately the forecast looks good for tomorrow’s Milan Sanremo.

This time I have tried using a Random Forest, based on the results of the UCI races that took place in 2020 and so far in 2021. The model took account of each rider’s past results, team, height and weight, together with key statistics about each race, including date, distance, average speed and type of parcours.

One of the nice things about this type of model is that it is possible to see how the factors contribute to the overall predictions. The following waterfall chart explains why the model uncontroversially has Wout van Aert as the favourite.

Breakdown of prediction for Wout van Aert

The largest positive contribution comes from being Wout van Aert. This is because he has a lot of good results. His height and weight favour Milan Sanremo. He also has a strong positive coming from his team. The distance and race type make further positive contributions.
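One common way to obtain this kind of breakdown from tree-based models is to follow the path a rider takes through each decision tree and credit every change in the node average to the feature used for that split. The sketch below applies the idea to a single hand-built tree. The tree, feature names and numbers are all invented, and it is only an assumption that the waterfall charts here were produced in a similar way.

```python
# Hypothetical decision tree. "value" is the average target of the training
# rows reaching that node; internal nodes also carry a feature and threshold.
tree = {
    "value": 0.10, "feature": "past_results", "threshold": 0.5,
    "left": {"value": 0.04},
    "right": {
        "value": 0.30, "feature": "distance_km", "threshold": 250,
        "left": {"value": 0.22},
        "right": {"value": 0.40},
    },
}


def explain(tree, rider):
    """Split a tree prediction into a baseline plus per-feature contributions."""
    node = tree
    bias = node["value"]  # baseline: the average over all training rows
    contributions = {}
    while "feature" in node:
        feature = node["feature"]
        child = node["right"] if rider[feature] > node["threshold"] else node["left"]
        # Credit the change in node average to the feature that caused the split.
        contributions[feature] = contributions.get(feature, 0.0) + child["value"] - node["value"]
        node = child
    return bias, contributions, node["value"]


rider = {"past_results": 0.8, "distance_km": 299}
bias, contributions, prediction = explain(tree, rider)
print(bias, contributions, prediction)
```

The baseline plus the contributions always adds up to the prediction, which is exactly what a waterfall chart displays. For a forest, the contributions would be averaged over all the trees.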

We can contrast this with the model’s prediction for Mathieu van der Poel, who is ranked 9th.

Breakdown of prediction for Mathieu van der Poel

We see a positive personal contribution from being van der Poel, but having raced fewer UCI events, he has a less strong set of results than van Aert. According to the model, the Alpecin Fenix team contribution is not as strong as that of Jumbo Visma, but the long distance of the race works in favour of the Dutchman. The day of year gives a small negative contribution, suggesting that his road results have been stronger later in the year, but this could be due to last year’s unusual timing of races.

Each of the other riders in the model’s top 10 is in with a shout.

It’s taken me all afternoon to set up this model, so this is just a short post.

Post race comment

Where was Jasper Stuyven?

Like Mads Pedersen in Harrogate back in 2019, Jasper Stuyven was this year’s surprise winner in Sanremo. So what had the model expected for him? Scrolling down the list of predictions, Stuyven was ranked 39th.

Breakdown of prediction for Jasper Stuyven

His individual rider prediction was negative, perhaps because he has not had many good results so far this year, though he did win Omloop Het Nieuwsblad last year and had several top 10 finishes. The model assessed that his greatest advantage came from the length of the race, suggesting that he tends to do well over greater distances.

The nice thing about this approach is that it identifies factors that are relevant to particular riders, in a quantitative fashion. This helps to overcome personal biases and the human tendency to overweight and project forward what has happened most recently.

Pro cycling team networks

The COVID-19 pandemic has further exposed the weakness of the professional cycling business model. The competition between the teams for funding from a limited number of sponsors undermines the stability of the profession. With marketing budgets under strain, more teams are likely to face difficulties, in spite of the great advertising and publicity that the sport provides. Douglas Ryder is fighting an uphill struggle trying to keep his team alive after the withdrawal of NTT as a lead sponsor. One aspect of stability is financial, but another measure is the level of transfers between teams.

The composition of some teams is more stable than others. This is illustrated by analysing the history of riders’ careers, which is available on ProCyclingStats. The following chart is a network of the transfers between teams in the last year, where the yellow nodes are 2020 teams and the purple ones are 2019. The width of the edges indicates how many riders transferred between the teams, with the thick green lines representing the bulk of the riders who stuck with the same team. The blue labels give the initials of the official name of each team, such as M-S (Mitchelton-Scott), MT (Movistar Team), T-S (Trek-Segafredo) and TS (Team Sunweb). Riders who switched teams are labelled in red.
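The edge widths in a transfer network like this come from counting riders for each pair of (2019 team, 2020 team). A small sketch with invented rosters shows the counting step; the rider names are hypothetical and only the team abbreviations are borrowed from the chart.

```python
from collections import Counter

# Hypothetical rosters: rider -> team abbreviation for each season.
teams_2019 = {"Rider A": "M-S", "Rider B": "M-S", "Rider C": "TAS", "Rider D": "TAS"}
teams_2020 = {
    "Rider A": "M-S", "Rider B": "M-S", "Rider C": "IS-UN",
    "Rider D": "TAS", "Rider E": "IS-UN",
}


def transfer_edges(before, after):
    """Count riders for each (old team, new team) pair, ignoring riders missing from either year."""
    return Counter((before[r], after[r]) for r in before if r in after)


edges = transfer_edges(teams_2019, teams_2020)
stayed = sum(n for (old, new), n in edges.items() if old == new)
moved = sum(n for (old, new), n in edges.items() if old != new)
print(edges)
print(stayed, "stayed,", moved, "moved")
```

Teams that changed name between seasons would need mapping to a common label first, otherwise every one of their riders would be counted as a transfer.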

Although there is a Dutch/German grouping on the lower right, the main structure is from the outside towards the centre of the network.

The spikes around the end of the chart show riders like Geoffrey Soupe or Rubén Fernández, who stepped down to smaller non-World Tour teams like Team Total Direct Energie (TTDE), Nippo Delko One Provence (NNDP), Euskaltel-Euskadi (E-E), Androni Giocattoli-Sidermec (AG-S) or Uno-X Pro Cycling Team (U-XPCT).

The two World Tour outliers were Mitchelton-Scott (M-S) and Groupama FDJ (GF), who retained virtually all their riders from 2019. Moving closer in, a group of teams lies around the edge of the central mass, where a few transfers occurred. Moving anti-clockwise we see CCC Team (CT), Astana Pro Team (APT), Trek-Segafredo (T-S), AG2R La Mondiale (ALM), Circus-Wanty Gobert (C-WG), Team Jumbo Visma (TJV), Bora-Hansgrohe (B-H) and EF Pro Cycling (EPC).

Deeper in the mêlée, Ineos (TI_19/IG_20), Deceuninck – Quick Step (D-QS), UAE-Team Emirates (U-TE), Lotto Soudal (LS), Bahrain – McLaren (B-M) and Movistar Team (MT) exchanged a number of riders.

Right in the centre, Israel Start-Up Nation (IS-UN) grabbed a whole lot of riders, including 7 from Team Arkéa Samsic (TAS). Meanwhile the likes of Victor Campenaerts and Domenico Pozzovivo are probably regretting joining NTT Pro Cycling (TDD_19/NPC_20).

Looking forward

A few of the top riders have contracts for next year showing up on ProCyclingStats. So far 2020/2021 looks like the network below. Many riders are renewing with their existing teams, indicated by the broad green lines. But some big names are changing teams, including Chris Froome, Richie Porte, Laurens De Plus, Sam Oomen, Romain Bardet, Wilco Kelderman, Bob Jungels and Lilian Calmejane.

What about networks of riders?

My original thought when starting this analysis was that over their careers, certain riders must have been teammates with most of the riders in today’s peloton, so who is the most connected? Unfortunately this turned out to be ridiculously complicated, as shown in the image below, where nodes are riders with links if they were ever teammates and the colours represent the current teams. The highest ranked rider in each team is shown in red.

It is hard to make much sense of this, other than to note that those with shorter careers in the same team are near the edge and that Philippe Gilbert is close to the centre. Out of interest, the rider around 9 o’clock linking Bora and Jumbo Visma is Christoph Pfingsten, who moved this year. At least we can conclude that professional cyclists are well-connected.
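The question of who is most connected can at least be posed precisely, even if the picture is a tangle. The sketch below assumes career histories are available as (rider, team, season) records, links two riders if they shared a team in the same season, and ranks riders by their number of distinct teammates. The records, rider names and function names are hypothetical; this is not the code behind the image.

```python
from collections import defaultdict
from itertools import combinations


def teammate_links(records):
    """Map each rider to the set of riders who shared a team in the same season.

    `records` is an iterable of (rider, team, season) tuples.
    """
    rosters = defaultdict(set)
    for rider, team, season in records:
        rosters[(team, season)].add(rider)

    links = defaultdict(set)
    for roster in rosters.values():
        for a, b in combinations(sorted(roster), 2):
            links[a].add(b)
            links[b].add(a)
    return links


def most_connected(links):
    """Return the rider with the most distinct teammates (ties broken alphabetically)."""
    return min(links, key=lambda rider: (-len(links[rider]), rider))


# Hypothetical career records, for illustration only.
records = [
    ("rider_a", "team_x", 2018),
    ("rider_b", "team_x", 2018),
    ("rider_a", "team_y", 2019),
    ("rider_c", "team_y", 2019),
    ("rider_d", "team_y", 2019),
    ("rider_b", "team_x", 2019),
]
links = teammate_links(records)
best = most_connected(links)
```

Counting distinct teammates is only one possible measure of connectedness. A rider near the centre of the drawn network is not necessarily the one with the highest count, since the layout also reflects how those teammates are linked to each other.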

Time to be aerodynamic

The Covid-19 epidemic provided a huge boost to the Zwift streaming service. Confined by a global lockdown, cyclists freed themselves from the boredom of pedalling on a static turbo trainer by logging into one of a broadening range of online virtual worlds. Zwift racing has become particularly popular. While it is relatively straightforward to simulate variations in gradient and even the effects of drafting, it is not possible for riders to demonstrate superior bike handling skills. Nor can racers benefit from adopting a superior aerodynamic position on the bike. In fact, this may prove to be a disadvantage.

Setting aside e-doping suspicions, such as riders understating their weights, in the artificial world of a Zwift race, the outcome largely comes down to the ability to sustain a high level of power (watts per kilo). The engagingly competitive nature of simulated races encourages everyone to push their limits. However, since Zwift offers no penalty against maintaining a non-aerodynamic body position on your trainer, it is quite possible that regular Zwifters might become habituated to riding in a position that is far from optimal for the road.

Fresh aerodynamics

Once out in the fresh air again, many riders may have noticed improvements in the levels of power they are able to sustain, thanks to the high levels of exertion required to compete on Zwift. But in the real world, when it comes to beating other riders in a race or a time trial, the principal force a rider has to overcome is aerodynamic drag, not electromagnetic resistance.

Maximum speed is attained by adopting a riding position that provides the optimal tradeoff between the ability to generate power and a low level of aerodynamic drag. Drag depends on a rider’s CdA, which represents the drag coefficient multiplied by frontal area. Since the power needed to overcome drag rises with the cube of velocity, there comes a point where it is better to compromise on power in order to reduce frontal area. This is the key to time trialling and successful breakaways.
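To make the tradeoff concrete, the standard expression for the power needed to overcome aerodynamic drag is P = 0.5 × air density × CdA × v³. The sketch below uses that relation alone, so it assumes a flat road in still air and ignores rolling resistance and drivetrain losses. The air density is the standard sea-level value, and the power and CdA figures for the two positions are hypothetical.

```python
AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (an assumption)


def drag_power(speed_ms, cda, rho=AIR_DENSITY):
    """Power in watts needed to overcome aerodynamic drag at a given speed in m/s."""
    return 0.5 * rho * cda * speed_ms ** 3


def speed_from_power(power_w, cda, rho=AIR_DENSITY):
    """Speed in m/s sustained by a given power if drag were the only resistance."""
    return (2.0 * power_w / (rho * cda)) ** (1.0 / 3.0)


# Hypothetical rider: more power when sitting up, less drag when tucked.
upright = speed_from_power(power_w=300, cda=0.35)
tucked = speed_from_power(power_w=285, cda=0.30)

faster_position = "tucked" if tucked > upright else "upright"
```

Under these assumptions, giving up some power is worthwhile whenever the ratio of power to CdA goes up, which is the case for the tucked position in this example.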

When the race season begins, skilful and more aerodynamic racers will be able to benefit from drafting in the huge wind shadow created by Zwift diesels, while offering back much less assistance when they pull through. So after prolonged training on Zwift, racers and time triallists really need to focus on improving their aerodynamics.

There are various ways to reduce drag, starting with some basics as described in an earlier blog. Post ride analysis can be performed using Golden Cheetah, BestBikeSplit or MyWindSock. There is also a range of devices that claim to offer real time measurement of CdA. These have been primarily targeted at the TT/triathlon market, but there’s no doubt that these could be incredibly useful for both training and even, perhaps, a race breakaway. Cycling Weekly recently reviewed the Notio device, but, while useful, these tools remain expensive and a bit clunky.

Whatever you choose to do, stay safe and stay aero.

No drafting

In a fascinating white paper, Bert Blocken, Professor of Civil Engineering at Eindhoven University of Technology, comments on social distancing when applied to walking, running or cycling. His point is that the government recommendations to maintain a distance of 1.5 or 2 metres assume people are standing still indoors or outdoors in calm weather. However, when a person is moving, the majority of particulate droplets are swept along in a trailing slipstream.

Cyclists typically prefer to ride closely behind each other, in order to benefit from the aerodynamic drafting effect. Cycling is currently a permitted form of exercise in the UK, though only if riding alone or with members of your household. Nevertheless, there may be times when you find yourself catching up with a cyclist ahead. In this situation, you should avoid the habitual tendency to move up into the slipstream of the rider in front.

Professor Blocken’s team has performed computational fluid dynamics (CFD) simulations showing the likely spread of micro-droplets behind people moving at different speeds. As the cloud of particles, produced when someone coughs or sneezes, is swept into the slipstream, the heavier droplets, shown in red in the diagram above, fall faster. These are generally thought to be considerably more contagious. You can see that they can land on the hands and body of the following athlete.

Based on the results, Blocken advises keeping a distance of at least four to five metres behind the leading person while walking in the slipstream, ten metres when running or cycling slowly and at least twenty metres when cycling fast.

Social Distancing v2.0

The recommendation, for overtaking other cyclists, is to start moving into a staggered position some twenty metres behind the rider in front, consistently avoiding the slipstream as you pass.

The results will be reported in a forthcoming peer-reviewed publication. But given the importance of the topic, I recommend that you take a look at the highly accessible three page white paper available here.

References

Social Distancing v2.0: During Walking, Running and Cycling
Bert Blocken, Fabio Malizia, Thijs van Druenen, Thierry Marchal

Bike Identification as a web app

One of the first skills acquired in the latest version of the fast.ai course on deep learning is how to create a production version of an image classifier that runs as a web application. I decided to test this out on a set of images of road bikes, TT bikes and mountain bikes. To try it out, click on the image above or go to this website https://bike-identifier.onrender.com/ and select an image from your device. If you are using a phone, you can try taking photos of different bikes, then click on Analyse to see if they are correctly identified. Side-on images work best.

How does it work?

The first task was to collect some sample images for the three classes of bicycles I had chosen: road, TT and MTB. It turns out that there is a neat way to obtain the list of URLs for a Google image search, by running some JavaScript in the console. I downloaded 200 images for each type of bike and removed any that could not be opened. This relatively small data set allowed me to do all the machine learning using the CPU on my MacBook Pro in less than an hour.

The fast.ai library provides a range of convenient ways to access images for the purpose of training a neural network. In this instance, I used the default option of applying transfer learning to a pre-trained ResNet34 model, scaling the images to 224 pixel squares, with data augmentation. After doing some initial training, it was useful to look at the images that had been misclassified, as many of these were incorrect images of motorbikes or cartoons or bike frames without wheels or TT bars. Taking advantage of a useful fast.ai widget, I removed unhelpful training images and trained the model further.

The confusion matrix showed that the final version of my model was running at about 90% accuracy on the validation set, which was hardly world-beating, but not too bad. The main problem was a tendency to mistake certain road bikes for TT bikes. This was understandable, given the tendency for road bikes to become more aero, though it was disappointing when drop handlebars were clearly visible.
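For readers unfamiliar with confusion matrices, the sketch below shows how an overall accuracy and the most common mistake can be read off one. The counts are invented purely for illustration and are not the results from my model. The helper names accuracy and most_confused are also assumptions, not part of the fast.ai library.

```python
def accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def most_confused(matrix, labels):
    """Return (actual, predicted, count) for the largest off-diagonal cell."""
    best = None
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and (best is None or count > best[2]):
                best = (labels[i], labels[j], count)
    return best


labels = ["road", "TT", "MTB"]

# Made-up validation counts: rows are actual classes, columns are predictions.
matrix = [
    [32, 6, 2],   # actual road
    [2, 37, 1],   # actual TT
    [1, 0, 39],   # actual MTB
]

overall = accuracy(matrix)
worst = most_confused(matrix, labels)
```

With these invented counts the diagonal holds 108 of 120 images, and the largest off-diagonal cell is road bikes predicted as TT bikes, which mirrors the kind of error described above.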

The next step was to make my trained network available as a web application. First I exported the model’s parameter settings to Dropbox. Then I forked a fast.ai repository into my GitHub account and edited the files to link to my Dropbox, switching the documentation appropriately for bicycle identification. In the final step, I set up a free account on Render to host a web service linked to my GitHub repository. This automatically updates for any changes pushed to the repository.

Amazingly, it all works!

References

fast.ai lesson 2

My GitHub repository, including Jupyter notebook

Strava – Tour de Richmond Park Clockwise

Following my recent update on the Tour de Richmond Park leaderboard, a friend asked about the ideal weather conditions for a reverse lap, clockwise around the park. This is a less popular direction, because it involves turning right at each mini-roundabout, including Cancellara corner, where the great Swiss rouleur crashed in the 2012 London Olympics, costing him a chance of a medal.

An earlier analysis suggested that apart from choosing a warm day and avoiding traffic, the optimal wind direction for a conventional anticlockwise lap was a moderate easterly, offering a tailwind up Sawyers Hill. It does not immediately follow that a westerly wind would be best for a clockwise lap, because trees, buildings and the profile of the course affect the extent to which the wind helps or hinders a rider.
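The naive geometric view, which ignores shelter from trees and buildings, treats the benefit of the wind as its component along the rider’s direction of travel. The sketch below computes that component. It assumes the meteorological convention that a wind direction is the bearing the wind blows from, and the function name and example bearings are illustrative rather than taken from the analysis.

```python
import math


def tailwind_component(wind_speed, wind_from_deg, heading_deg):
    """Wind speed along the direction of travel (positive means tailwind).

    wind_from_deg is the bearing the wind blows FROM, in degrees.
    heading_deg is the bearing the rider is travelling towards.
    """
    angle = math.radians(wind_from_deg - heading_deg)
    return -wind_speed * math.cos(angle)


# A rider heading due west (270 degrees) in a 5 m/s wind.
easterly = tailwind_component(5.0, 90, 270)    # wind from the east
westerly = tailwind_component(5.0, 270, 270)   # wind from the west
northerly = tailwind_component(5.0, 0, 270)    # pure crosswind
```

On a closed loop these components roughly cancel over the lap, which is why the exposure of each section matters more than the simple geometry.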

Currently there are over 280,000 clockwise laps recorded by nearly 35,000 riders, compared with more than a million anticlockwise laps by almost 55,000 riders. As before, I downloaded the top 1,000 entries from the leaderboard and then looked up the wind conditions when each time was set on a clockwise lap.

In the previous analysis, I took account of the prevailing wind direction in London. If wind had no impact, we would expect the distribution of wind directions for leaderboard entries to match the average distribution of winds over the year. I defined the wind direction advantage to be the difference between these two distributions and checked if it was statistically significant. These are the results for the clockwise lap.

[Figure: wind rose and bar chart for the clockwise segment]

The wind direction advantage was significant (at p=1.3%). Two directions stand out. A westerly provides a tailwind on the more exposed section of the park between Richmond Gate and Roehampton, which seems to be a help, even though it is largely downhill. A wind blowing from the NNW would be beneficial between Roehampton and Robin Hood Gate, but apparently does not provide much hindrance on the drag from Kingston Gate up to Richmond, perhaps because this section of the park is more sheltered. The prevailing southwesterly wind was generally unfavourable to riders setting PBs on a clockwise lap.
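The wind direction advantage defined above can be sketched in a few lines. The calculation in my notebook may differ in detail, and the chi-squared goodness-of-fit statistic used here is only one plausible choice of test, taken as an assumption. The four compass quadrants, the baseline shares and the leaderboard counts are all invented for illustration.

```python
def wind_advantage(leaderboard_counts, baseline_share):
    """Share of leaderboard entries per direction minus the baseline share."""
    total = sum(leaderboard_counts.values())
    return {
        direction: leaderboard_counts[direction] / total - share
        for direction, share in baseline_share.items()
    }


def chi_square_statistic(leaderboard_counts, baseline_share):
    """Goodness-of-fit statistic comparing observed counts with the baseline."""
    total = sum(leaderboard_counts.values())
    stat = 0.0
    for direction, share in baseline_share.items():
        expected = total * share
        stat += (leaderboard_counts[direction] - expected) ** 2 / expected
    return stat


# Invented numbers: long-run share of winds by quadrant, and the number of
# leaderboard times set with the wind in each quadrant.
baseline_share = {"N": 0.2, "E": 0.2, "S": 0.2, "W": 0.4}
leaderboard_counts = {"N": 25, "E": 15, "S": 10, "W": 50}

advantage = wind_advantage(leaderboard_counts, baseline_share)
statistic = chi_square_statistic(leaderboard_counts, baseline_share)
```

A positive advantage means a direction appears on the leaderboard more often than the climate alone would suggest. To turn the statistic into a p-value, it would be compared with a chi-squared distribution with one fewer degree of freedom than the number of directions.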

The excellent MyWindSock website provides very good analysis for avid wind dopers. This confirms that the wind was blowing predominantly from the west for the top ten riders on the leaderboard, including the KOM, though the wind strength was generally light.

The interesting thing about this exercise is that it demonstrates a convergence between our online and our offline lives, as increasing volumes of data are uploaded from mobile sensors. A detailed analysis of each section of the million laps riders have recorded for Richmond Park could reveal many subtleties about how the wind flows across the terrain, depending on strength and direction. This could be extended across the country or globally, potentially identifying local areas where funnelling effects might make a wind turbine economically viable.

References

Jupyter notebook for calculations