Human Blood Protein Atlas

A recent report in Science announced the publication of a new human blood protein atlas, describing the disease signatures of thousands of proteins circulating in the blood. Minimally invasive protein profiling marks a step forward in the personalisation of medicine. The study employed some interesting statistical and machine learning techniques, including differential abundance analysis and logistic LASSO regression.

Blood Protein Study

The researchers’ methods included a technique called proximity extension assay (PEA), which uses highly specific antibody probes tagged with DNA strands to detect minute concentrations of proteins in blood plasma. Amplification by polymerase chain reaction (PCR) allowed 5,416 proteins to be evaluated.

A longitudinal dataset showed dramatic changes as children passed through adolescence to adulthood. The central part of the study was a cross-sectional analysis, where age, sex and BMI were identified as important explanatory factors. The signatures of 59 clinically relevant diseases, in seven classes, can be viewed interactively in The Human Protein Atlas.

Into the secretome

Rather than the hideaway of a reclusive cockney, the secretome refers to the ensemble of secreted proteins. From a data science perspective, the challenge was how to find the signatures of a wide range of diseases, based on the differential abundance of over 5,400 proteins. This was complicated by the fact that many proteins elevated by a particular disease were also found to be elevated in other diseases.

“To investigate the distinct and shared proteomics signatures across diseases, we performed differential abundance analyses. Several groups were used as controls, including healthy samples, a disease background consisting of all other diseases, and samples from the same disease class.”

From The Human Protein Atlas

The differential abundance of proteins was evaluated in normalised protein expression (NPX) units. The volcano chart above plots the p-values against the multiplicative (fold) change in NPX, both on log scales. The red values on the right were unusually high and the blue values on the left were exceptionally low.
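To make the axes concrete, here is a minimal Python sketch of how a single protein could be placed on a volcano chart. The function names and the thresholds (a two-fold change and p < 0.05) are illustrative assumptions, not values taken from the study.

```python
import math

def volcano_coordinates(fold_change, p_value):
    """Return (x, y) for a volcano chart: log2 fold change and -log10 p-value."""
    return math.log2(fold_change), -math.log10(p_value)

def classify(fold_change, p_value, min_abs_log2_fc=1.0, alpha=0.05):
    """Label a protein as 'up', 'down' or 'not significant'.

    min_abs_log2_fc and alpha are hypothetical cut-offs chosen for illustration.
    """
    x, _ = volcano_coordinates(fold_change, p_value)
    if p_value < alpha and x >= min_abs_log2_fc:
        return "up"      # would appear on the right of the chart
    if p_value < alpha and x <= -min_abs_log2_fc:
        return "down"    # would appear on the left of the chart
    return "not significant"

# A protein four times more abundant in patients, with p = 0.001
print(volcano_coordinates(4.0, 0.001))
print(classify(4.0, 0.001), classify(0.25, 0.001), classify(4.0, 0.5))
```

Taking logs on both axes spreads the points symmetrically around zero, so a halving and a doubling sit the same distance from the centre.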

The researchers used a logistic LASSO approach to identify the importance of proteins in providing a signature of each disease against its cohort. In the case of HIV above, CRTAM was the most significant explanatory factor, even though CD6 had the most extreme p-value.

How does logistic LASSO work?

A logistic model is trained on target values of one or zero, in this case representing the presence or absence of a disease. Least absolute shrinkage and selection operator (LASSO) is a version of linear regression that selects the most relevant explanatory variables using L1 regularisation. Adding the sum of the absolute values of the regression coefficients to the objective function forces the contribution of irrelevant variables towards zero as the hyper-parameter, λ, is increased. This property was particularly useful for the disease signature problem, where there were thousands of potential explanatory proteins.
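The shrinkage effect can be seen in a small sketch. The code below is my own illustration, not the researchers' code: it fits a logistic model with an L1 penalty by proximal gradient descent on synthetic data, where only the first three of 50 hypothetical "proteins" carry any signal. The function names, the data and the values of λ are all assumptions.

```python
import numpy as np

def soft_threshold(w, t):
    """Shrink each coefficient towards zero by t, setting small ones to exactly zero."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_logistic_lasso(X, y, lam, n_iter=2000):
    """Minimise mean logistic loss + lam * sum(|w|); the intercept b is not penalised."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    # Step size 1/L, where L is an upper bound on the curvature of the logistic loss
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-np.clip(X @ w + b, -30, 30)))
        grad_w = X.T @ (prob - y) / n
        grad_b = np.mean(prob - y)
        # Gradient step on the loss, then the L1 shrinkage step
        w = soft_threshold(w - step * grad_w, step * lam)
        b -= step * grad_b
    return w, b

# Synthetic data: 200 samples, 50 features, only the first three are informative
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true_w = np.zeros(50)
true_w[:3] = [3.0, -3.0, 2.0]
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(float)

w_sparse, _ = fit_logistic_lasso(X, y, lam=0.05)
w_zero, _ = fit_logistic_lasso(X, y, lam=10.0)
print("non-zero coefficients at lam=0.05:", np.count_nonzero(w_sparse))
print("non-zero coefficients at lam=10:", np.count_nonzero(w_zero))
```

With a moderate λ most of the irrelevant coefficients are driven to exactly zero, while a very large λ removes every variable, including the useful ones.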

The tricky aspect of LASSO is tuning the hyper-parameter, λ. You want it to be high enough to eliminate irrelevant variables, but not so high that it discounts the useful explanatory features. In the protein study, this was addressed using cross-validation: randomly splitting the data into 70:30 training and test sets, then rerunning the regression for a range of λ values. The quality of a model can be assessed in terms of both its accuracy and its required number of inputs, using criteria such as the Akaike information criterion or Bayesian information criterion, which favour parsimony. Repeating the randomisation 100 times, the researchers could home in on an optimal value of λ. The regression coefficients of the resulting model could then be used to rank the importance of the relevant proteins, as shown on the right-hand side of the panel above.
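The tuning loop described above can be sketched along the following lines. This is an assumption-laden illustration rather than the study's pipeline: it uses synthetic data, 20 random 70:30 splits instead of 100, and it scores each λ by mean held-out log loss, a simpler stand-in for the information criteria mentioned above. A compact proximal gradient fit is included so that the block runs on its own.

```python
import numpy as np

def fit_l1_logistic(X, y, lam, n_iter=300):
    """Proximal gradient fit of logistic regression with an L1 penalty on w."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-np.clip(X @ w + b, -30, 30)))
        w = w - step * (X.T @ (prob - y) / n)
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
        b -= step * np.mean(prob - y)
    return w, b

def log_loss(X, y, w, b):
    """Mean negative log-likelihood of the labels under the fitted model."""
    prob = 1.0 / (1.0 + np.exp(-np.clip(X @ w + b, -30, 30)))
    prob = np.clip(prob, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(prob) + (1 - y) * np.log(1 - prob))

def tune_lambda(X, y, lambdas, n_repeats=20, train_frac=0.7, seed=0):
    """Average the held-out log loss of each lambda over repeated random splits."""
    rng = np.random.default_rng(seed)
    n_train = int(train_frac * len(y))
    scores = np.zeros(len(lambdas))
    for _ in range(n_repeats):
        idx = rng.permutation(len(y))
        train, test = idx[:n_train], idx[n_train:]
        for k, lam in enumerate(lambdas):
            w, b = fit_l1_logistic(X[train], y[train], lam)
            scores[k] += log_loss(X[test], y[test], w, b)
    scores /= n_repeats
    return float(lambdas[int(np.argmin(scores))]), scores

# Synthetic data: 150 samples, 20 features, only the first two are informative
rng = np.random.default_rng(1)
X = rng.standard_normal((150, 20))
y = (rng.random(150) < 1.0 / (1.0 + np.exp(-(2 * X[:, 0] - 2 * X[:, 1])))).astype(float)

lambdas = np.array([0.001, 0.01, 0.03, 0.1, 1.0])
best_lambda, mean_scores = tune_lambda(X, y, lambdas)
print("best lambda:", best_lambda)
print("mean held-out log loss:", np.round(mean_scores, 3))
```

Averaging over many random splits smooths out the luck of any single partition, which is why the randomisation is repeated before a value of λ is chosen.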

Personalised health

The potential for a cheap, annual blood test to screen the whole population is immense. Proteomics adds to the arsenal of resources available to help people stay healthy. Early indications of diseases like cancer can be critical in initiating treatment. There is plenty of room to broaden the scope beyond the current 59 diseases, to include rarer conditions, such as Motor Neurone Disease, which has impacted some top sportsmen. It would be extremely helpful to find proteins related to the apparent epidemic of mental health issues, which are hard to define and lack objective, quantitative diagnostic criteria.

PEAQ Performance

If you are a cyclist, athlete, dancer or exerciser struggling to reach your full potential, you might have a mismatch between your training and what you are eating. Persistently running an energy deficit can have an adverse impact on your health and performance, sometimes leading to a condition called Relative Energy Deficiency in Sport (REDs). Optimal training adaptations and peak achievements rely on consistently fuelling for the work required.

I have created an app that generates a score based on a short Personal Energy Availability Questionnaire (PEAQ) designed to identify people at risk.

Personal Energy Availability Questionnaire (PEAQ)

The PEAQ is based on research published in BMJ Open Sport & Exercise Medicine, exploring the relationship between a REDs score derived from the questionnaire and quantified clinical consequences of low energy availability. A similar approach has been used in other research.

The app automates the scoring process and generates a free downloadable report that includes graphics and an interpretation of your result. It takes a few minutes to fill in your answers and the process is anonymous.

The report breaks down the overall score into three health categories. Physical health is based on body mass index (BMI) and injuries. Physiological factors include hormones, sleep and nutrition. Psychological wellbeing relates to habits and anxiety.

Relative energy deficiency

REDs is not confined to top athletes. It can occur in men and women of any age, at all levels of performance, across a spectrum of activities, including sports, exercise and dance.

Relative energy deficits can result from deliberate under-fuelling, particularly in activities where low body weight confers an aesthetic or performance advantage (dance, cycling, climbing, running etc.). Relative energy deficits can also arise, sometimes unintentionally, as a result of stepping up one’s training load without a corresponding increase in energy intake.

Health and performance risks

For evolutionary reasons, your body prioritises movement in the allocation of its energy budget. Energy availability is a measure of the amount of energy left over for day-to-day physiological processes: breathing, digestion, repair, brain function etc. In an energy deficit, your body switches off inessential processes, such as reproduction. Poor bone health is one of the consequences of a reduction in sex steroid hormones. Other effects of low energy availability include fatigue, disrupted sleep and digestive problems.
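In the sports science literature, energy availability is usually expressed as dietary energy intake minus exercise energy expenditure, divided by fat-free mass. The short sketch below illustrates that arithmetic. The function name and the example figures are hypothetical, and this is not how the PEAQ score is calculated, since the questionnaire does not ask you to measure any of these quantities.

```python
def energy_availability(intake_kcal, exercise_kcal, fat_free_mass_kg):
    """Energy availability in kcal per kg of fat-free mass per day."""
    if fat_free_mass_kg <= 0:
        raise ValueError("fat-free mass must be positive")
    return (intake_kcal - exercise_kcal) / fat_free_mass_kg

# Hypothetical day: 2,500 kcal eaten, 800 kcal burned in training, 55 kg fat-free mass
example = energy_availability(2500, 800, 55)
print(round(example, 1))  # 30.9
```

Stepping up training without eating more raises the second term and lowers the result, which is how an unintentional deficit can arise.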

If you are active, low energy availability reduces your ability to perform high-quality training or exercise and limits your body’s ability to deliver the desired positive adaptations, such as muscle strength and endurance capacity.

Take a PEAQ

Please take advantage of the PEAQ. If you have worries or concerns about your results, Dr Nicky Keay offers personalised health advisory appointments. You can find valuable resources at BASEM.

Technical points

I built this educational health app in Python. It is hosted on the Streamlit Community Cloud. The code is on my GitHub page.

References

Mountjoy M, Ackerman KE, Bailey DM et al. 2023 International Olympic Committee’s (IOC) consensus statement on Relative Energy Deficiency in Sport (REDs). British Journal of Sports Medicine 2023;57:1073-1098
Keay N Hormones, Health and Human Potential: A guide to understanding your hormones to optimise your health and performance, Sequoia books 2022
Keay N, Francis G, AusDancersOverseas. Indicators and correlates of low energy availability in male and female dancers. BMJ Open Sport & Exercise Medicine 2020
Nicolas J, Grafenuer S. Investigating pre-professional dancer health status and preventative health knowledge Front. Nutr. Sec. Sport and Exercise Nutrition. 2023 (10)
Keay N, Francis G. Longitudinal investigation of the range of adaptive responses of the female hormone network in pre-professional dancers in training. March 2025. ResearchGate. DOI: 10.13140/RG.2.2.30046.34880
Keay N. Current views on relative energy deficiency in sport (REDs). Focus Issue 6: Eating disorders. Cutting Edge Psychiatry in Practice CEPiP. 2024.1.98-102
Assessment of Relative Energy Deficiency in Sport, Malnutrition Prevalence in Female Endurance Runners by Energy Availability Questionnaire, Bioelectrical Impedance Analysis and Relationship with Ovulation status. Clinical Nutrition Open Science 2025S.
Sharp S, Keay N, Slee A. Body composition, malnutrition, and ovulation status as RED-S risk assessors in female endurance athletes, Clinical Nutrition ESPEN 2023, 58 :720-721
Keay N, Craghill E, Francis G Female Football Specific Energy Availability Questionnaire and Menstrual Cycle Hormone Monitoring. Sports Injr Med 2022; 6: 177
Nicola Keay, Martin Lanfear, Gavin Francis. Clinical application of monitoring indicators of female dancer health, including application of artificial intelligence in female hormone networks. International Journal of Sports Medicine and Rehabilitation, 2022; 5:24.
Nicola Keay, Martin Lanfear, Gavin Francis. Clinical application of interactive monitoring of indicators of health in professional dancers J Forensic Biomech, 2022, 12 (5) No:1000380
Keay, Francis, Hind. Low energy availability assessed by a sport-specific questionnaire and clinical interview indicative of bone health, endocrine profile and cycling performance in competitive male cyclists. BMJ Open Sport & Exercise Medicine 2018
Keay, Francis, Hind. Clinical evaluation of education relating to nutrition and skeletal loading in competitive male road cyclists at risk of relative energy deficiency in sports (RED-S): 6-month randomised controlled trial. BMJ Open Sport & Exercise Medicine 2019
Keay, Francis, Hind. Bone health risk assessment in a clinical setting: an evaluation of a new screening tool for active populations. MOJ Sports Medicine 2022;5(3):84-88. doi: 10.15406/mojsm.2022.05.00125