Original post http://blogs.warwick.ac.uk/simongates/entry/new_test_can/ 2 May 2015
A story in several UK papers including the Telegraph suggests that a test measuring telomere length can predict who will develop cancer “up to 13 years” before it appears. Some of the re-postings have (seemingly by a process of Chinese whispers) elaborated this into “A test that can predict with 100 per cent accuracy whether someone will develop cancer up to 13 years in the future has been devised by scientists” (New Zealand Herald) – which sounds pretty unlikely.
What they are talking about is this study, which analysed telomere lengths in a cohort of people, some of whom developed cancer.
It’s hard to know where to start with this. There are two levels of nonsense going on here: the media hype, which has very little to do with the results of the study, and the study itself, which seems to come to conclusions that are way beyond what the data suggest, through a combination of over-reliance on significance testing, poor methodology and wishful thinking. I’ll leave the media hype to one side, as it’s well-established that reporting of studies often bears little relation to what the study actually did; in this case, there was no “test” and no “100% accuracy”. But what about what the researchers really found out, or thought they did?
The paper makes two major claims:
1. “Age-related BTL attrition was faster in cancer cases pre-diagnosis than in cancer-free participants” (that’s verbatim from their abstract);
2. “all participants had similar age-adjusted BTL 8–14 years pre-diagnosis, followed by decelerated attrition in cancer cases resulting in longer BTL three and four years pre-diagnosis” (also verbatim from their abstract, edited to remove p-values).
They studied a cohort of 579 initially cancer-free US veterans who were followed up annually between 1999 and 2012, with blood being taken 1-4 times from each participant. About half had only one or two blood samples, so there isn’t much in the way of within-patient comparisons of telomere length over time. Telomere length was measured from these blood samples (this was some kind of average, but I’ll assume intra-individual variation isn’t important).
Figure 1 illustrates the first result:
The regression lines do look as though there is a steeper slope through the cancer group, and the interaction is “significant” (p=0.032 when unadjusted and p=0.017 adjusted) – but what is ignored in the interpretation is the enormous scatter around both of the regression lines. Without the lines on the graph you wouldn’t be able to tell whether there was any difference in the slopes. Additionally, as relatively few participants had multiple readings, it isn’t possible to compare within-patient changes in telomere length, which might be less noisy. Instead we have an analysis of average telomere length at each age, with a changing set of patients. So, on this evidence, it is hard to imagine how this could ever be a useful test for distinguishing people who will develop cancer from those who will not. The claim of a difference seems to come entirely from the “statistical significance” of the interaction.
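To make the point about scatter concrete, here is a minimal simulation sketch. Every number in it (slopes, noise level, age range, sample size) is invented for illustration and is not taken from the paper; the helper names `ols_slope`, `simulate_group` and `auc` are my own. It gives two groups genuinely different slopes, buries them in noise, and then asks how well a single measurement separates the groups.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def simulate_group(n, slope, noise_sd, rng):
    """Hypothetical BTL readings: linear decline with age plus large scatter."""
    ages = [rng.uniform(55, 90) for _ in range(n)]
    btl = [1.5 + slope * (a - 55) + rng.gauss(0, noise_sd) for a in ages]
    return ages, btl

def auc(cases, controls):
    """Probability that a random case value is lower than a random control value.
    0.5 means a single measurement cannot tell the groups apart at all."""
    wins = sum((c < k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed values: cancer group declines faster, both groups very noisy.
ages_c, btl_c = simulate_group(2000, -0.012, 0.25, rng)
ages_f, btl_f = simulate_group(2000, -0.008, 0.25, rng)

slope_c = ols_slope(ages_c, btl_c)
slope_f = ols_slope(ages_f, btl_f)
separation = auc(btl_c, btl_f)

print(f"fitted slope, cancer group:      {slope_c:.4f}")
print(f"fitted slope, cancer-free group: {slope_f:.4f}")
print(f"AUC of a single BTL reading:     {separation:.3f}")
```

Under these assumptions the fitted slopes differ in the expected direction, yet the AUC stays close to 0.5. A difference in slopes that a regression can detect is not the same thing as a measurement that can classify individuals.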
The second claim, that in people who develop cancer BTL stops declining and reaches a plateau 3-4 years pre-diagnosis, derives from their Figure 2:
Again, the claim derives from the difference between the two lines being “statistically significant” at 3-5 years pre-diagnosis, and not elsewhere. But looking at the red line, it really doesn’t look like a steady decline, followed by a plateau in the last few years. If anything, the telomere length is high in the last few years, and the “significance” is caused by particularly low values in the cancer-free group in those years. I’m not sure that this plot is showing what they think it shows; the x-axis for the cancer group is years pre-diagnosis, but for the non-cancer group it is years pre-censoring, so it seems likely that the non-cancer group will be older at each point on the x-axis. Diagnoses of cancer could happen at any time, whereas most censoring is likely to happen at or near the end of the study. If BTL declines with age, that could potentially produce this sort of effect. So I’m pretty unconvinced. The claim seems to result from looking primarily at “statistical significance” of comparisons at each time point, which seems to have trumped any sense-checking.
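The alignment problem can be sketched with a small simulation. This is an illustration under assumptions of my own, not a re-analysis: diagnosis times are taken as uniform over a 13-year follow-up, all cancer-free participants are censored at the end of the study, baseline ages run from 55 to 80, and the slope and noise values are invented. Crucially, both groups follow exactly the same BTL-versus-age relationship, so any gap between them comes purely from how the x-axis is aligned.

```python
import random
import statistics

STUDY_YEARS = 13   # follow-up ran 1999 to 2012
SLOPE = -0.01      # assumed BTL change per year of age (hypothetical)
NOISE_SD = 0.25    # assumed scatter (hypothetical)

def btl_at(age, rng):
    """Same age relationship for everyone: no cancer effect is built in."""
    return 1.5 + SLOPE * (age - 55) + rng.gauss(0, NOISE_SD)

def gap_at_lag(lag, n, rng):
    """Mean BTL of cancer cases minus cancer-free, `lag` years before
    diagnosis (cases) or before censoring (cancer-free)."""
    cancer, free = [], []
    for _ in range(n):
        # Cancer-free participant: censored at the end of the study.
        free.append(btl_at(rng.uniform(55, 80) + STUDY_YEARS - lag, rng))
        # Cancer case: diagnosed at a random time during follow-up,
        # and only observable at this lag if diagnosed late enough.
        t_dx = rng.uniform(0, STUDY_YEARS)
        if t_dx >= lag:
            cancer.append(btl_at(rng.uniform(55, 80) + t_dx - lag, rng))
    return statistics.fmean(cancer) - statistics.fmean(free)

rng = random.Random(1)  # fixed seed so the run is repeatable
gap_3 = gap_at_lag(3, 20000, rng)
gap_10 = gap_at_lag(10, 20000, rng)

print(f"cancer minus cancer-free, 3 years before end point:  {gap_3:.3f}")
print(f"cancer minus cancer-free, 10 years before end point: {gap_10:.3f}")
```

In this setup the cancer cases appear to have longer telomeres a few years before diagnosis, and the gap shrinks further back in time, simply because the cancer-free group is older at the same position on the x-axis. Whether this explains the paper's Figure 2 depends on the real distribution of diagnosis and censoring times, which I don't have.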