I spotted something about this trial on Twitter and I followed it up because I’ve been involved in trials in this area (out-of-hospital cardiac arrest; the PARAMEDIC and PARAMEDIC2 trials). It raises a few interesting design issues that are more widely relevant than just to cardiac arrest trials, so that’s why I’m writing about it now. The design and rationale paper is here.

INCEPTION is a trial comparing treatments for out-of-hospital cardiac arrest patients when they reach the emergency department. Patients are eligible if they are in cardiac arrest, have been shocked, and have had no return of spontaneous circulation after 15 minutes. At the emergency department they are randomised either to receive standard care, which is continuation of CPR, or extracorporeal CPR (ECPR), which I think involves connecting the patient to a portable heart-lung support system, while chest compressions are continued.

The main outcome is 30-day survival with CPC score 1 or 2 (i.e. survival with good neurological function), similar to other cardiac arrest trials.

And it has a sample size calculation! Of course, because everything has to. This one comes up with a sample size of 55 per group – that’s 110 in total. In the context of other recent cardiac arrest trials, that is, well, surprising; PARAMEDIC and PARAMEDIC2 had about 4,500 and 8,000 recruits respectively. Why so few in INCEPTION? The answer is that the sample size calculation is based on a true increase in survival from 8% to a massive 30% – that’s a risk ratio of 3.75. Pretty much nothing in the history of medicine has had an effect that big. It’s just not going to happen. So that’s a real problem; if you’re going to use a significance test to decide whether you believe a treatment is effective, and you’re only going to get significance 80% of the time if the real difference is a risk ratio of 3.75, you’re just setting yourself up to fail.
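To see how sensitive the sample size is to that assumed effect, here is a quick sketch of the standard normal-approximation formula for a two-sided two-proportion test (the paper doesn't say exactly which method was used, and 55 per group presumably includes a continuity correction or an allowance for attrition, so treat this as ballpark only):

```python
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-proportion z-test
    (textbook normal approximation, no continuity correction)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return num / (p2 - p1) ** 2

# The trial's assumption: 8% survival with standard care vs 30% with ECPR
print(round(n_per_group(0.08, 0.30)))   # -> 49 per group, same ballpark as the quoted 55

# A still-optimistic but less extreme effect, 8% vs 12%
print(round(n_per_group(0.08, 0.12)))   # -> 882 per group, i.e. ~1,800 patients in total
```

The point the numbers make is that the tiny sample size is entirely driven by the enormous assumed effect: shrink the assumed benefit to something merely optimistic and the required trial balloons to PARAMEDIC-like proportions.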

The trial’s stated aims are:

“This trial aims to determine whether ECPR should be considered as a standard of care in patients with refractory OHCA.”

and

“…to determine the effect of ECPR on survival rate and neurological outcome, and to evaluate its feasibility and cost-effectiveness.”

I just don’t see that it’s going to be able to do that.

I’m really not saying that I don’t think they should do this trial, or it’s “unethical” or anything like that. Quite the opposite, in fact – getting high quality randomised evidence is really valuable and if 110 patients is the most that can reasonably be done, that’s 110 patients’ worth of better evidence than we have at the moment, which has to be a good thing, especially if (as I’m sure is the case here) randomising the patients and implementing the intervention are pretty challenging.

What I’m objecting to is the spurious sample size calculation based on an implausibly large treatment effect, the unrealistic aims of the trial given this sample size, and the almost inevitable conclusion that the treatment “doesn’t work” when the results turn out to be non-significant. It doesn’t make sense to expect to be able to determine clinical and cost effectiveness with such a small number. It would be far more sensible, in my view, to treat this more like an early phase study, where the goal is to seek some randomised evidence that the intervention is doing something useful and not causing any safety issues, as an indicator of whether it is worth pursuing more randomised trials in the future.

And there’s more. The trial also includes an adaptive element. Here’s the description from the paper:

An interim analysis will be performed after the CPC score has been established for the 40th patient at 30 days after the OHCA. In this interim analysis, the percentage of survival with good neurological outcome will be calculated for both treatment groups. […] These percentages will be used to estimate the chance of a type II error. In case of an imminent type II error, a new sample size calculation will be performed which may result in an advice to increase the sample size.

There’s not too much information there about how this is going to be done, but one thing that they probably need to know is that you can’t actually estimate the chance of a Type II error from the results. And “an imminent Type II error”? With data for 40 out of 110 patients? How’s that going to work? I should say that I haven’t read the study protocol, so I’m not sure if something has been mangled in the transition from protocol to paper here.

But I guess the intention here is good. It makes some sense that if the results are looking as though they will not be conclusive you might want to have the option of recruiting some more people to try to get more clarity.
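The quoted passage is too vague to know what's planned, but the nearest standard tool is “conditional power”: given the interim data, simulate the rest of the trial under some assumed true event rates and see how often the final test comes out significant. Here is a hypothetical sketch (the interim counts and rates are invented for illustration; the INCEPTION paper does not say this is their method):

```python
import random
from statistics import NormalDist

def conditional_power(obs_c, n_c, obs_t, n_t, n_final,
                      p_ctrl, p_ecpr, alpha=0.05, sims=20000, seed=1):
    """Monte Carlo conditional power for a two-proportion z-test.
    obs_c/n_c, obs_t/n_t: interim good outcomes / patients per group.
    n_final: planned final patients per group.
    p_ctrl, p_ecpr: *assumed* true rates for the remaining patients."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rem_c, rem_t = n_final - n_c, n_final - n_t
    hits = 0
    for _ in range(sims):
        # Simulate the not-yet-recruited patients under the assumed rates
        x_c = obs_c + sum(rng.random() < p_ctrl for _ in range(rem_c))
        x_t = obs_t + sum(rng.random() < p_ecpr for _ in range(rem_t))
        p1, p2 = x_c / n_final, x_t / n_final
        pbar = (x_c + x_t) / (2 * n_final)
        se = (2 * pbar * (1 - pbar) / n_final) ** 0.5
        if se > 0 and abs(p2 - p1) / se > z_crit:
            hits += 1
    return hits / sims

# Invented interim: 20 per group analysed, 2/20 vs 4/20 good outcomes.
# Chance of final significance under a modest assumed effect...
print(conditional_power(2, 20, 4, 20, 55, 0.08, 0.12))  # small
# ...versus under the design's optimistic 8% vs 30% assumption
print(conditional_power(2, 20, 4, 20, 55, 0.08, 0.30))  # much larger
```

Which illustrates the underlying problem: the answer depends almost entirely on the effect size you choose to assume for the unobserved patients, not on anything the 40 interim patients can tell you. That's why “estimating the chance of a Type II error” from interim results isn't really a thing.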

So my main point was about being realistic about what a trial is able to do. There is often pressure on researchers to propose trials that resolve a clinical question, and trials are often sold as “definitive.” But often that isn’t going to be possible – the trial you’d like to do might just be too big or too expensive. We need to be realistic about this, and see it as OK to fund and do trials that can’t by themselves completely resolve a clinical question, but can provide good quality evidence that at least starts to inform the debate. It just seems a bit more honest.