Letters to the Editor
Statistics
P-values need confidence limits
From Dr F. Leach, MRPharmS, and Dr B. Faragher, FSS
Scott Pegler and Jonathan Underhill are to be congratulated on their
lively and informative article (PJ, 5 March, p271) on the evaluation
of medicines-related promotional material. Although we endorse
their emphasis on the important distinction between statistical and clinical
significance, we are uneasy at their statement that “statistical
significance simply means that the results were unlikely to have occurred
by chance”. We hope that the following points will clarify our
reservations.
The statistical significance of trial data is commonly assessed by reference
to a P-value, determined using an appropriate statistical hypothesis
test. The P-value is commonly defined as the probability of observing
the study data by chance. Although technically correct, this definition
is imprecise. A more informative definition, in the context of clinical
trials, is that the P-value is the probability that a difference equal
to or greater than that observed in the study could have occurred if
the null hypothesis (that there is actually no difference between the
treatments) is true. A hypothetical example might help to clarify the
subtle but important distinction between the two definitions.
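In symbols, the more precise definition might be written as follows. This is a sketch only: the notation is ours, not Pegler and Underhill's, and it assumes a two-sided test, where D denotes the difference that a repetition of the trial would yield, d_obs the difference actually observed and H_0 the null hypothesis.

```latex
P = \Pr\left( |D| \ge |d_{\text{obs}}| \;\middle|\; H_0 \right)
```

Note that the P-value is conditional on the null hypothesis being true; it is not the probability that the null hypothesis is true given the data.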
A clinical trial is conducted to compare the effects of two drugs in
reducing systolic blood pressures (SBP) in hypertensive patients. The
mean difference in the effects of the two drugs on SBP is found to be
12mmHg (P<0.05). This allows us to conclude that the probability of
observing a difference of 12mmHg or greater if the two drugs are, in
fact, identical in effect (ie, if the null hypothesis is true) is less
than 1 in 20. We declare, therefore, that the observed difference is
too unlikely under the null hypothesis, so we reject this and accept
the alternative hypothesis of a real difference in the effects of the
two drugs (ie, we declare that the outcome of the trial is statistically
significant). This does not, of course, prove that there really is a
difference in effects; it merely limits the uncertainty surrounding the
result. If, as should be the case, the P-value is quoted exactly, the
situation is clarified even further. Suppose that, in our trial, P=0.02;
as before, we declare the outcome to be statistically significant but
we can be aware that there is a 2 per cent chance that the observed (or
a larger) difference could have occurred if the drugs are equipotent.
Conversely, if P>0.05, convention would not allow us to reject the
null hypothesis (or, more pertinently, to accept the alternative hypothesis)
but this does not necessarily justify a conclusion that the effects of
the two drugs on SBP are equal. The difference between them could be
clinically significant; our trial might have been insufficiently powerful
to detect this. Whatever the outcome of our hypothetical trial, reporting
the confidence interval around our estimate of 12mmHg would be far more
informative than reliance on the P-value alone.
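The arithmetic behind the hypothetical trial can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an analysis of any real trial: it uses a normal approximation, and the standard errors (5.16mmHg and 8mmHg) are invented purely so that the first case gives P close to 0.02 and the second gives P>0.05. The function name `p_value_and_ci` is likewise hypothetical.

```python
from statistics import NormalDist


def p_value_and_ci(diff, se, level=0.95):
    """Two-sided P-value and confidence interval for an observed difference.

    Assumes the estimate is approximately normally distributed, which is
    a simplification made for this sketch.
    """
    std_normal = NormalDist()
    z = diff / se
    # Probability of a difference at least this large (in either direction)
    # if the null hypothesis of no difference is true.
    p = 2 * (1 - std_normal.cdf(abs(z)))
    # Half-width of the confidence interval at the requested level.
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    margin = z_crit * se
    return p, (diff - margin, diff + margin)


# Hypothetical standard errors, chosen only for illustration.
p_sig, ci_sig = p_value_and_ci(diff=12.0, se=5.16)
p_ns, ci_ns = p_value_and_ci(diff=12.0, se=8.0)

print(f"SE 5.16: P = {p_sig:.3f}, 95% CI {ci_sig[0]:.1f} to {ci_sig[1]:.1f} mmHg")
print(f"SE 8.00: P = {p_ns:.3f}, 95% CI {ci_ns[0]:.1f} to {ci_ns[1]:.1f} mmHg")
```

With the first standard error the interval runs from roughly 2 to 22mmHg: the result is statistically significant, yet the true difference could be clinically trivial or substantial. With the second, the interval runs from roughly -4 to 28mmHg: the result is not statistically significant, but a clinically important difference has by no means been excluded.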
A rigidly dichotomous interpretation of the outcome of significance tests,
whereby results are classified as “positive” or “negative” solely
on the basis of the main P-value, remains one of the most common errors
in the interpretation of trial data [1,2]. Our own experience indicates
that, armed with the imprecise definition quoted by Pegler and Underhill,
many clinicians and pharmacists fail to deduce correctly whether or not
a trial result is statistically significant. Such confusion is much rarer
when a properly precise definition is used.
Since the P-value still enjoys widespread use in the promotional material
of pharmaceutical companies, it is important that the relevance and limitations
of statistical significance are appreciated, that loose definitions of
its meaning are avoided and that, ideally, P-values are accompanied by
confidence limits.
Frank Leach
Medicines Information Pharmacist
North West Medicines Information Centre
Brian Faragher
Senior Lecturer in Statistics and Research Methods
Organisational Psychology Group
University of Manchester
References
1. Sterne JAC, Smith GD. Sifting the evidence — what’s wrong
with significance tests? BMJ 2001;322:226–31.
2. Leach F, Faragher B. Statisticians are useful to know. The Pharmaceutical
Journal 2005;274:48.