Is it bad to write for hard-right outlets?

There is no doubt that the Overton window has shifted to the right during the last decade or two.  It is now common to hear people saying things that, even in 2010, would have been thought to be frankly fascistic.

I recall a conversation with the great biophysicist, Sir Bernard Katz, in 1992. He had come to UCL in 1936 to escape from the Nazi regime in Leipzig.  When I suggested to him that he must have been very pleased about the reunification of Germany, he pulled a long face and said “hmm, let’s wait to see what crawls out from under stones”.  He was, as so often, right. The radical right party, Alternative für Deutschland, has gained strength, especially in the former East Germany.

It isn’t long since we used to laugh at the USA for the far-right tendencies of Fox News.  Now we have its direct equivalent in GB News and Talk TV. Neither carries much advertising.  GB News lost £42 million last year.  In an evening of watching GB News, the only advertisement that I heard was for boxt.co.uk who sponsored the weather forecast. So, who is paying for them?  It seems to be much the same people who are paying for other far-right sites like Spiked Online and for slightly more subtle organisations that have sprung up to push hard-right views.  These include UnHerd, Quillette and The Critic. Mostly it’s super-rich business people whose wealth allows them to push for laws that make them even more wealthy while pretending to be on the side of the people against the “elites”.  In the case of Talk TV it’s Rupert Murdoch.  In the case of GB News, and UnHerd, and the Free Speech Union, it’s Sir Paul Marshall.

Paul Marshall (read more about him) is a very wealthy hedge fund manager – one of the elite group of wealthy people who find it convenient to pretend that they are on the 'side of the people'.  A curious characteristic of the hard right is that, for them, a sufficient condition for believing something seems to be that it isn't true.  In order to join their cult you must be against vaccines, against lockdowns, against climate change mitigations, against electric cars, against science, against universities, against the BBC, against any sort of regulation.  It is all weirdly contrarian.  What they claim to be for is free speech, though naturally that's more important for people they agree with.

There are some quite ingenious people behind the hard-right's attempts to take power. The attempted coup was obvious when Trump urged his supporters to storm Congress on January 6th, 2021, and it's obvious in his rhetoric in 2024.  In the UK it is more subtle, but equally dangerous.  The idea seems to be to get people indignant about things like climate change by producing a non-stop deluge of misinformation.  The fact that 99.9 percent of scientists agree that climate change is a danger to the future of our planet means nothing to them – they seem to regard it as proof that there is a conspiracy by the "elites" to oppress the people.

Spreading conspiracies is a useful tool for the far-right.  They vary from the slightly plausible to the batshit crazy (Jewish space lasers, anyone?). In the words of Steve Bannon, "The real opposition is the media. And the way to deal with them is to flood the zone with shit." In other words, provide so much misinformation that people get disorientated.

Some good fact-checkers have arisen in an attempt to counter the flood of misinformation.  Needless to say, the hard-right are against them.  BBC Verify does a good job in telling us what's true and what isn't. And the BBC appointed its first specialist disinformation and social media correspondent, Marianna Spring.  She's done a terrific job in investigating conspiracy theorists.  She's talked to some of the more extreme people – those who claim that the Covid virus doesn't exist and that the pandemic was a hoax, and that the Manchester Arena bombing was staged.  Needless to say, she's incurred the wrath not only of Twitter trolls (many thousands of abusive messages and death threats), but also of UnHerd, which pretends to be moderate.

Simon Cottee published in UnHerd, “The hypocrisy of the BBC’s misinformation war: Marianna Spring is as dogmatic as her trolls”.  (Cottee is a senior lecturer in criminology at the University of Kent.) It includes gems like:

. . . an entire industry of journalists, academics and experts has arisen to hunt down, track and police misinformation. In some ways, this industry is just as creepy and alarming as the conspiracy culture it gorges on, mirroring its familiar pathologies of distortion and hyperbole.

So the people who warn about dangerous conspiracies are just as bad as those who spread them?

Conspiracy culture, for those who are part of it, offers a profound spiritual enlargement of the world, imbuing it with hidden meanings, mysteries and secrets. Conspiracies can be engaging and fun and thrillingly transgressive.

So that’s all right, then.

There is a lot of information about the organisations and people involved in the recent burgeoning of hard-right sites in a fascinating paper by Huw C. Davies and Sheena E. MacRae: An anatomy of the British war on woke (https://doi.org/10.1177/03063968231164905). One name that keeps cropping up is Toby Young.  You can read more about him on his Wikipedia page. Despite the fact that he’s edited his own page 282 times, it’s less than flattering.  He ran Lockdown Sceptics, and now runs The Daily Sceptic and the Free Speech Union.  His character can be judged also from this selection of his puerile tweets from 2009 (when he was 45 years old, not 15).

A dilemma

Outlets like UnHerd, The Critic and Quillette are somewhat less extreme than Breitbart and Spiked.  They are designed to seem reasonable and to tolerate, even to invite, some progressive opinions.  I’d argue that this makes them even more dangerous than the more extreme sites.  If you want to know where their political sympathies lie, look at comments on their articles (but hold your nose before doing so).   At a time when we have the most right-wing government in my lifetime, their number one enemy is that very government. They are fiercely critical of any Conservative who isn’t a member of the group of far-right insurgents whom John Major called “bastards” in 1993, and whom an ally of David Cameron, 20 years later, called swivel-eyed loons.

The stimulus to write this piece came when I noticed that some of my heroes had been writing for them.  That has created a dilemma for me, so I’ll put both sides of it.  First, though, the problem.

First, I noticed that Sense About Science is debating at the Free Speech Union. Sense About Science is an organisation that advocates good science and explains it in an accessible way.  It has written good pamphlets on a lot of topics, though the fact that it's taken money from industry inevitably means that it will be regarded with a bit of suspicion. Being involved with Toby Young's blatantly anti-science Free Speech Union can surely only add to those suspicions.

Then I got another shock when I saw that Alan Sokal was also involved with the Free Speech Union.  I loved Sokal's book, Intellectual Impostures, in which he and Jean Bricmont talk about his spoof paper which demolished the absurd pretentiousness of post-modernist philosophers.  The cover of his book appears on the masthead of my blog, where I have posted about his work.  I was therefore very surprised when I found that he, a physicist, had spoken at Toby Young's anti-science Free Speech Union.  It's on YouTube.

I was even more astonished when I found that Margaret McCartney and Deborah Cohen were publishing in UnHerd.  McCartney is a GP in Glasgow and a prolific journalist. Cohen is a first-rate investigative medical journalist. They are both people whose work I admire hugely.  Why on earth are such people giving succour to the hard right?

Declaration of interest. I count Sokal, McCartney and Cohen as friends. I have huge respect for all of them. McCartney, Cohen (and I) have all received the Health Sense UK award. We were all what was described as sceptics before that word was purloined by the forces of anti-science.

My dilemma

It could certainly be argued that I'm wrong to be upset that people with whom I agree on almost everything are engaging with the hard right.  Talking is good and they are taking their messages to people who will often not agree with them, not preaching to those who already agree.

On the other hand, they are attracting readers to organisations that are far to the right of anything I’ve known in my lifetime. Organisations that, if they got their way, would result, I believe, in an authoritarian government which would have much in common with fascism.

One possible explanation lies in the cleverness with which the hard-right has used wedge issues to divide people.  If you put the word ‘transgender’ into UnHerd’s search box, you get 466 hits. It’s a topic that is an obsession of all the new hard-right sites. They bang on about it incessantly.  That seems odd because only 0.5 percent of the population are affected by it.  It’s also odd because the same people who would, at other times, be saying that a woman’s place is in the kitchen, are now seeking to appear as champions of women’s rights.

I suspect that this is a clever calculation on the part of hard-right outlets, which are generally opposed to science.  It’s designed to win over rationalists by asserting again, and again, and again, that biological sex can’t change. Of course it can’t.  I don’t need people like Toby Young to tell me that. Most people nevertheless think that people with gender dysphoria should be treated with kindness. I have given my opinions on the transgender question already.  It isn’t that hard.  Yet the heat that is generated has allowed otherwise reasonable people to be sucked into the orbit of the hard-right. 

That is tragic.

Today, in desperation, we sent a letter to our MP. Usually such letters get a boilerplate response, dictated by Conservative Central Office. We aren't hopeful about changing the mind of our MP. Before he was elected he ran a shockingly misogynist web site, now deleted. But it's essential to do what one can to defend democracy while we still have some.

Dear Mike Penning

I doubt whether this will be the only letter you’ll get along these lines.

1. The Conservative party used to be the party of business. Brexit has ruined many small businesses and impoverished the UK. That's not just my opinion, but that of the OBR.  And it hasn't been the will of the people except for a few weeks around the time of the referendum. The vote was bought by Russian (and far-right US) money. Putin was a strong advocate of Brexit.  He got his way.

2. The Conservative party used to be the party of law and order. Now, it seems, it thinks it’s enough to say sorry, let’s move on. It has become the party of law-breaking.

3. The Rwanda scheme is intolerable. Government ministers appear regularly on TV and radio referring to people arriving illegally.  It is NOT illegal to ask for asylum however you arrive in the country. Ministers must know this, so they are lying to the public.  This should not happen. It breaks the ministerial code and in the recent past it would have led to resignation. That honourable tradition is ignored by the Conservative party.

4. As a backbencher I hope that you are as appalled as we are that there will be no debate in parliament about a matter as controversial as the Rwanda proposals.  That is more what one might expect in a dictatorship than in a democracy. The Rwanda proposals cost a fortune, are probably illegal and they won’t work anyway.

5.  Ministers repeatedly claim that the UK's record for taking refugees is world-beating.  You yourself made a similar claim in your recent letter to us. Surely you must know that this is simply not true.  Even the famous Kindertransport in WW2 refused to take the parents of the children who arrived in the UK – they were left to be murdered by the Nazis.  More recently, Germany accepted a million Syrians. In comparison, we took only a handful.

6. Most recently, the Ukraine war has again shown the UK in a bad light. Poland has accepted millions of people fleeing from Putin's war.  We have accepted only a handful.  Priti Patel's scheme, Homes for Ukraine, verges on being a sham.  We signed up for it as soon as it opened, but nothing has happened. We are trying to find a refugee to sponsor. Apparently thousands have asked, but the government has done nothing to help. We are among the 100,000 British people who wanted to do something to help, only to find ourselves thwarted by the government.

7. Although sanctions were (very belatedly) imposed on some uber-wealthy Russians when Putin invaded Ukraine, they were given plenty of time to move their assets before the sanctions were enforced.  Was the government concerned that the large contributions that Russians made to Conservative party funds might dry up?

In summary, the Conservative party now bears no resemblance to the Conservative party of even 10 years ago.  It has morphed into a far-right populist party with scant regard for honesty, or even democracy itself.  More like Orban's Hungary than the country we were born in. We are sad and ashamed that the UK is laughed at and pitied round the world (do you read foreign newspapers? – their view of Johnson's government is chastening).

We hope this government falls before it destroys totally the England in which we were born.

David & Margaret Colquhoun

This is a transcript of the talk that I gave to the RIOT science club on 1st October 2020. The video of the talk is on YouTube. The transcript was very kindly made by Chris F Carroll, but I have modified it a bit here to increase clarity. Links to the original talk appear throughout.

This is the original video of the talk.

My title slide is a picture of UCL’s front quad, taken on the day that it was the starting point for the second huge march that attempted to stop the Iraq war. That’s a good example of the folly of believing things that aren’t true.

“Today I speak to you of war. A war that has pitted statistician against statistician for nearly 100 years. A mathematical conflict that has recently come to the attention of the normal people and these normal people look on in fear, in horror, but mostly in confusion because they have no idea why we’re fighting.”

Kristin Lennox (Director of Statistical Consulting, Lawrence Livermore National Laboratory)

That sums up a lot of what’s been going on. The problem is that there is near unanimity among statisticians that p values don’t tell you what you need to know but statisticians themselves haven’t been able to agree on a better way of doing things.

This talk is about the probability that if we claim to have made a discovery we’ll be wrong. This is what people very frequently want to know. And that is not the p value. You want to know the probability that you’ll make a fool of yourself by claiming that an effect is real when in fact it’s nothing but chance.

Just to be clear, what I'm talking about is how you interpret the results of a single unbiased experiment. Unbiased in the sense that the experiment is randomized, and all the assumptions made in the analysis are exactly true. Of course in real life false positives can arise in any number of other ways: faults in the randomization and blinding, incorrect assumptions in the analysis, multiple comparisons, p-hacking and so on, and all of these things are going to make the risk of false positives even worse. So in a sense what I'm talking about is your minimum risk of making a false positive even if everything else were perfect.

The conclusion of this talk will be:

If you observe a p value close to 0.05 and conclude that you’ve discovered something, then the chance that you’ll be wrong is not 5%, but is somewhere between 20% and 30% depending on the exact assumptions you make. If the hypothesis was an implausible one to start with, the false positive risk will be much higher.

There’s nothing new about this at all. This was written by a psychologist in 1966.

The major point of this paper is that the test of significance does not provide the information concerning phenomena characteristically attributed to it, and that a great deal of mischief has been associated with its use.

Bakan, D. (1966) Psychological Bulletin, 66 (6), 423–437

Bakan went on to say this is already well known, but if so it’s certainly not well known, even today, by many journal editors or indeed many users.

The p value

Let’s start by defining the p value. An awful lot of people can’t do this but even if you can recite it, it’s surprisingly difficult to interpret it.

I’ll consider it in the context of comparing two independent samples to make it a bit more concrete. So the p value is defined thus:

If there were actually no effect – for example if the true means of the two samples were equal, so the difference was zero – then the probability of observing a value for the difference between means which is equal to or greater than that actually observed is called the p value.

Now there are at least five things that are dodgy with that, when you think about it. It sounds very plausible but it's not.

  1. "If there were actually no effect…": first of all, this implies that the denominator for the probability is the number of cases in which there is no effect, and this is not known.
  2. “… or greater than…” : why on earth should we be interested in values that haven’t been observed? We know what the effect size that was observed was, so why should we be interested in values that are greater than that which haven’t been observed?
  3. It doesn't compare the hypothesis of no effect with anything else. This is put well by Sellke et al. (2001): "knowing that the data are rare when there is no true difference [that's what the p value tells you] is of little use unless one determines whether or not they are also rare when there is a true difference". In order to understand things properly, you've got to have not only the null hypothesis but also an alternative hypothesis.
  4. Since the definition assumes that the null hypothesis is true, it’s obvious that it can’t tell us about the probability that the null hypothesis is true.
  5. The definition invites users to make the error of the transposed conditional. That sounds a bit fancy but it’s very easy to say what it is.

  • The probability that you have four legs given that you're a cow is high, but the probability that you're a cow given that you've got four legs is quite low, because there are many animals that have four legs that aren't cows.
  • Take a legal example. The probability of getting the evidence given that you’re guilty may be known. (It often isn’t of course — but that’s the sort of thing you can hope to get). But it’s not what you want. What you want is the probability that you’re guilty given the evidence.
  • The probability you’re catholic given that you’re the pope is probably very high, but the probability you’re a pope given that you’re a catholic is very low.

So now to the nub of the matter.

  • The probability of the observations, given that the null hypothesis is true, is the p value. But it's not what you want. What you want is the probability that the null hypothesis is true given the observations.

The first statement is a deductive process; the second process is inductive and that’s where the problems lie. These probabilities can be hugely different and transposing the conditional simply doesn’t work.

The False Positive Risk

The false positive risk avoids these problems. Define the false positive risk as follows.

If you declare a result to be “significant” based on a p value after doing a single unbiased experiment, the False Positive Risk is the probability that your result is in fact a false positive.

That, I maintain, is what you need to know. The problem is that in order to get it, you need Bayes’ theorem and as soon as that’s mentioned, contention immediately follows.

Bayes’ theorem

Suppose we call the null-hypothesis H0, and the alternative hypothesis H1. For example, H0 can be that the true effect size is zero and H1 can be the hypothesis that there's a real effect, not just chance. Bayes' theorem states that the odds on H1 being true, rather than H0, after you've done the experiment are equal to the likelihood ratio times the odds on there being a real effect before the experiment.
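In symbols, the relation just stated (the equation on the original slide isn't reproduced here) is:

P(H1 | data) / P(H0 | data)  =  [ P(data | H1) / P(data | H0) ]  ×  [ P(H1) / P(H0) ]

that is, posterior odds = likelihood ratio × prior odds.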

In general we would want a Bayes’ factor here, rather than the likelihood ratio, but under my assumptions we can use the likelihood ratio, which is a much simpler thing [explanation here].

The likelihood ratio represents the evidence supplied by the experiment. It’s what converts the prior odds to the posterior odds, in the language of Bayes’ theorem. The likelihood ratio is a purely deductive quantity and therefore uncontentious. It’s the probability of the observations if there’s a real effect divided by the probability of the observations if there’s no effect.

Notice a simplification you can make: if the prior odds equal 1, then the posterior odds are simply equal to the likelihood ratio. “Prior odds of 1” means that it’s equally probable before the experiment that there was an effect or that there’s no effect. Put another way, prior odds of 1 means that the prior probability of H0 and of H1 are equal: both are 0.5. That’s probably the nearest you can get to declaring equipoise.

Comparison: Consider Screening Tests

I wrote a statistics textbook in 1971 [download it here] which by and large stood the test of time but the one thing I got completely wrong was the limitations of p values. Like many other people I came to see my errors through thinking about screening tests. These are very much in the news at the moment because of the COVID-19 pandemic. The illustration of the problems they pose which follows is now quite commonplace.

Suppose you test 10,000 people, and that 1 in 100 of those people have the condition, e.g. Covid-19, and 99 in 100 don't. The prevalence in the population you're testing is 1 in 100. So you have 100 people with the condition and 9,900 who don't. If the specificity of the test is 95%, then 5% of the 9,900 unaffected people test positive: 495 false positives.

This is very much like a null-hypothesis test of significance. But you can’t get the answer without considering the alternative hypothesis, which null-hypothesis significance tests don’t do. So now add the upper arm to the Figure above.

You've got 1% (so that's 100 people) who have the condition, so if the sensitivity of the test is 80% (that's like the power of a significance test) you get 80 true positives. The total number of positive tests is therefore 80 plus 495, and the proportion of positive tests that are false is 495 divided by that total, which is 86%. A test that gives 86% false positives is pretty disastrous. It is not 5%! Most people are quite surprised by that when they first come across it.
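The arithmetic is easy to check for yourself. A minimal sketch in R, using just the numbers assumed above:

```r
## Screening-test arithmetic for the example above:
## 10,000 people tested, prevalence 1%, specificity 95%, sensitivity 80%.
n_tested    <- 10000
prevalence  <- 0.01
specificity <- 0.95
sensitivity <- 0.80

affected   <- n_tested * prevalence            # 100 people with the condition
unaffected <- n_tested - affected              # 9,900 people without it

true_pos  <- affected * sensitivity            # 80 true positives
false_pos <- unaffected * (1 - specificity)    # 495 false positives

false_pos / (true_pos + false_pos)             # 0.86: 86% of positive tests are false
```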

Now look at significance tests in a similar way

Now we can do something similar for significance tests (though the parallel is not exact, as I’ll explain).

Suppose we do 1,000 tests, and in 10% of them there's a real effect, and in 90% of them there is no effect. If the significance level, so-called, is 0.05 then 5% of the 900 tests in which there is no effect come out positive: that's 45 false positives.

But that’s as far as you can go with a null-hypothesis significance test. You can’t tell what’s going on unless you consider the other arm. If the power is 80% then we get 80 true positive tests and 20 false negative tests, so the total number of positive tests is 80 plus 45 and the false positive risk is the number of false positives divided by the total number of positives which is 36 percent.

So the p value is not the false positive risk. And the type 1 error rate is not the false positive risk.

The difference between them lies not in the numerator, it lies in the denominator. In the example above, of the 900 tests in which the null-hypothesis was true, there were 45 false positives. So looking at it from the classical point of view, the false positive risk would turn out to be 45 over 900 which is 0.05 but that’s not what you want. What you want is the total number of false positives, 45, divided by the total number of positives (45+80), which is 0.36.

The p value is NOT the probability that your results occurred by chance. The false positive risk is.
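The same tree-diagram arithmetic can be written as a minimal sketch in R (the numbers are the ones assumed above); it makes the two denominators explicit:

```r
## Significance-test tree diagram: 1,000 tests, 10% with a real effect,
## alpha = 0.05, power = 0.80 (the numbers assumed above).
n_tests   <- 1000
prop_real <- 0.10
alpha     <- 0.05
power     <- 0.80

n_real <- n_tests * prop_real        # 100 tests where there is a real effect
n_null <- n_tests - n_real           # 900 tests where the null hypothesis is true

false_pos <- n_null * alpha          # 45 false positives
true_pos  <- n_real * power          # 80 true positives

false_pos / n_null                   # 0.05: the type 1 error rate
false_pos / (false_pos + true_pos)   # 0.36: the false positive risk
```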

A complication: "p-equals" vs "p-less-than"

But now we have to come to a slightly subtle complication. It's been around since the 1930s and it was made very explicit by Dennis Lindley in the 1950s. Yet it is unknown to most people, which is very weird. The point is that there are two different ways in which we can calculate the likelihood ratio and therefore two different ways of getting the false positive risk.

A lot of writers including Ioannidis and Wacholder and many others use the “p less than” approach. That’s what that tree diagram gives you. But it is not what is appropriate for interpretation of a single experiment. It underestimates the false positive risk.

What we need is the “p equals” approach, and I’ll try and explain that now.

Suppose we do a test and we observe p = 0.047; then all we are interested in is how tests that come out with p = 0.047 behave. We aren't interested in any other different p value. That p value is now part of the data. The tree diagram approach we've just been through gave a false positive risk of only 6%, if you assume that the prevalence of true effects was 0.5 (prior odds of 1). 6% isn't much different from 5% so it might seem okay.

But the tree diagram approach, although it is very simple, still asks the wrong question. It looks at all tests that give p ≤ 0.05, the "p-less-than" case. If we observe p = 0.047 then we should look only at tests that give p = 0.047 rather than looking at all tests which come out with p ≤ 0.05. If you're doing it with simulations of course as in my 2014 paper then you can't expect any tests to give exactly 0.047; what you can do is look at all the tests that come out with p in a narrow band around there, say 0.045 ≤ p ≤ 0.05.

This approach gives a different answer from the tree diagram approach. If you look at only tests that give p values between 0.045 and 0.05, the false positive risk turns out to be not 6% but at least 26%.

I say at least, because that assumes a prior probability of there being a real effect of 50:50. If only 10% of the experiments had a real effect (a prior of 0.1 in the tree diagram), this rises to 76% false positives. That really is pretty disastrous. Now of course the problem is you don't know this prior probability.
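That sort of simulation is easy to try for yourself. Here is a minimal sketch in R (not the script published with the 2014 paper); it assumes two groups of n = 16, a true effect of 1 standard deviation when there is one, and a 50:50 prior:

```r
## Simulate many two-sample t-tests and apply the "p-equals" idea:
## look only at the tests that give p in a narrow band around 0.05.
## Assumptions: n = 16 per group, true effect = 1 SD when present, prior = 0.5.
set.seed(1)
nsim <- 100000
n    <- 16
real <- rbinom(nsim, 1, 0.5)        # 1 = real effect, 0 = null hypothesis true

pvals <- sapply(seq_len(nsim), function(i) {
  a <- rnorm(n, mean = 0)
  b <- rnorm(n, mean = real[i])     # mean 1 if there is a real effect, else 0
  t.test(a, b, var.equal = TRUE)$p.value
})

band <- pvals >= 0.045 & pvals <= 0.05    # the "p-equals" band
sum(band & real == 0) / sum(band)         # about 0.26, not 0.05
## With rbinom(nsim, 1, 0.1) instead (10% real effects) it comes out near 0.76.
```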

The problem with Bayes' theorem is that there is an infinite number of possible answers, one for every choice of prior. Not everyone agrees with my approach, but it is one of the simplest.

The likelihood-ratio approach to comparing two hypotheses

The likelihood ratio – that is to say, the relative probabilities of observing the data given two different hypotheses – is the natural way to compare two hypotheses. For example, in our case one hypothesis is the zero effect (that's the null-hypothesis) and the other hypothesis is that there's a real effect of the observed size. That's the maximum likelihood estimate of the real effect size. Notice that we are not saying that the effect size is exactly zero; but rather we are asking whether a zero effect explains the observations better than a real effect.

Now this amounts to putting a “lump” of probability on there being a zero effect. If you put a prior probability of 0.5 for there being a zero effect, you’re saying the prior odds are 1. If you are willing to put a lump of probability on the null-hypothesis, then there are several methods of doing that. They all give similar results to mine within a factor of two or so.

Putting a lump of probability on there being a zero effect, for example a prior probability of 0.5 of there being zero effect, is regarded by some people as being over-sceptical (though others might regard 0.5 as high, given that most bright ideas are wrong).

E.J. Wagenmakers summed it up in a tweet:

“at least Bayesians attempt to find an approximate answer to the right question instead of struggling to interpret an exact answer to the wrong question [that’s the p value]”.

Some results.

The 2014 paper used simulations, and that’s a good way to see what’s happening in particular cases. But to plot curves of the sort shown in the next three slides we need exact calculations of FPR and how to do this was shown in the 2017 paper (see Appendix for details).

Comparison of p-equals and p-less-than approaches

The slide at 26:05 is designed to show the difference between the "p-equals" and the "p-less-than" cases.

On each diagram the dashed red line is the "line of equality": that's where the points would lie if the p value were the same as the false positive risk. You can see that in every case the blue lines – the false positive risk – are greater than the p value. And for any given observed p value, the p-equals approach gives a bigger false positive risk than the p-less-than approach. For a prior probability of 0.5 the false positive risk is about 26% when you've observed p = 0.05.

So from now on I shall use only the “p-equals” calculation which is clearly what’s relevant to a test of significance.

The false positive risk as function of the observed p value for different sample sizes

Now another set of graphs (slide at 27:46), for the false positive risk as a function of the observed p value, but this time we’ll vary the number in each sample. These are all for comparing two independent samples.

The curves are red for n = 4 ; green for n = 8 ; blue for n = 16.

The top row is for an implausible hypothesis with a prior of 0.1, the bottom row for a plausible hypothesis with a prior of 0.5.

The left column shows arithmetic plots; the right column shows the same curves in log-log plots. The powers these lines correspond to are:

  • n = 4 (red) has power 22%
  • n = 8 (green) has power 46%
  • n = 16 (blue) has power 78%

Now you can see these behave in a slightly curious way. For most of the range it's what you'd expect: n = 4 gives you a higher false positive risk than n = 8, and that in turn is higher than for n = 16 (the blue line).

The curves behave in an odd way around 0.05; they actually begin to cross, so the false positive risk for p values around 0.05 is not strongly dependent on sample size.

But the important point is that in every case they’re above the line of equality, so the false positive risk is much bigger than the p value in any circumstance.

False positive risk as a function of sample size (i.e. of power)

Now the really interesting one (slide at 29:34). When I first did the simulation study I was challenged by the fact that the false positive risk actually becomes 1 if the experiment is a very powerful one. That seemed a bit odd.

The plot here is the false positive risk FPR50, which I define as "the false positive risk for prior odds of 1, i.e. a 50:50 chance of being a real effect or not a real effect".

Let’s just concentrate on the p = 0.05 curve (blue). Notice that, because the number per sample is changing, the power changes throughout the curve. For example on the p = 0.05 curve for n = 4 (that’s the lowest sample size plotted), power is 0.22, but if we go to the other end of the curve, n = 64 (the biggest sample size plotted), the power is 0.9999. That’s something not achieved very often in practice.

But how is it that p = 0.05 can give you a false positive risk which approaches 100%? Even with p = 0.001 the false positive risk will eventually approach 100% though it does so later and more slowly.

In fact this has been known for donkey’s years. It’s called the Jeffreys-Lindley paradox, though there’s nothing paradoxical about it. In fact it’s exactly what you’d expect. If the power is 99.99% then you expect almost every p value to be very low. Everything is detected if we have a high power like that. So it would be very rare, with that very high power, to get a p value as big as 0.05. Almost every p value will be much less than 0.05, and that’s why observing a p value as big as 0.05 would, in that case, provide strong evidence for the null-hypothesis. Even p = 0.01 would provide strong evidence for the null hypothesis when the power is very high because almost every p value would be much less than 0.01.

This is a direct consequence of using the p-equals definition which I think is what’s relevant for testing hypotheses. So the Jeffreys-Lindley phenomenon makes absolute sense.

In contrast, if you use the p-less-than approach, the false positive risk would decrease continuously with the observed p value. That’s why, if you have a big enough sample (high enough power), even the smallest effect becomes “statistically significant”, despite the fact that the odds may favour strongly the null hypothesis. [Here, ‘the odds’ means the likelihood ratio calculated by the p-equals method.]

A real life example

Now let’s consider an actual practical example. The slide shows a study of transcranial electromagnetic stimulation published in Science magazine (so a bit suspect to begin with).

The study concluded (among other things) that an improved associated memory performance was produced by transcranial electromagnetic stimulation, p = 0.043. In order to find out how big the sample sizes were I had to dig right into the supplementary material. It was only 8. Nonetheless let’s assume that they had an adequate power and see what we make of it.

In fact it wasn't done in a proper parallel-group way: it was done as 'before and after' the stimulation, and sham stimulation, and it produced one lousy asterisk. In fact most of the paper was about functional magnetic resonance imaging; memory was mentioned only as a subsection of Figure 1, but this is what was tweeted out, because it sounds more dramatic than other things, and it got a vast number of retweets. Now according to my calculations p = 0.043 means there's at least an 18% chance that it's a false positive.

How better might we express the result of this experiment?

We should say, conventionally, that the increase in memory performance was 1.88 ± 0.85 (SEM) with confidence interval 0.055 to 3.7 (extra words recalled on a baseline of about 10). Thus p = 0.043. But then supplement this conventional statement with:

This implies a false positive risk, FPR50, (i.e. the probability that the results occurred  by chance only) of at least 18%, so the result is no more than suggestive.

There are several other ways you can put the same idea. I don't like them as much because they all suggest that it would be helpful to create a new magic threshold at FPR50 = 0.05, and that's as undesirable as defining a magic threshold at p = 0.05. For example you could say that the increase in performance gave p = 0.043, and in order to reduce the false positive risk to 0.05 it would be necessary to assume that the prior probability of there being a real effect was 81%. In other words, you'd have to be almost certain that there was a real effect before you did the experiment in order for that result to be convincing. Since there's no independent evidence that that's true, the result is no more than suggestive.

Or you could put it this way: the increase in performance gave p = 0.043. In order to reduce the false positive risk to 0.05 it would have been necessary to observe p = 0.0043, so the result is no more than suggestive.

The reason I now prefer the first of these possibilities is because the other two involve an implicit threshold of 0.05 for the false positive risk and that’s just as daft as assuming a threshold of 0.05 for the p value.
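For what it's worth, the 81% figure in the second formulation comes from running Bayes' theorem backwards. A minimal sketch of that arithmetic, assuming the FPR50 of 18% quoted above:

```r
## Reverse-Bayes arithmetic: what prior would be needed to bring the
## false positive risk down to 0.05, given the FPR50 of 0.18 quoted above?
FPR50 <- 0.18
LR    <- (1 - FPR50) / FPR50            # likelihood ratio implied, about 4.6

target     <- 0.05                      # desired false positive risk
post_odds  <- (1 - target) / target     # posterior odds on a real effect: 19
prior_odds <- post_odds / LR            # prior odds needed, about 4.2
prior_odds / (1 + prior_odds)           # prior probability needed, about 0.81
```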

The web calculator

Scripts in R are provided with all my papers. For those who can't master R Studio, you can do many of the calculations very easily with our web calculator [for latest links please go to http://www.onemol.org.uk/?page_id=456]. There are three options: if you want to calculate the false positive risk for a specified p value and prior, you enter the observed p value (e.g. 0.049), the prior probability that there's a real effect (e.g. 0.5), the normalized effect size (e.g. 1 standard deviation) and the number in each sample. All the numbers cited here are based on an effect size of 1 standard deviation, but you can enter any value in the calculator. The output panel updates itself automatically.

[Screenshot of the web calculator output]

We see that the false positive risk for the p-equals case is 0.26 and the likelihood ratio is 2.8 (I’ll come back to that in a minute).

Using the web calculator or using the R programs which are provided with the papers, this sort of table can be very quickly calculated.

The top row shows the results if we observe p = 0.05. The prior probability that you need to postulate to get a 5% false positive risk would be 87%. You'd have to be almost ninety percent sure there was a real effect before the experiment in order to get a 5% false positive risk. The likelihood ratio comes out to be about 3; what that means is that your observations will be about 3 times more likely if there was a real effect than if there was no effect. 3:1 is very low odds compared with the 19:1 odds which you might incorrectly infer from p = 0.05. The false positive risk for a prior of 0.5 (the default value), which I call the FPR50, would be 27% when you observe p = 0.05.

In fact these are just directly related to each other. Since the likelihood ratio is a purely deductive quantity, we can regard FPR50 as just being a transformation of the likelihood ratio and regard this as also a purely deductive quantity. For example,  1 / (1 + 2.8) = 0.263, the FPR50.  But in order to interpret it as a posterior probability then you do have to go into Bayes’ theorem. If the prior probability of a real effect was only 0.1 then that would correspond to a 76% false positive risk when you’ve observed p = 0.05.
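Those two numbers are quick to check. A minimal sketch, taking the likelihood ratio of 2.8 quoted above for p = 0.05:

```r
## FPR50 as a simple transformation of the likelihood ratio, and the
## effect of a less plausible prior (0.1 rather than 0.5).
LR <- 2.8                               # likelihood ratio for p = 0.05 (above)
1 / (1 + LR)                            # FPR50, about 0.26

prior     <- 0.1
post_odds <- LR * prior / (1 - prior)   # posterior odds on a real effect
1 / (1 + post_odds)                     # FPR about 0.76 for a prior of 0.1
```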

If we go to the other extreme, when we observe p = 0.001 (bottom row of the table) the likelihood ratio is 100 – notice not 1000, but 100 – and the false positive risk, FPR50, would be 1%. That sounds okay but if it was an implausible hypothesis with only a 10% prior chance of being true (last column of Table), then the false positive risk would be 8% even when you observe p = 0.001: even in that case it would still be above 5%. In fact, to get the FPR down to 0.05 you'd have to observe p = 0.00043, and that's good food for thought.

So what do you do to prevent making a fool of yourself?

  1. Never use the words significant or non-significant, and don't use those pesky asterisks please; it makes no sense to have a magic cut-off. Just give a p value.
  2. Don’t use bar graphs. Show the data as a series of dots.
  3. Always remember, it’s a fundamental assumption of all significance tests that the treatments are randomized. When this isn’t the case, you can still calculate a test but you can’t expect an accurate result.  This is well-illustrated by thinking about randomisation tests.
  4. So I think you should still state the p value and an estimate of the effect size with confidence intervals but be aware that this tells you nothing very direct about the false positive risk. The p value should be accompanied by an indication of the likely false positive risk. It won’t be exact but it doesn’t really need to be; it does answer the right question. You can for example specify the FPR50, the false positive risk based on a prior probability of 0.5. That’s really just a more comprehensible way of specifying the likelihood ratio. You can use other methods, but they all involve an implicit threshold of 0.05 for the false positive risk. That isn’t desirable.

So p = 0.04 doesn’t mean you discovered something, it means it might be worth another look. In fact even p = 0.005 can under some circumstances be more compatible with the null-hypothesis than with there being a real effect.

We must conclude, however reluctantly, that Ronald Fisher didn’t get it right. Matthews (1998) said,

“the plain fact is that 70 years ago Ronald Fisher gave scientists a mathematical machine for turning boloney into breakthroughs and flukes into funding”.

Robert Matthews Sunday Telegraph, 13 September 1998.

But it's not quite fair to blame R. A. Fisher because he himself described the 5% point as "quite a low standard of significance".

Questions & Answers

Q: "There are lots of competing ideas about how best to deal with the issue of statistical testing. For the non-statistician it is very hard to evaluate them and decide on what is the best approach. Is there any empirical evidence about what works best in practice? For example, training people to do analysis in different ways, and then getting them to analyze data with known characteristics. If not why not? It feels like we wouldn't rely so heavily on theory in e.g. drug development, so why do we in stats?"

A: The gist of the question is: why do we rely on theory in statistics? Well, we might as well say, why do we rely on theory in mathematics? That's what it is! You have concrete theories and concrete postulates. Which you don't have in drug testing; that's just empirical.

Q: Is there any empirical evidence about what works best in practice, so for example training people to do analysis in different ways? and then getting them to analyze data with known characteristics and if not why not?

A: Why not? Because unless you're doing simulations you never actually know what the answer should be. So no, it's not known which works best in practice. That being said, simulation is a great way to test out ideas. My 2014 paper used simulation, and it was only in the 2017 paper that the maths behind the 2014 results was worked out. I think you can rely on the fact that a lot of the alternative methods give similar answers. That's why I felt justified in using rather simple assumptions for mine, because they're easier to understand and the answers you get don't differ greatly from much more complicated methods.

In my 2019 paper there’s a comparison of three different methods, all of which assume that it’s reasonable to test a point (or small interval) null-hypothesis (one that says that treatment effect is exactly zero), but given that assumption, all the alternative methods give similar answers within a factor of two or so. A factor of two is all you need: it doesn’t matter if it’s 26% or 52% or 13%, the conclusions in real life are much the same.

So I think you might as well use a simple method. There is an even simpler one than mine actually, proposed by Sellke et al. (2001), which gives a very simple calculation from the p value: it gives a false positive risk of 29 percent when you observe p = 0.05. My method gives 26%, so there's no essential difference between them. It doesn't matter which you use really.
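For reference, the Sellke et al. (2001) calculation really is a one-liner. A minimal sketch, using their −e·p·ln(p) bound (valid for p < 1/e):

```r
## Sellke, Bayarri & Berger (2001) calibration of a p value.
p     <- 0.05
bound <- -exp(1) * p * log(p)   # lower bound on the Bayes factor for the null, ~0.41
1 / (1 + 1 / bound)             # about 0.29: a 29% false positive risk at p = 0.05
```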

Q: The last question gave an example of training people so maybe he was touching on how do we teach people how to analyze their data and interpret it accurately. Reporting effect sizes and confidence intervals alongside p values has been shown to improve interpretation in teaching contexts. I wonder whether in your own experience that you have found that this helps as well? Or can you suggest any ways to help educators, teachers, lecturers, to help the next generation of researchers properly?

A: Yes I think you should always report the observed effect size and confidence limits for it. But be aware that confidence intervals tell you exactly the same thing as p values and therefore they too are very suspect. There's a simple one-to-one correspondence between p values and confidence limits. So if you use the criterion "the confidence limits exclude zero difference" to judge whether there's a real effect, you're making exactly the same mistake as if you use p ≤ 0.05 to make the judgment. So they should be given for sure, because they're sort of familiar, but you do need, separately, some sort of a rough estimate of the false positive risk too.

Q: I’m struggling a bit with the “p equals” intuition. How do you decide the band around 0.047 to use for the simulations? Presumably the results are very sensitive to this band. If you are using an exact p value in a calculation rather than a simulation, the probability of exactly that p value to many decimal places will presumably become infinitely small. Any clarification would be appreciated.

A: Yes, that's not too difficult to deal with: you've got to use a band which is wide enough to get a decent number in. But the result is not at all sensitive to that: if you make it wider, you'll get larger numbers in both numerator and denominator so the result will be much the same. In fact, that's only a problem if you do it by simulation. If you do it by exact calculation it's easier. Doing 100,000 or a million t-tests in simulation with my R script doesn't take long. But it doesn't depend at all critically on the width of the interval; and in any case it's not necessary to do simulations, you can do the exact calculation.

Q: Even if an exact calculation can’t be done—it probably can—you can get a better and better approximation by doing more simulations and using narrower and narrower bands around 0.047?

A: Yes, the larger the number of simulated tests that you do, the more accurate the answer. I did check it with a million occasionally. But once you’ve done the maths you can get exact answers much faster. The slide at 53:17 shows how you do the exact calculation.

• The Student’s t value along the bottom
• Probability density at the side
• The blue line is the distribution you get under the null-hypothesis, with a mean of 0 and a standard deviation of 1 in this case.
• So the red areas are the rejection areas for a t-test.
• The green curve is the t distribution (it’s a non-central t-distribution which is what you need in this case) for the alternative hypothesis.
• The yellow area is the power of the test, which here is 78%
• The orange area is (1 – power) so it’s 22%

The p-less-than calculation considers all values in the red area or in the yellow area as being positives. The p-equals calculation uses not the areas, but the ordinates here, the probability densities. The probability (density) of getting a t value of 2.04 under the null hypothesis is y0 = 0.053. And the probability (density) under the alternative hypothesis is y1 = 0.29. It's true that the probability of getting t = 2.04 exactly is infinitesimally small (the area of an infinitesimally narrow band around t = 2.04), but the ratio of the two infinitesimally small probabilities is perfectly well-defined. So for the p-equals approach, the likelihood ratio in favour of the alternative hypothesis would be L10 = y1 / (2 y0) (the factor of 2 arises because of the two red tails) and that gives you a likelihood ratio of 2.8. That corresponds to an FPR50 of 26%, as we explained. That's exactly what you get from simulation. I hope that was reasonably clear. It may not have been if you aren't familiar with looking at those sorts of things.
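The same exact calculation can be done in a few lines of R. A minimal sketch, assuming (as in the slide) two groups of n = 16, a normalised effect size of 1 SD, and an observed p = 0.05:

```r
## Exact "p-equals" calculation using ordinates of the two t distributions.
n    <- 16
df   <- 2 * n - 2                # 30 degrees of freedom
tobs <- qt(1 - 0.05 / 2, df)     # t value corresponding to p = 0.05, about 2.04
ncp  <- sqrt(n / 2)              # non-centrality for a true effect of 1 SD

y0 <- dt(tobs, df)               # ordinate under the null hypothesis, about 0.053
y1 <- dt(tobs, df, ncp = ncp)    # ordinate under the alternative, about 0.29

L10 <- y1 / (2 * y0)             # likelihood ratio in favour of H1, about 2.8
1 / (1 + L10)                    # FPR50, about 0.26
```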

Q: To calculate FPR50 – false positive risk for a 50:50 prior – I need to assume an effect size. Which one do you use in the calculator? Would it make sense to calculate FPR50 for a range of effect sizes?

A: Yes if you use the web calculator or the R scripts then you need to specify what the normalized effect size is. You can use your observed one. If you're trying to interpret real data, you've got an estimated effect size and you can use that. For example when you've observed p = 0.05 that corresponds to a likelihood ratio of 2.8 when you use the true effect size (that's known when you do simulations). All you've got is the observed effect size. So they're not the same of course. But you can easily show with simulations, that if you use the observed effect size in place of the true effect size (which you don't generally know) then that likelihood ratio goes up from about 2.8 to 3.6; it's around 3, either way. You can plug your observed normalised effect size into the calculator and you won't be led far astray. This is shown in section 5 of the 2017 paper (especially section 5.1).

Q: Consider hypothesis H1 versus H2 which is the interpretation to go with?

A: Well I'm not quite clear still what the two interpretations the questioner is alluding to are, but I shouldn't rely on the p value. The most natural way to compare two hypotheses is to calculate the likelihood ratio.

You can do a full Bayesian analysis. Some forms of Bayesian analysis can give results that are quite similar to the p values. But that can't possibly be generally true because they are defined differently. Stephen Senn produced an example where there was essentially no problem with the p value, but that was for a one-sided test with a fairly bizarre prior distribution.

In general in Bayes, you specify a prior distribution of effect sizes, what you believe before the experiment. Now, unless you have empirical data for what that distribution is, which is very rare indeed, then I just can’t see the justification for that. It’s bad enough making up the probability that there’s a real effect compared with there being no real effect. To make up a whole distribution just seems to be a bit like fantasy.

Mine is simpler because by considering a point null-hypothesis and a point alternative hypothesis, what in general would be called Bayes’ factors become likelihood ratios. Likelihood ratios are much easier to understand than Bayes’ factors because they just give you the relative probability of observing your data under two different hypotheses. This is a special case of Bayes’ theorem. But as I mentioned, any approach to Bayes’ theorem which assumes a point null hypothesis gives pretty similar answers, so it doesn’t really matter which you use.

There was an edition of The American Statistician last year which had 44 different contributions about "the world beyond p = 0.05". I found it a pretty disappointing edition because there was no agreement among people and a lot of people didn't get around to making any recommendation. They said what was wrong, but didn't say what you should do in response. The one paper that I did like was the one by Benjamin & Berger. They recommended their false positive risk estimate (as I would call it; they called it something different but that's what it amounts to) and that's even simpler to calculate than mine. It's a little more pessimistic, it can give a bigger false positive risk for a given p value, but apart from that detail, their recommendations are much the same as mine. It doesn't really matter which you choose.

Q: If people want a procedure that does not too often lead them to draw wrong conclusions, is it fine if they use a p value?

A: No, that maximises your wrong conclusions, among the available methods! The whole point is, that the false positive risk is a lot bigger than the p value under almost all circumstances. Some people refer to this as the p value exaggerating the evidence; but it only does so if you incorrectly interpret the p value as being the probability that you’re wrong. It certainly is not that.

Q: Your thoughts on, there’s lots of recommendations about practical alternatives to p values. Most notably the Nature piece that was published last year—something like 400 signatories—that said that we should retire the p value. Their alternative was to just report effect sizes and confidence intervals. Now you’ve said you’re not against anything that should be standard practice, but I wonder whether this alternative is actually useful, to retire the p value?

A: I don’t think the 400 author piece in Nature recommended ditching p values at all. It recommended ditching the 0.05 threshold, and just stating a p value. That would mean abandoning the term “statistically significant” which is so shockingly misleading for the reasons I’ve been talking about. But it didn’t say that you shouldn’t give p values, and I don’t think it really recommended an alternative. I would be against not giving p values because it’s the p value which enables you to calculate the equivalent false positive risk which would be much harder work if people didn’t give the p value.

If you use the false positive risk, you'll inevitably get a larger false negative rate. So, if you're using it to make a decision, other things come into it than the false positive risk and the p value. Namely, the cost of missing an effect which is real (a false negative), and the cost of getting a false positive. They both matter. If you can estimate the costs associated with either of them, then you can draw some sort of optimal conclusion.

Certainly the costs of getting false positives are rather low for most people. In fact, there may be a great advantage to your career to publish a lot of false positives, unfortunately. This is the problem that the RIOT science club is dealing with, I guess.

Q: What about changing the alpha level? Tinkering with the alpha level has been popular in the light of the replication crisis, to make the test even more difficult to pass when testing your hypothesis. Some people have said that 0.005 should be the threshold.

A: Daniel Benjamin said that and a lot of other authors. I wrote to them about that and they said that they didn’t really think it was very satisfactory but it would be better than the present practice. They regarded it as a sort of interim thing.

It's true that you would have fewer false positives if you did that, but it's a very crude way of treating the false positive risk problem. I would much prefer to make a direct estimate, even though it's rough, of the false positive risk rather than just crudely reducing to p = 0.005. I do have a long paragraph in one of the papers discussing this particular thing (towards the end of the Conclusions in the 2017 paper).

If you were willing to assume a 50:50 prior chance of there being a real effect, then p = 0.005 would correspond to FPR50 = 0.034, which sounds satisfactory (from the Table above, or the web calculator).

But if, for example, you are testing a hypothesis about teleportation or mind-reading or homeopathy then you probably wouldn't be willing to give a prior of 50% to that being right before the experiment. If the prior probability of there being a real effect were 0.1, rather than 0.5, the Table above shows that observation of p = 0.005 would suggest, in my example, FPR = 0.24, and a 24% risk of a false positive would still be disastrous.  In this case you would have to have observed p = 0.00043 in order to reduce the false positive risk to 0.05.

So no fixed p value threshold will cope adequately with every problem.

Links

  1. For up-to-date links to the web calculator, and to papers, start at http://www.onemol.org.uk/?page_id=456
  2. Colquhoun, 2014, An investigation of the false discovery rate and the misinterpretation of p-values. https://royalsocietypublishing.org/doi/full/10.1098/rsos.140216
  3. Colquhoun, 2017, The reproducibility of research and the misinterpretation of p-values. https://royalsocietypublishing.org/doi/10.1098/rsos.171085
  4. Colquhoun, 2019, The False Positive Risk: A Proposal Concerning What to Do About p-Values. https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1529622
  5. Benjamin & Berger, Three Recommendations for Improving the Use of p-Values. https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1543135
  6. Sellke, T., Bayarri, M. J., and Berger, J. O. (2001), "Calibration of p Values for Testing Precise Null Hypotheses," The American Statistician, 55, 62–71. DOI: 10.1198/000313001300339950


I am going to set out my current views about the transgender problem. It’s something that has caused a lot of discussion on twitter, much of it unpleasantly vituperative. When I refer to ‘problem’ I’m referring to the vituperation, not, of course, the existence of transgender people.  Short posts on twitter don’t allow nuance, so I thought it might be helpful to lay out my views here in the (doubtless vain) hope of being able to move on to talk about other things.  This will be my last word on it, because I feel that the time spent on this single problem has become counterproductive.

  1. The problem is very complicated and nobody knows the answers. Why, for example has the number of people referred to the Tavistock clinic increased 25-fold since 2009? Nobody knows. There has been a great deal of disagreement within the Gender Identity Development Service (GIDS) at the Tavistock about whether and when to refer children for treatment with puberty blockers or surgery. There was a good report by Deborah Cohen about this: https://www.youtube.com/watch?v=zTRnrp9pXHY 
  2. There’s also a good report from BBC Newsnight about people who have chosen to detransition: https://www.youtube.com/watch?v=fDi-jFVBLA8. It shows how much is not known, even by experts.
  3. Anyone who pretends that it’s a simple problem that can be solved with slogans just isn’t listening. The long term effects of hormone treatments are simply not known.
  4. This poses a real problem for doctors who are asked for advice by people who feel that they were born in the wrong sex. There is an empathetic discussion from the front line in a recent paper.
  5.  I’m very conscious that trans people have often been subjected to discrimination and abuse. That’s totally unacceptable. It’s also unacceptable to vilify women whose views are a bit different.
  6. Most of the arguments have centred on the meanings of the words ‘woman’, ‘female’, ‘gender’ and ‘sex’.  Many of the bitter rows about this topic might be avoided if people defined these words before using them.
  7. ‘Sex’ and ‘gender’ are relatively easy.  When I was growing up, ‘gender’ was a grammatical term, unrelated to sex. Then it evolved to be used as a euphemism for ‘sex’ by those who were too squeamish to use the word ‘sex’. The current use of these words is quite different. It’s discussed at https://www.merriam-webster.com/dictionary/gender#usage-1.

    “Sex as the preferred term for biological forms, and gender limited to its meanings involving behavioral, cultural, and psychological traits.“.

    This is a sensible distinction, I think. But beware that it’s by no means universally agreed. The meanings are changing all the time and you can get pilloried if you use the ‘wrong’ word.

  8. The words ‘male’, ‘female’, ‘women’ are much more contentious.  Some people say that they refer to biology: being female means having XX chromosomes.  This is certainly the definition used in every dictionary I’ve seen.  The vast majority of people are born male or female. Apart from the small number of people who are born with chromosomal abnormalities, it’s unambiguous and can’t change.
  9. But other people now insist, often stridently, that ‘woman’ refers to gender rather than sex. It would certainly help to avoid misapprehensions if, when using slogans like “trans women are women”, they made clear that they are using this new and unconventional definition of ‘woman’.
  10. Someone on twitter wrote, of what another person had said: “transwomen are not women. That is transphobic. If she’d said that transwomen are not female, she’d have just been correct.” I doubt that this distinction is widely accepted.  Both statements seem to me to mean much the same thing, but again it’s a matter of definitions.
  11. If someone who is biologically male feels happier as a woman, that’s fine. They should be able to live as a woman safely, and without discrimination.  They should be treated as though they were women.  This I take to be the intention of the tweet from J.K. Rowling:

    Rowling tweet

  12. It seems to me to be totally unfair, and deeply misogynist, to pillory Rowling as a ‘transphobe’ on the basis of this tweet (or of anything else she’s said). She’s had some pretty vile abuse. There’s already a problem of women getting abuse on social media, and that’s only added to by the way she’s been treated because of this tweet.

  13. It seems to me that there is a wafer-thin distinction between “trans women are women” and “trans women should be treated as though they were women”. Yet if you say the wrong one you can be pilloried.
  14. Many of my friends in what’s known loosely as the skeptical movement have been quite unreasonably exercised about this fine distinction.  Many of today’s problems arise from the extreme polarisation of views (on almost everything). This seems to me to be deeply unhelpful.
  15. I was pilloried by some people when I posted this tweet: “I’ve just finished reading the whole of the post by @jk_rowling. It only increases my admiration for her -a deeply empathetic human.  The attacks on her are utterly unjustified.”   It’s true that I gained several hundred followers after posting it (though I suspect that not all of them were followers that I would wish to have).
  16. The problems arise when a small minority of people who have male genitalia (whether they are trans women or predatory males) have used their access to spaces that have been traditionally reserved for women as an opportunity of voyeurism or even rape.  In such cases the law should take its course.  The existence of a few such cases shouldn’t be used as an excuse to discriminate against all trans women.
  17. Another case that’s often cited is sports.  Being biologically male gives advantages in many sports.  Given the huge advances that women have made in sports since the 1960s, it would be very unfortunate if they were to be beaten regularly by people who were biologically male (this has actually happened in sprinting and in weightlifting). In contact sports it could be dangerous. The Rugby Football Union has a policy which will have the effect of stopping most trans women from joining its women’s teams. That seems fair to me. Sports themselves should make the rules to ensure fair play. Some of the rules are summarised in https://en.wikipedia.org/wiki/Transgender_people_in_sports. The problem is to weigh the discrimination against trans women against the discrimination against biological women.  In this case, you can’t have both.
  18. The trans problem has been particularly virulent in the Green Party.  I recently endorsed Dr Rosi Sexton for leadership of the Green Party, because she has committed to having better regard for evidence than the other candidates, and because she’s committed to inclusion of minority groups.  They are both good things.  She has also said “trans women are women”, and that led to prolonged harassment from some of my best skeptical friends. She’s undoubtedly aware of X and Y chromosomes so I take it that she’s using ‘woman’ in the sense of gender rather than sex.  Although I’d prefer slightly different words, such as “trans women should be treated as though they were women”, the difference between these two forms of wording seems to be far too small to justify the heat, and even hate, generated on both sides of the argument. Neither form of wording is “transphobic”. To say that they are is, in my opinion, absurd.
  19. All that I ask is that there should be less stridency and a bit more tolerance of views that don’t differ as much as people seem to think. Of course, anyone who advocates violence should be condemned. Be clear about definitions and don’t try to get people fired because their definitions are different from yours. Be kind to people.
  20. Postscript

    The fairness and safety of sports is very often raised in this context. The answer isn’t as obvious as I thought at first. This is a very thoughtful article on that topic: MMA pioneer Rosi Sexton once opposed Fallon Fox competing. Now she explains why she supports trans athletes. The following quotation from it seems totally sensible to me.

    “The International Olympic Committee has had a trans-inclusive policy since 2003. In that time, there have been no publicly out trans Olympic athletes (though that will likely change in 2021).

    The idea that trans women would make women’s sport meaningless by easily dominating the competition has not, so far, materialized at any level.

    If trans women do have an unfair advantage over cis women, then it’s a hard one to spot.”

There will soon be an election for the leader of the Green Party of England and Wales.  I support Dr Sexton for the job. Here’s my endorsement. I’ll say why below.

I support Dr Sexton as a candidate to lead the Green Party (England and Wales). She said

“The Green Party is a political party, not a lifestyle movement”.

That’s perceptive. For too long the Green party in the UK has been regarded as marginal, even as tree-huggers. That’s the case despite their success in local government and in other European countries which have fairer voting systems.

She continued

“We need to be serious about inclusion, serious about evidence, and serious about winning elections.”

They are all good aims. As somebody with three degrees in mathematics, she’s better qualified to evaluate evidence than just about any of our members of parliament, for most of whom science is a closed book.

Her breadth of experience is unrivalled. As well as mathematics, she has excelled in music and has been a champion athlete. Winning is her speciality. I believe that she has what it takes to win in politics too.

Here is her first campaign video.

Why am I endorsing Dr Sexton?

Like many people I know, I’ve been politically homeless for a while.  I could obviously  never vote Conservative, especially now that they’ve succumbed to a far-right coup.  In the past, I’ve voted Labour mostly but I couldn’t vote for Jeremy Corbyn.  I’ve voted Lib Dem in some recent elections, but it’s hard to forgive what they did during the coalition.  So what about the Green Party?  I voted for them in the European elections because they have a fair voting system. I would have voted for them more often if it were not for our appallingly unfair first-past-the-post system. So why now?

Firstly, the Greens are growing. They are well represented in the European parliament, and increasingly in local government. Secondly, the urgency of doing something about climate change gets ever more obvious.  The Greens are also growing up.  They are no longer as keen on alternative pseudo-medicine as they once were. Their anti-scientific wing is in retreat. And Dr Sexton, as a person who is interested in evidence, is just the sort of leader that they need to cement that advance.

That’s why I decided to join the Green Party to vote for her to be its leader.

If you want to know more about her, check her Wikipedia page. Or watch this video of an interview that I did with her in 2018.

You can also check her campaign website.

During the Black Lives Matter demonstrations on Sunday 7th June, the statue of Edward Colston was pulled down and dumped in the harbour in Bristol.

I think that it was the most beautiful thing that happened yesterday.

Colston made his money from the slave trade. 84,000 humans were transported on his ships. 19,000 of them died because of the appalling conditions on slave ships.

The statue was erected 174 years after he died, and, astonishingly, 62 years after the abolition of slavery.

According to Historic England, the plaque on the statue read thus.

Edward Colston   Born 1636 Died 1721.
Erected by citizens of Bristol as a memorial
of one of the most
virtuous and wise sons of their city
AD 1895

(https://historicengland.org.uk/…/the-list/list-entry/1202137 )

Over the years, many attempts have been made to change the wording on the plaque, but it has never happened.
https://en.wikipedia.org/wiki/Statue_of_Edward_Colston

Would Priti Patel also condemn the removal of statues of Jimmy Savile, the notorious paedophile, as “utterly disgraceful” because he gave money to charities?

Would she condemn the toppling of the statues of Saddam Hussein and of Stalin as vandalism?  I very much doubt it.

To those who say that removal of the statue erases history, there is a simple response. There are no statues to Hitler. And he most certainly hasn’t been forgotten.

Quite on the contrary, a lot more people are aware of Colston than was the case 24 hours ago.

The people who pulled the statue down deserve a medal. If they are prosecuted it would bring shame on us.

Please listen to the great historian, David Olusoga. He explains the matter perfectly.

“Statues aren’t about history they are about adoration. This man was not great, he was a slave trader and a murderer. Historian @DavidOlusoga brilliantly explains why BLM protestors were right to tear down the statue of Edward Colston. https://t.co/F1Zn1G8LVn”

This example illustrates just how fast exponential growth is. It was proposed on twitter by Charles Arthur (@charlesarthur) who attributes the idea to Simon Moores. The stadium version is a variant of the better known ‘grains of wheat (or rice) on a chessboard‘ problem. The stadium example is better, I think, because the time element gives it a sense of urgency, and that’s what we need right now.

Wembley stadium

Here’s Wembley Stadium. The watering system develops a fault: in minute 1 one drop of water is released; minute 2, two drops, minute 3 four drops, and so on. Every minute the number of drops doubles. How long will it take to fill Wembley stadium?

The answer is that after 44 minutes, before half-time, the stadium would overflow.

Here’s why.

The sequence is 1, 2, 4, 8, 16, . . . so the nth term in the sequence is 2^(n−1). For example, the 4th term is 2^3 = 8.

Next we need to know how many drops are needed to fill the stadium. Suppose a drop of water has a volume of 0.07 ml. This is 0.00000007, or 7 × 10^−8, cubic metres. Wembley Stadium has a volume of 1.1 million cubic metres. So the stadium holds 15,714,285,714,285 drops, or about 15.7 × 10^12 drops. How many minutes does it take to get to this volume of water?

After n minutes, the total volume of water will be the sum of all the drips up to that point. This turns out to be 2^n − 1 drops. If this baffles you, check this video on the sum of a geometric series (in our case a = 1 and r = 2).

We want to solve for n, the number of steps (minutes), in 2^n = 1 + 15.7 × 10^12. The easiest way to do this is to take the logarithm of both sides:
n log(2) = log(1 + 15.7 × 10^12).
So
n = log(1 + 15.7 × 10^12) / log(2) = 43.8 minutes

At the 43rd minute the stadium would be more than half full: (2^43 − 1) = 8.80 × 10^12 drops, i.e. 56% of capacity.

By the 44th minute the stadium would have overflowed: (2^44 − 1) = 17.6 × 10^12 drops, i.e. 112% of capacity.

Notice that (after the first few minutes) the volume released in the nth minute is equal to the total volume that’s already in, so at the 44th minute an extra 8.80 × 10^12 drops are dumped in. And at the 45th minute more than another stadium-full would appear.

The speed of the rise is truly terrifying.
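
For anyone who wants to check the arithmetic, here is a short Python version of the calculation above. It uses only the drop volume and stadium volume quoted in the text.

```python
# Check of the Wembley stadium arithmetic: drops double every minute,
# so the total after n minutes is 2^n - 1 drops.
import math

drop_volume = 0.07e-6        # one drop, in cubic metres (0.07 ml)
stadium_volume = 1.1e6       # Wembley Stadium, in cubic metres
drops_to_fill = stadium_volume / drop_volume   # about 1.57e13 drops

n_full = math.log2(drops_to_fill + 1)          # minutes until 2^n - 1 drops have fallen
print(f"{drops_to_fill:.3g} drops needed; full after {n_full:.1f} minutes")

for minute in (43, 44, 45):
    total = 2 ** minute - 1
    print(f"minute {minute}: {total:.3g} drops, {100 * total / drops_to_fill:.0f}% of capacity")
```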

Relationship of this to COVID-19

The number of cases, and of deaths, rises at first in a way similar to the rise of the water level in Wembley stadium. The difference is that the time taken for the number to double is not one minute, but 2 – 3 days.
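
To put a 2 – 3 day doubling time in perspective, here is a purely illustrative Python snippet (assumed doubling times, not an epidemiological model) showing how quickly numbers multiply at those rates.

```python
# Illustration only: how fast numbers grow when they double every 2-3 days.
for doubling_days in (2, 3):
    for total_days in (7, 14, 28):
        factor = 2 ** (total_days / doubling_days)
        print(f"doubling every {doubling_days} days: x{factor:,.0f} after {total_days} days")
```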

As of today, Monday 23rd March, both the number of diagnosed cases and the number of deaths in the UK are still rising exponentially. The situation in the UK now is almost exactly what it was in Italy 15 days ago. This chart, from Inigo Martincorena (@imartincorena), shows it beautifully.

corena-tweet

italy+14 days

corena-tweet-2

Boris Johnson’s weak and woolly approach will probably cost many thousands of lives and millions of pounds.

Up to now I’ve resisted the temptation to suggest that Whitty and Vallance might have been influenced by Dominic Cummings. After this revelation, in yesterday’s Sunday Times, it’s getting progressively harder to believe that.

cummings-sunday-times

We have been self-isolating since March 12th, well before Johnson advised it. It was obvious common sense.

Please stay at home and see nobody if you possibly can. This cartoon, by Toby Morris of thespinoff.co.nz, shows why.

spinoff.co.nz

Some good reading

The report from Imperial College, 16 March 2020, that seems to have influenced the government:

Tomas Pueyo. His piece on “Coronavirus: Why You Must Act Now“, dated March 10th, had 40 million views in a week

Tomas Pueyo. March 19th; What the Next 18 Months Can Look Like, if Leaders Buy Us Time.

“Some countries, like France, Spain or Philippines, have since ordered heavy lockdowns. Others, like the US, UK, or Switzerland, have dragged their feet, hesitantly venturing into social distancing measures.”

David Spiegelhalter. March 21st.

“So, roughly speaking, we might say that getting COVID-19 is like packing a year’s worth of risk into a week or two. Which is why it’s important to spread out the infections to avoid the NHS being overwhelmed.”

Washington Post. Some excellent animations March 14th

Up to date statistics. Worldometer is good (allows semi-log plots too).

Jump to follow-up

On Sunday 23 September, we recorded an interview with Rosi Sexton. Ever since I got to know her, I’ve been impressed by her polymathy. She’s a musician, a mathematician and a champion athlete, and now an osteopath: certainly an unusual combination. You can read about her on her Wikipedia page: https://en.wikipedia.org/wiki/Rosi_Sexton.

The video is long and wide-ranging, so I’ll give some bookmarks, in case you don’t want to watch it all. (And please excuse my garish London marathon track suit.)

Rosi recently started to take piano lessons again, after a 20 year break. She plays Chopin in the introduction, and Prokofiev and Schubert at 17:37 – 20:08. They are astonishingly good, given the time that’s elapsed since she last played seriously.

We started corresponding in 2011, about questions concerning evidence and alternative medicine as well as sports. Later we talked about statistics too: her help is acknowledged in my 2017 paper about p values. And discussions with her gave rise to the slide at 26:00 in my video on that topic.

Rosi’s accomplishments in MMA have been very well-documented and my aim was to concentrate on her other achievements. Nonetheless we inevitably had to explore the reasons why a first class mathematician chose to spend 14 years of her life in such a hard sport. I’m all for people taking risks if they want to. I have more sympathy for her choice than many of my friends, having myself spent time doing boxing, rugby, flying, sailing, long distance running, and mountain walking. I know how they can provide a real relief from the pressures of work.

The interview starts by discussing when she started music (piano, age 6) and how she became interested in maths. In her teens, she was doing some quite advanced maths: she relates later (at 1:22:50) how she took on holiday some of Raymond Smullyan’s books on mathematical logic at the age of 15 or 16. She was also playing the piano and the cello in the Reading Youth Orchestra, and became an Associate of the London College of Music at 17. And at 14 she started Taekwondo, which she found helpful in dealing with teenage demons.

She was so good at maths that she was accepted at Trinity College, Cambridge where she graduated with 1st class hons. And then went on to a PhD, at Manchester. It was during her PhD that she became interested in MMA. We talk at 23:50 about why she abandoned maths (there’s a glimpse of some of her maths at 24:31), and devoted herself to MMA until she retired from that in 2014. In the meantime she took her fifth degree, in osteopathy, in 2010. She talks about some of her teenage demons at 28:00.

Many of my sceptical friends regard all osteopaths as quacks. Some certainly are. I asked Rosi about this at 38:40 and her responses can’t be faulted. She agrees that it’s rarely possible to know whether the treatments she uses are effective or whether the patient would have improved anyway. She understands regression to the mean. We discussed the problem of responders and non-responders. She appreciates that it’s generally not possible to tell whether or not they exist (for more on this, see Stephen Senn’s work). Even the best RCT tells us only about the average response. Not all osteopaths are the same.

We talk about the problems of doping and of trans competitors in sports at 49:30, and about the perception of contact sports at 59:32. Personally I have no problem with people competing in MMA, boxing or rugby, if that’s what they want to do. Combat sports are the civilised alternative to war. It isn’t the competitors that I worry about, it’s the fans.

At 1:14:28 we discussed how little is known about the long-term dangers of contact sports. The possible dangers of concussion led to a discussion of Russell’s paradox at 1:20:40.

I asked why she’s reluctant to criticise publicly things like acupuncture or “craniosacral therapy” (at 1:25:00). I found her answers quite convincing.

At 1:43:50, there’s a clip taken from a BBC documentary of Rosi’s father speaking about his daughter’s accomplishments, her perfectionism and her search for happiness.

Lastly, at 1:45:27, there’s a section headed “A happy new beginning”. It documents Rosi’s 40th birthday treat, when she and her new partner, Stephen Caudwell, climbed the highest climbing wall in the world, the Luzzone dam. After they walked down at the end of the climb, they got engaged.

I wish them both a very happy future.

Postscript. Rosi now runs the Combat Sports Clinic. They have recently produced a video about neck strength training, designed to help people who do contact sports -things like rugby, boxing, muay thai and MMA. I’ve seen only the preview, but there is certainly nothing quackish about it. It’s about strength training.

Jump to follow-up

If you are not a pharmacologist or physiologist, you may never have heard of Bernard Ginsborg. I first met him in 1960. He was a huge influence on me and a great friend. I’m publishing this here because the Physiological Society has published only a brief obituary.

Bernard & Andy

Bernard with his wife, Andy (Andrina).

You can download the following documents.

I’ll post my own recollections of Bernard here.

Bernard Ginsborg was a lecturer in the Pharmacology department in Edinburgh when I joined that department in 1960, as a PhD student.

I recall vividly our first meeting in the communal tea room: smallish in stature, large beard and umbrella. My first reaction was ‘is this chap ever going to stop talking?’. My second reaction followed quickly: this chap has an intellect like nobody I’d encountered before.

I’d been invited to Edinburgh by Walter Perry, who had been external examiner for my first degrees in Leeds. In my 3rd year viva, he’d asked me to explain the difference between confidence limits and fiducial limits. Of course I couldn’t answer, and spent much of my 4th year trying to find out.  I didn’t succeed, but produced a paper that must have impressed him.  He and W.E. Brocklehurst were my PhD supervisors. I saw Perry only when he dropped into my lab for a cigarette between committee meetings, but he treated me very well. He got me a Scottish Hospitals Endowment Trust scholarship which paid twice the usual MRC salary for a PhD student, and he made me an honorary lecturer so that I could join the magnificent staff club on Chambers Street (now gone), where I met, among many others, Peter Higgs, of boson fame.

I very soon came to wish that Bernard was my supervisor rather than Perry. I loved his quantitative approach. A physicist was more appealing to me than a medic.  We spent a lot of time talking and I learnt a huge amount from him.  I had encountered some of Bernard Katz’s papers in my 4th undergraduate year, and realised they were something special, but I didn’t know enough about electrophysiology to appreciate them fully. Bernard explained it all to me.  His 1967 review, Ion movements in junctional transmission, is a classic: still worth reading by anyone interested in electrophysiology. Bernard’s mathematical ability was invaluable to me when, during my PhD, I was wrestling with the equations for diffusion in a cylinder with binding (see appendix here).

The discussions in the communal tea room were hugely educational. Dick Barlow and R.P. Stephenson were especially interesting, and I soon came to realise that Bernard had a better grasp of quantitative ideas about receptors than either of them. His use of Laplace transforms to solve simultaneous differential equations in a 1974 paper was my first introduction to them, and that proved very useful to me later. Those discussions laid the ground for a life-long interest in the topic for me.

After I left the pharmacology department in 1964, contact became more intermittent for a while. I recall vividly a meeting held in Crans sur Sierre, Switzerland, in 1977. The meetings there were good, despite having started as golfing holidays for J. Murdoch Ritchie and Joe Hoffman.  There was a certain amount of tension between Bernard and Charles F. Stevens, the famous US electrophysiologist. Alan Hawkes and I had just postulated that the unitary event in ion channel opening at the neuromuscular junction was a short burst of openings rather than single openings.  This contradicted the postulate by Anderson & Stevens (1973) that binding of the agonist was very fast compared with the channel opening and shutting. At the time our argument was theoretical –it wasn’t confirmed experimentally until the early 80s.  Bernard was chairing a session and he tried repeatedly to get Stevens to express an opinion on our ideas, but failed.

At dinner, Stevens was holding court: he expressed the view that rich people shouldn’t pay tax because there were too few of them and it cost more to get them to pay up than it was worth.  He sat back to wait for the angry protestations of the rest of the people at the table.  He hadn’t reckoned with Bernard, who said how much he agreed, and that, by the same token, the police shouldn’t waste time trying to catch murderers: there were too few of them and it wasted too much police time.  The argument was put eloquently as only Bernard could do it.  Stevens, who I suspect had not met Bernard before, was uncharacteristically speechless. He had no idea what had hit him.  It was a moment to savour.

M+BLG 1977
May 1977, Crans sur Sierre, Switzerland.

For those who knew Bernard, it was another example of his ability to argue eloquently for any proposition whatsoever. I’d been impressed by his speech on how the best way to teach about drugs was to teach them in alphabetical order: it would make as much sense as any other way of categorising them.  Usually there was just enough truth in these propositions to make a listener who hadn’t heard him in action before wonder, for a moment, whether he was serious.  The picture shows him with my wife, Margaret, at the top of a mountain during a break in the meeting.  He’d walked up, putting those of us who’d taken the train to shame.

In 1982, Alan Hawkes and I published a paper with the title “On the stochastic properties of bursts of single ion channel openings and of clusters of bursts”.  It was 59 pages long with over 400 equations, most of which used matrix notation. After it had been accepted, I discovered that Bernard had taken on the heroic job of reviewing it.  This came to light when I got a letter from him that consisted of two matrices which, when multiplied out, revealed his role.

For many years Bernard invited me to Edinburgh to talk to undergraduates about receptors and electrophysiology. (I’ve often wondered if that’s why more of our postdocs came from Glasgow than from Edinburgh during that time.)  It was on one such visit in 1984 that I got a phone call to say that my wife, Margaret, had collapsed on the railway station at Walton-on-Thames while 6 months pregnant, and had been taken to St Peter’s Hospital in Chertsey.  The psychodrama of our son’s birth has been documented elsewhere.  A year later we came to Edinburgh once again. The pictures taken then show Bernard looking impishly happy, as he very often did, in his Edinburgh house in Magdala Crescent. The high rooms were lined with books, all of which he seemed to have read.  His intellect was simply dazzling.

BLG 1985

M&DC-Magdala-Crescent

 December 19th 1985. Magdala Crescent, Edinburgh

The following spring we visited again, this time with our son Andrew, aged around 15 months. We went with Bernard and Andy to the Edinburgh Botanic Gardens. Andrew, who was still not walking, crawled away rapidly up a grassy slope. Andy said don’t worry, when he gets to the top he’ll stop and look back for you.  She was a child psychologist so we believed her. Andrew, however, disappeared from sight over the brow of the hill.

During these visits, we stayed with Bernard and Andy at their Edinburgh house.

The experience of staying with them was like being exposed to an effervescent intellectual fountain. It’s hard to think of a better matched couple.

After Bernard retired in 1985, he took no further interest in science.  For him, it was a chance to spend time on his many other interests.  After he went to live in France, contact became more intermittent. Occasional emails were exchanged.  It was devastating to hear about the death of Andy in 2013.  The last time that I saw both of them was in 2008, at John Kelly’s house.  He was barely changed from the day that I met him in 1960.

Bernard was a legend.  It’s hard to believe that he’s no longer here.

Kelly's 2008

BLG+cat 2008
Bernard in 2008 at John Kelly’s house.

Lastly, here is a picture taken at the 2009 meeting of the British Pharmacological Society, held in Edinburgh.

BPS-2009
At the British Pharm. Soc. meeting, 2009. Left to right: DC, BLG, John Kelly, Mark Evans, Anthony Harmer

 

Follow-up

Jump to follow-up

See also The history of eugenics at UCL: the inquiry report.

On Monday evening (8th January 2018), I got an email from Ben van der Merwe, a UCL student who works as a reporter for the student newspaper, London Student.  He said

“Our investigation has found a ring of academic psychologists associated with Richard Lynn’s journal Mankind Quarterly to be holding annual conferences at UCL. This includes the UCL psychologist professor James Thompson”.

He asked me for comment about the “London Conference on Intelligence”. His piece came out on Wednesday 10th January. It was a superb piece of investigative journalism.  On the same day, Private Eye published a report on the same topic.

I had never heard about this conference, but it quickly became apparent that it was a forum for old-fashioned eugenicists of the worst kind.  Perhaps it isn’t surprising that neither I, nor anyone else at UCL that I’ve spoken to had heard of these conferences because they were surrounded by secrecy.  According to the Private Eye report:

“Attendees were only told the venue at the last minute and asked not to share the information”

The conference appears to have been held at least twice before. The programmes for the 2015 conference [download pdf] and the 2016 conference [download pdf] are now available, but weren’t public at the time.   They have the official UCL logo across the top despite the fact that Thompson has been only an honorary lecturer since 2007.

LCI header

A room was booked for the conference through UCL’s external room booking service. The abstracts are written in the style of a regular conference. It’s possible that someone with no knowledge of genetics (as is likely to be the case for room-booking staff) might have not spotted the problem.

The huge problems are illustrated by the London Student piece, which identifies many close connections between conference speakers and far-right, and neo-nazi hate groups.

“[James Thompson’s] political leanings are betrayed by his public Twitter account, where he follows prominent white supremacists including Richard Spencer (who follows him back), Virginia Dare, American Renaissance, Brett Stevens, the Traditional Britain Group, Charles Murray and Jared Taylor.”

“Thompson is a frequent contributor to the Unz Review, which has been described as “a mix of far-right and far-left anti-Semitic crackpottery,” and features articles such as ‘America’s Jews are Driving America’s Wars’ and ‘What to do with Latinos?’.

His own articles include frequent defences of the idea that women are innately less intelligent than men (1, 2, 3,and 4), and an analysis of the racial wage gap which concludes that “some ethnicities contribute relatively little,” namely “blacks.”

“By far the most disturbing of part of Kirkegaard’s internet presence, however, is a blog-post in which he justifies child rape. He states that a ‘compromise’ with paedophiles could be:

“having sex with a sleeping child without them knowing it (so, using sleeping medicine. If they don’t notice it is difficult to see how they cud be harmed, even if it is rape. One must distinguish between rape becus the other was disconsenting (wanting to not have sex), and rape becus the other is not consenting, but not disconsenting either.”

The UCL Students’ Union paper, Cheesegrater, lists some of James Thompson’s tweets, including some about brain size in women.

Dr Alice Lee

It’s interesting that these came to light on the same day that I learned that the first person to show that there was NO correlation between brain size and intelligence was Dr Alice Lee, in 1901:  A First Study of the Correlation of the Human Skull. Phil. Trans. Roy. Soc A  https://doi.org/10.1098/rsta.1901.0005 [download pdf].

Alice Lee published quite a lot, much of it with Pearson. In 1903, for example, On the correlation of the mental and physical characters in man. Part II, Alice Lee, Marie A. Lewenz and Karl Pearson, https://doi.org/10.1098/rspl.1902.0070  [download pdf].  She shows herself to be quite feisty in this paper -she says, of a paper with conclusions that differ from hers,

“Frankly, we consider that the memoir is a good illustration of how little can be safely argued from meagre data and a defective statistical theory.”

She also published a purely mathematical paper, “On the Distribution of the Correlation Coefficient in Small Samples”,  H. E. Soper, A. W. Young, B. M. Cave, A. Lee and K. Pearson, Biometrika, 11, 1917, pp. 328-413 (91 pages) [download pdf].  There is interesting comment on this paper in encyclopedia.com.

Alice Lee was the first woman to get a PhD in mathematics from UCL and she was working in the Galton laboratory, under Karl Pearson. Pearson was a great statistician but also an extreme eugenicist.  It was good to learn that he supported women in science at a time when that was almost unknown.  The Dictionary of National Biography says

“He considered himself a supporter of equal rights and opportunities for women (later in his capacity as a laboratory director he hired many female assistants), yet he also expressed a willingness to subordinate these ideals to the greater good of the race.”

But it must never be forgotten that Karl Pearson said, in 1934,

” . . . that lies rather in the future, perhaps with Reichskanzler Hitler and his proposals to regenerate the German people. In Germany a vast experiment is in hand, and some of you may live to see its results. If it fails it will not be for want of enthusiasm, but rather because the Germans are only just starting the study of mathematical statistics in the modern sense!”

And if you think that’s bad, remember that Ronald Fisher, after World War 2, said, in 1948,

“I have no doubt also that the [Nazi] Party sincerely wished to benefit the German racial stock, especially by the elimination of manifest defectives,
such as those deficient mentally, and I do not doubt that von Verschuer gave, as I should have done, his support to such a movement.”

For the context of this comment, see Weiss (2010).

That’s sufficient reason for the removal of their names from buildings at UCL.

What’s been done so far?

After I’d warned UCL of the impending scandal, they had time to do some preliminary investigation. An official UCL announcement appeared on the same day (10 Jan, 2018) as the articles were published.

“Our records indicate the university was not informed in advance about the speakers and content of the conference series, as it should have been for the event to be allowed to go ahead”

“We are an institution that is committed to free speech but also to combatting racism and sexism in all forms.”

“We have suspended approval for any further conferences of this nature by the honorary lecturer and speakers pending our investigation into the case.”

That is about as good as can be expected. It remains to be seen why the true nature of the conferences was not spotted, and it remains to be seen why someone like James Thompson was an honorary senior lecturer at UCL. Watch this space.

How did it happen

Two videos that feature Thompson are easily found. One, from 2010, is on the UCLTV channel. And in March 2011, a BBC World News video featured Thompson.

But both of these videos are about his views on disaster psychology (Chilean miners, and Japanese earthquake, respectively). Neither gives any hint of his extremist political views. To discover them you’d have to delve into his twitter account (@JamesPsychol) or his writings on the unz site.  It’s not surprising that they were missed.

I hope we’ll know more soon about how these meetings slipped under the radar.  Until recently, they were very secret.  But then six videos of talks at the 2017 meeting were posted on the web, by the organisers themselves. Perhaps they were emboldened by the presence of an apologist for neo-nazis in the White House, and by the government’s support for Toby Young, who wrote in support of eugenics. The swing towards far-right views in the UK, in the USA and in Poland, Hungary and Turkey, has seen a return to public discussions of views that have been thought unspeakable since the 1930s. See, for example, this discussion of eugenics by Spectator editor Fraser Nelson with Toby Young, under the alarming heading “Eugenics is back“.

The London Conference on Intelligence channel used the UCL logo, and it was still public on 10th January. It had only 49 subscribers. By 13th January it had been taken down (apparently by its authors). But it still has a private playlist with four videos which have been viewed only 36 times (some of those views were mine). Before it vanished, I made a copy of Emil Kirkegaard’s talk, for the record.

youtube channel

Freedom of speech

Incidents like this pose difficult problems, especially given UCL’s past history. Galton and Pearson supported the idea of eugenics at the beginning of the 20th century, as did George Bernard Shaw. But modern geneticists at the Galton lab have been at the forefront in showing that these early ideas were simply wrong.

UCL has, in the past, rented rooms for conferences of homeopaths. Their ideas are deluded and sometimes dangerous, but not illegal. I don’t think they should be arrested, but I’d much prefer that their conferences were not at UCL.

A more serious case occurred on 26 February 2008. The student Islamic Society invited  representatives of the radical Islamic creationist, Adnan Oktar, to speak at UCL. They were crowing that the talk would be held in the Darwin lecture theatre (built in the place formerly occupied by Charles Darwin’s house on Gower Street). In the end, the talk was allowed to go ahead, but it was moved by the then provost to the Gustave Tuck lecture theatre, which is much smaller, and which was built from a donation by the former president of the Jewish Historical Society. See more accounts here, here and here. It isn’t known what was said, so there is no way to tell whether it was illegal, or just batty.

It is very hard to draw the line between hate talk and freedom of speech.  There was probably nothing illegal about what was said at the Intelligence Conferences.  It was just bad science, used to promote deeply distasteful ideas.

Although, in principle, renting a room doesn’t imply any endorsement, in practice all crackpot organisations love to use the name of UCL to promote their cause. That alone is sufficient reason to tell these people to find somewhere else to promote their ideas.

Follow up in the media

For a day or two the media were full of the story. It was reported, for example, in the Guardian and in the Jewish Chronicle.

On 11th January I was asked to talk about the conference on BBC World Service. The interview can be heard here.

speakerClick to play the interview.

The real story

Recently some people have demanded that the names of Galton and Pearson should be expunged from UCL.

There would be a case for that if their 19th century ideas were still celebrated, just as there is a case for removing statues that celebrate confederate generals in the southern USA.  Their ideas about measurement and statistics are justly celebrated. But their ideas about eugenics are not celebrated.

On the contrary, it is modern genetics, done in part by people in the Galton lab, that has shown the wrongness of 19th century views on race. If you want to know the current views of the Galton lab, try these.  They could not be further from Thompson’s secretive pseudoscience.

Steve Jones’ 2015 lecture “Nature, nurture or neither: the view from the genes”,

or “A matter of life and death: To condemn the study of complex genetic issues as eugenics is to wriggle out of an essential debate“.

Or check the writing of UCL alumnus, Adam Rutherford: “Why race is not a thing, according to genetics”,

or, from Rutherford’s 2017 article

“We’ve known for many years that genetics has profoundly undermined the concept of race”

“more and more these days, racists and neo-Nazis are turning to consumer genetics to attempt to prove their racial purity and superiority. They fail, and will always fail, because no one is pure anything.”

“the science that Galton founded in order to demonstrate racial hierarchies had done precisely the opposite”

Or read this terrific account of current views by Jacob A Tennessen “Consider the armadillos“.

These are accounts of what geneticists now think. Science has shown that views expressed at the London Intelligence Conference are those of a very small lunatic fringe of pseudo-scientists. But they are already being exploited by far-right politicians.

It would not be safe to ignore them.

Follow-up

15 January 2018. The involvement of Toby Young

The day after this was posted, my attention was drawn to a 2018 article by the notorious Toby Young. In it he confirms the secretiveness of the conference organisers.

“I discovered just how cautious scholars in this field can be when I was invited to attend a two-day conference on intelligence at University College London by the academic and journalist James Thompson earlier this year. Attendees were only told the venue at the last minute – an anonymous antechamber at the end of a long corridor called ‘Lecture
Room 22’ – and asked not to share the information with anyone else.”

More importantly, it shows that Toby Young has failed utterly to grasp the science.

“You really have to be pretty stubborn to dispute that general cognitive ability is at least partly genetically based.”

There is nobody who denies this.
The point is that the interaction of nature and nurture is far more subtle than Young believes, and that makes attempts to separate them quantitatively futile. He really should educate himself by looking at the accounts listed above (The real story)

16 January 2018. How UCL has faced its history

Before the current row about the “London Intelligence Conference”, UCL has faced up frankly to its role in the development of eugenics. It started at the height of Empire, in the 19th century, and continued into the early part of the 20th century. The word “eugenics” has not been used at UCL since it fell into the gravest disrepute in the 1930s. Not, that is, until James Thompson and Toby Young brought it back. The history has been related by curator and science historian, Subhadra Das. You can read about it, and listen to episodes of her podcast, at “Bricks + Mortals, A history of eugenics told through buildings“. Or you can listen to her whole podcast.

Although Subhadra Das describes Galton as the Victorian scientist that you’ve never heard of, I was certainly well aware of his ideas before I first came to UCL (in 1964). But at that time I thought of Karl Pearson only as a statistician, and I doubt if I’d even heard of Flinders Petrie. Learning about their roles was a revelation.

17 January 2018.

Prof Semir Zeki has pointed out to me that it’s not strictly true to say “the word ‘eugenics’ has not been used at UCL since it fell into the gravest disrepute in the 1930s”. It’s true that nobody advocated it, but the chair of Eugenics was not renamed the chair of Human Genetics until 1963. This certainly didn’t imply approval. Zeki tells me that its holder, Lionel Penrose, mentioned his distaste for the title, saying that it was a hangover from the past and should be changed.

Sarah Ferguson, ex-wife of Prince Andrew, Duke of York, seems to need a lot of money. Some of her wheezes are listed in today’s Times. That’s behind a paywall, as is the version reproduced in The Australian (Murdoch connection presumably). You can read it (free) here, with more details below the article.

Duchess

Thomas Ough and David Brown

Published at 12:01AM, January 15 2015

In her seemingly endless quest to make money, Sarah, Duchess of York, has had little hesitation using her title to generate sales.

This week, though, she landed herself in trouble after appearing to use the name of Britain’s foremost scientific university to lend credibility to a promotion for her new diet system.

The duchess told NBC’s Today show during an interview to promote her “emulsifier” programme that she was aware of the dangers of obesity through her work as an ambassador for the Institute of Global Health Improvement at Imperial College London.

Last night she apologised for “any misunderstanding” after Imperial College, ranked the joint second-best university in the world, sought to distance itself from the duchess’s promotion.

A spokesman said: “The commercial activities promoted by Sarah Ferguson in the interview with Today are not connected in any way to Imperial’s staff or research activities, and the college does not endorse the suggestion of any possible link.”

The institute, which has more than 160 specialists, including clinicians, engineers, scientists and psychologists, is headed by Lord Darzi of Denham, a former Labour health minister.

The duchess told the Today presenter Matt Lauer that she had been a comfort eater since the age of 12 but the “turning point” was when she realised that she was the same weight as when pregnant with Princess Beatrice, now 25.

“I couldn’t bear looking at myself any minute longer,” she confided. “In fact, the size of my ass probably saved my life.” She said she discovered that the “emulsifier” was “a solution for behavioural change” and helped her to lose 55lbs. The $99 kit, which includes a blender, a couple of recipe books and some workout DVDs, is produced by Tristar Products, a direct marketing company for home and health items.

The duchess told the breakfast show: “I have just found out on my discoveries with Imperial College London . . . I’m an ambassador for the Institute for Global Health Innovation, and I found out that children, little children, are going to die before their parents because of obesity.”

The benefits of the kit were questioned yesterday by Ayela Spiro, a senior scientist at the British Nutrition Foundation.

She said: “In terms of the particular product, no juicer or blender on their own can enhance how much nutrition your body will absorb. Any claims made about such products such that it accelerates weight loss, boosts energy and strengthens the immune system need to be treated with caution.”

Professor David Colquhoun of University College London, said: “I find it pretty amazing that Imperial chose someone like her to be an ‘ambassador’. Imperial does have an interest in appetite suppression but hasn’t come up with any usable product yet and this research has nothing to do with blenders.

“[Her television appearance] was sheer name-dropping, something she’s quite good at. The only ‘discovery’ she seems to have made is that if you eat less you’ll lose weight. The $100 blender has nothing to do with it.”

A spokesman for the duchess said: “She is not trying to use her association with the institute to promote her personal interests. She was talking about ‘behavioural change’, which is endorsed by the institute, and her own behavioural change.”

With the article there’s an inset that gives details of other ways in which Sarah Ferguson has exploited her title to make money.

duchess business

Fergie’s latest wheeze, Duchess Discoveries, is being promoted heavily on US television. It bears a close resemblance to those ghastly daytime TV advertising channels. Watch her interview on a US TV programme, "Today".

It’s partly promoting her latest diet scam, and partly a vigorous defence of her ex-husband’s innocence in the face of allegations of sexual shenanigans. Of course she doesn’t know whether the allegations are true. The Queen doesn’t know (so why bother with the denial from Buckingham Palace?). And I don’t know. We know plenty about Prince Andrew’s bad behaviour, but we don’t know whether he’s had sex with minors.

Worse still is the promotional video on the “Duchess Discoveries” site itself.

I quote:

“I’m SO excited about my fusion accelerator system, accelerates weight loss, accelerates your energy, accelerates and strengthens your immune system.”

"accelerates weight loss" is certainly unproven. Mere hype

"accelerates your energy" is totally meaningless. It’s the sort of sciencey-sounding words that are loved by all quacks.

"accelerates and strengthens your immune system". Sigh. "strengthening the immune system is the perpetual mantra of just about every quack. It’s totally meaningless. Just made-up nutribollocks.

The promotional video is fraudulent nonsense. If it were based in the UK I have no doubt that it would be quickly slapped down by the Advertising Standards Authority. But in the USA the first amendment allows people to lie freely about nutrition, which is why it’s such big business.

It bothers me that the best the British Nutrition Foundation could manage was to say that such claims "need to be treated with caution". They are mendacious nonsense. Why not just say so?

Follow-up

Jump to follow-up

The College of Medicine is well known to be the reincarnation of the late unlamented Prince of Wales Foundation for Integrated Health. I labelled it as a Fraud and Delusion, but that was perhaps over-generous. It seems to be morphing into a major operator in the destruction of the National Health Service through its close associations with the private health industry.

Their 2012 Conference was held on 3rd May. It has a mixture of speakers, some quite sound, some outright quacks. It’s a typical bait and switch event. You can judge its quality by the fact that the picture at the top of the page that advertises the conference shows Christine Glover, a homeopathic pharmacist who makes a living by selling sugar pills to sick people (and a Trustee of the College of Medicine).

Her own company’s web site says

"Worried about beating colds and flu this winter? We have several approaches to help you build your immune system."

The approaches are, of course, based on sugar pills. The claim is untrue and dangerous. My name for that is fraud.

glover

When the "College of Medicine" started it was a company, but on January 30th 2012, it was converted to being a charity. But the Trustees of the charity are the same people as the directors of the company. They are all advocates of ineffective quack medicine. The contact is named as Linda Leung, who was Operations Director of the Prince’s Foundation until it closed, and then became Company Secretary for the “College of Medicine”.

The trustees of the charity are the same people who were directors of the company

  • Dr Michael Dixon, general practitioner. Michael Dixon was Medical Director of the Prince’s Foundation until it closed down.
  • Professor George Lewith, is Professor of Health Research in the Complementary Medicine Research Unit, University of Southampton. He was a Foundation Fellow of the Prince’s Foundation until it closed down. Much has been written about him here.
  • Professor David Peters is Professor of Integrated Healthcare and Clinical Director at the University of Westminster’s School of Integrated Health. He’s famous for allowing dowsing with a pendulum as a method of diagnosis for treatment with unproven herbal medicines. He was a Foundation Fellow of the Prince’s Foundation until it closed down.
  • Mrs Christine Glover is a pharmacist who sells homeopathic pills. She was a Foundation Fellow of the Prince’s Foundation until it closed down.

The involvement of Capita

According to their web site

"A Founder of the College of Medicine is Capita."

Still more amazingly, the CEO of the College of Medicine is actually an employee of Capita too.

"Mark Ratnarajah is interim CEO of the College of Medicine as well as Business Director at Capita Health and Wellbeing."

brunjes

That isn’t the end of it. The vice-president of the College of Medicine is Dr Harry Brunjes. There is an article about him in the May 2012 issue of Director Magazine.

It has to be said that he doesn’t sound like a man with much interest in the National Health Service.

Within 9 years of graduating he set up in private practice in Harley Street. Five years later he set up Premier Medical, which, after swallowing a couple of rivals, he sold to Capita for £60 million. He is now recorded in a Companies House document as Dr Henry Otto Brunjes, a director of Capita Health Holdings Limited. This company owns all the shares in Capita Health and Wellbeing Limited, and it is, in turn, owned by Capita Business Services Limited. And they are owned by Capita Holdings Limited. I do hope that this baroquely complicated array of companies with no employees has nothing to do with tax avoidance.

Capita is, of course, a company with a huge interest in the privatisation of health care. It also has a pretty appalling record for ripping off the taxpayer.

It has long been known in Private Eye as “Crapita” and “the world’s worst outsourcing firm”.

Capita were responsible for the multimillion pound failed/delayed IT project for the NHS and HMRC. They messed up staff administration services at Leicester Hospitals NHS Trust and at the BBC, where staff details were lost.  They failed to provide sufficient computing systems for the Criminal Records Bureau, which caused lengthy delays.  Capita were also involved in the failure of the Individual Learning Accounts following a £60M over-spend. And most recently, they have caused the near collapse of court translation services after their acquisition of Applied Language Services.

With allies like that, perhaps the College of Medicine hardly needs enemies. No doubt Capita will be happy to provide the public with quackery for an enormous fee from the taxpayer.

dixon

One shouldn’t be surprised that the College is involved in Andrew Lansley’s attempts to privatise healthcare. Michael Dixon, Chair of the College of Medicine, also runs the "NHS Alliance", almost the only organisation that supported the NHS Bill. The quackery at his own practice defies belief (some it is described here).

One would have thought that such a close association with a company with huge vested interests would not be compatible with charitable status. I’ve asked the Charity Commission about that. The Charity Commission, sadly, makes no judgements about the worthiness of the objects of the charities it endorses. All sorts of dangerous quack organisations are registered charities, like, for example, Yes to Life.

Secrecy at the College of Medicine

One of the big problems about the privatisation of medicine and education is that you can’t use the Freedom of Information Act to discover what they are up to. A few private companies try to abide by that act, despite not being obliged to do so. But the College of Medicine is not one of them.

Capita. They refuse to disclose anything about their relationship with Capita. I asked Graeme Catto, who is a friend (despite the fact that I think he’s wrong). I got nothing.

"Critical appraisal" I also asked Catto for the teaching materials used on a course that they ran about "critical appraisal". Any university is obliged, by the decision of the Information Tribunal, to produce such material on request. The College of Medicine refused, point blank. What, one wonders, have they got to hide? Their refusal strikes me as deeply unethical.

The course (costing £100) on Critical Appraisal, ran on February 2nd 2012. The aims are "To develop introductory skills in the critical appraisal of randomised controlled trials (RCTs) and systematic reviews (SRs)". That sounds good. Have they had a change of heart about testing treatments?

But, as always, you have to look at who is running the course. Is it perhaps a statistician with expertise in clinical trials? Or is it a clinician with experience in running trials? There are plenty of people with this sort of expertise. But no, it is being run by a pharmacist, Karen Pilkington, from that hotbed of unscientific medicine, the University of Westminster.

Pilkington originally joined the University of Westminster as manager for a 4-year project to review the evidence on complementary therapies (funded by the Department of Health). All of her current activities centre round alternative medicine and most of her publications are in journals that are dedicated to alternative medicine. She teaches "Critical Appraisal" at Westminster too, so I should soon have the teaching materials, despite the College’s attempts to conceal them.

Three people who ought to know better

One has to admire, however grudgingly, the way that the quacks who run the College of Medicine have managed to enlist the support of several people who really should know better. I suppose they have been duped by that most irritating characteristic of quacks, the tendency to pretend they have a monopoly on empathetic treatment of patients. We all agree that empathy is good, but every good doctor has it. One problem seems to be that senior medical people are not very good at using Google. They don’t do their homework.

catto

Professor Sir Graeme Catto MD DSc FRCP FMedSci FRSE is president of the College of Medicine. He’s Emeritus Professor of Medicine at the University of Aberdeen. He was President of the General Medical Council from 2002 to 2009, Pro Vice-Chancellor of the University of London, and Dean of Guy’s, King’s and St Thomas’ medical school between 2000 and 2005. He’s a nice and well-meaning chap, but he doesn’t seem to know much about what’s going on in the College.

kennedy

Professor Sir Ian Kennedy LLD, FBA, FKC, FUCL, Hon.DSc(Glasgow), Hon.FRCP is vice-president of the College. Among many other things he is Emeritus Professor of Health Law, Ethics and Policy at University College London. He was Chair of the Healthcare Commission until 2003, when it merged with other regulators to form the Care Quality Commission. No doubt he can’t be blamed for the recent parlous performance of the CQC.

halligan

Professor Aidan Halligan MA, MD, FRCOG, FFPHM, MRCPI has been Director of Education at University College London Hospitals since March 200y. From 2003 until 2005, he was Deputy Chief Medical Officer for England, with responsibility for issues of clinical governance, patient safety and quality of care. He’s undoubtedly a well-meaning man, but so focussed on his (excellent) homelessness project that he seems immune to the company he keeps. Perhaps the clue lies in the fact that when I asked him what he thought of Lansley’s health bill, he seemed to quite like it.

It seems to me to be incomprehensible that these three people should be willing to sign a letter in the British Medical Journal in defence of the College, with co-signatories George Lewith (about whom much has been written here) and the homeopath Christine Glover. In so doing, they betray medicine, they betray reason, and most important of all, they betray patients. Perhaps they have spent too much time sitting on very important committees and not enough time with patients.

The stated aims of the College sound good.

"A force that combines scientific knowledge, clinical expertise and the patient’s own perspective. A force that will re-define what good medicine means − renewing the traditional values of service, commitment and compassion and creating a more holistic, patient-centred, preventative approach to healthcare."

But what they propose to do about it is, with a few exceptions, bad. They try to whip up panic by exaggerating the crisis in the NHS. There are problems of course, but they result largely from under-funding (we still spend less on healthcare than most developed countries), and from the progressive involvement of for-profit commercial companies, like Capita. The College has the wrong diagnosis and the wrong solution. How do they propose to take care of an aging population? Self-care and herbal medicines seem to be their solution.

The programme for the College’s workshop shows it was run by the herbalist Simon Mills and by Dick Middleton, an employee of the giant herbal company Schwabe. You can see Middleton attempting to defend misleading labelling of herbal products on YouTube, opposed by me.

It seems that the College of Medicine are aiding and abetting the destruction of the National Health Service. That makes me angry (here’s why).

I can end only with the most poignant tweet in the run-up to the passing of the Health and Social Care Act. It was from someone known as @HeardInLondon, on March 15th:

"For a brief period during 20th century, people gave a fuck and looked after each other. Unfortunately this proved unprofitable."

Unprofitable for Crapita, that is.

Follow-up

5 May 2012. Well well, if there were any doubt about the endarkenment values of the College, I see that the Prince of Wales, the Quacktitioner Royal himself, gave a speech at the College’s conference.

"”I have been saying for what seems a very long time that until we develop truly integrated systems – not simply treating the symptoms of disease, but actively creating health, putting the patient at the heart of the process by incorporating our core human elements of mind, body and spirit – we shall always struggle, in my view, with an over-emphasis on mechanistic, technological approaches.”

Of course we all want empathy. The speech, as usual, contributes precisely nothing.

12 June 2012. Oh my, how did I manage to miss the fact that the College’s president, Professor Sir Graeme Catto, is also a Crapita employee? It’s over a year since he was appointed to Capita’s clinical governance board. He says: "In a rapidly growing health and wellbeing marketplace, delivering best practice in clinical governance is of utmost importance. I look forward to working with the team at Capita to assist them with continuing to adopt a best in class approach." The operative word is "marketplace".

Jump to follow-up

Open access is in the news again.

Index on Censorship held a debate on open data on December 6th.

The video of the meeting is now on YouTube. A couple of dramatic moments in the video: at 48 min O’Neill and Monbiot face off about "competent persons", and at 58 min Walport makes fun of my contention that it’s better to have more small grants rather than a few big ones, on the grounds that it’s impossible to select the stars.

poster

The meeting has been written up on the Bishop Hill Blog, with some very fine cartoon minutes.

Bishop Hill blog (I love the Josh cartoons -pity he seems to be a climate denier, spoken of approvingly by the unspeakable James Delingpole.)

It was gratifying that my remarks seemed to be better received by the scientists in the audience than they were by some other panel members. The Bishop Hill blog comments: "As David Colquhoun, the only real scientist there and brilliant throughout, said ‘Give them everything!’". Here’s a subsection of the brilliant cartoon minutes.

notes-dc

The bit about "I just lied -but he kept his job" referred to the notorious case of Richard Eastell and the University of Sheffield.

We all agreed that papers should be open for anyone to read, free. Monbiot and I both thought that raw data should be available on request, though O’Neill and Walport had a few reservations about that.

A great deal of time and money would be saved if data were provided on request. It shouldn’t need a Freedom of Information Act (FOIA) request, and the time and energy spent on refusing FOIA requests is silly. It simply gives the impression that there is something to hide (Climate scientists must be ruthlessly honest about data). The University of Central Lancashire spent £80,000 of taxpayers’ money trying (unsuccessfully) to appeal against the judgment of the Information Commissioner that they must release course material to me. It’s hard to think of a worse way to spend money.

A few days ago, the Department for Business, Innovation and Skills (BIS) published a report which says (para 6.6)

“The Government . . . is committed to ensuring that publicly-funded research should be accessible
free of charge.”

That’s good, but how it can be achieved is less obvious. Scientific publishing is, at the moment, an unholy mess. It’s a playground for profiteers. It runs on the unpaid labour of academics, who work to generate large profits for publishers. That’s often been said before, recently by both George Monbiot (Academic publishers make Murdoch look like a socialist) and by me (Publish-or-perish: Peer review and the corruption of science). Here are a few details.

Extortionate cost of publishing

Mark Walport has told me that

The Wellcome Trust is currently spending around £3m pa on OA publishing costs and, looking at the Wellcome papers that find their way to UKPMC, we see that around 50% of this content is routed via the “hybrid option”; 40% via the “pure” OA journals (e.g. PLoS, BMC etc), and the remaining 10% through researchers self-archiving their author manuscripts.  

I’ve found some interesting numbers, with help from librarians, and through access to The Journal Usage Statistics Portal (JUSP).

Elsevier

UCL pays Elsevier the astonishing sum of €1.25 million for access to its journals. And that’s just one university. That price doesn’t include any print editions at all, just web access, and there is no open access. You have to have a UCL password to see the results. Elsevier has, of course, been criticised before, and not just for its prices.

Elsevier publish around 2700 scientific journals. UCL has bought a package of around 2100 journals. There is no possibility of picking just the journals that you want. Some of the journals are used heavily ("use" means access of the full text on the web). In 2010, the most heavily used journal was The Lancet, followed by four Cell Press journals.

elsevier top

But notice the last bin. Most of the journals are hardly used at all. Among all Elsevier journals, 251 were not accessed even once in 2010. Among the 2068 journals bought by UCL, 56 were never accessed in 2010, and the most frequent number of accesses per year is between 1 and 10 (the second bin in the histogram, below). 60 percent of journals had 300 or fewer usages in 2010; above 300, the histogram tails on up to 51,878 accesses for The Lancet. The remaining 40 percent of journals are pooled in the last bin (in red). The distribution is exceedingly skewed. The median is 187 (i.e. half of the journals had fewer than 187 usages in 2010), but the mean number of usages, which is misleading for such a skewed distribution, was 662.

histo
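To see why the mean is such a poor summary for a distribution this skewed, here is a minimal sketch in Python. The usage counts are made up for illustration (they are not the JUSP figures): one heavily used title is enough to drag the mean far above the median.

```python
# Minimal sketch with made-up usage counts (not the real JUSP data):
# a single heavily used journal drags the mean far above the median.
import statistics

# hypothetical annual full-text accesses for ten journals in a bundle
usages = [0, 2, 8, 25, 60, 150, 200, 400, 3000, 52000]

print("median:", statistics.median(usages))  # 105.0  -- the 'typical' journal
print("mean:  ", statistics.mean(usages))    # 5584.5 -- dominated by the top title
```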

Nature Publishing Group

UCL bought 65 journals from NPG in 2010. They get more use than Elsevier, though surprisingly three of them were never accessed in 2010, and 17 had fewer than 1000 accesses in that year. The median usage was 2412, better than most. The leader, needless to say, was Nature itself, with 153,321.

Oxford University Press

The situation is even more extreme for 248 OUP journals, perhaps because many of the journals are arts or law rather than science.

OUP-jisto

The most frequent (modal) usage was zero (54 journals), followed by 1 to 10 accesses (42 journals). 64 percent of journals had fewer than 200 usages, and the 36 percent with over 200 are pooled in the last (red) bin. The histogram extends right up to 16,060 accesses for Brain. The median number of usages in 2010 was 66.

So far I haven’t been able to discover the costs of the contracts with OUP or Nature Publishing Group. It seems that the university has agreed to confidentiality clauses. This is itself a shocking lack of transparency. If I can find the numbers I shall publish them -watch this space.

Almost all of these journals are not open access. The academics do the experiments, most often paid for by the taxpayer. They write the paper (and now it has to be in a form that is almost ready for publication without further work), they send it to the journal, where it is sent for peer review, which is also unpaid. The journal sells the product back to the universities for a high price, where the results of the work are hidden from the people who paid for it.

It’s even worse than that, because often the people who did the work and wrote the paper, have to pay "page charges". These vary, but can be quite high. If you send a paper to the Journal of Neuroscience, it will probably cost you about $1000. Other journals, like the excellent Journal of Physiology, don’t charge you to submit a paper (unless you want a colour figure in the print edition, £200), but the paper is hidden from the public for 12 months unless you pay $3000.

The major medical charity, the Wellcome Trust, requires that the work it funds should be available to the public within 6 months of publication. That’s nothing like good enough to allow the public to judge the claims of a paper which hits the newspapers the day that it’s published. Nevertheless it can cost the authors a lot. Elsevier journals charge $3000 except for their most-used journals. The Lancet charges £400 per page and Cell Press journals charge $5000 for this unsatisfactory form of open access.

Open access journals

The outcry about hidden results has resulted in a new generation of truly open access journals that are open to everyone from day one. But if you want to publish in them you have to pay quite a lot.

Furthermore, although all these journals are free to read, most of them do not allow free use of the material they publish. Most operate under all-rights-reserved copyright. In 2009, under 10 percent of open access journals had a true Creative Commons licence.

Nature Publishing Group has a true open access journal, Nature Communications, but it costs the author $5000 to publish there. The Public Library of Science journals are truly open access, but the author is charged $2900 for PLoS Medicine, though PLoS One costs the author only $1350.

A 2011 report considered the transition to open access publishing but it doesn’t even consider radical solutions, and makes unreasonably low estimates of the costs of open access publishing.

Scam journals have flourished under the open access flag

Open access publishing has, so far, almost always involved paying a hefty fee. That has brought the rats out of the woodwork and one gets bombarded daily with offers to publish in yet another open access journal. Many of these are simply scams. You pay, we put it on the web and we won’t fuss about quality. Luckily there is now a guide to these crooks: Jeffrey Beall’s List of Predatory, Open-Access Publishers.

One that I hear from regularly is Bentham Open Journals

(a name that is particularly inappropriate for anyone at UCL). Jeffrey Beall comments

"Among the first, large-scale gold OA publishers, Bentham Open continues to expand its fleet of journals, now numbering over 230. Bentham essentially operates as a scholarly vanity press."

They undercut real journals. A research article in The Open Neuroscience Journal will cost you a mere $800. Although these journals claim to be peer-reviewed, their standards are suspect. In 2009, a nonsensical computer-generated spoof paper was accepted by a Bentham journal (for $800).

What can be done about publication, and what can be done about grants?

Both grants and publications are peer-reviewed, but the problems need to be discussed separately.

Peer review of papers by journals

One option is clearly to follow the example of the best open access journals, such as PLoS. The cost of $3000 to $5000 per paper would have to be paid by the research funder, often the taxpayer. It would be money subtracted from the research budget, but it would retain the present peer review system, and it should cost no more overall if the money saved on extortionate journal subscriptions were transferred to research budgets to pay the bills, though there is little chance of that happening.

The cost of publication would, in any case, be minimised if fewer papers were published, which is highly desirable anyway.

But there are real problems with the present peer review system. It works quite well for journals that are high in the hierarchy. I have few grumbles myself about the quality of reviews, and sometimes I’ve benefitted a lot from good suggestions made by reviewers. But for the user, the process is much less satisfactory, because peer review has next to no effect on what gets published; all it influences is which journal the paper appears in. The only effect of the vast amount of unpaid time and effort put into reviewing is to maintain a hierarchy of journals. It has next to no effect on what appears in PubMed.

For authors, peer review can work quite well, but

from the point of view of the consumer, peer review is useless.

It is a myth that peer review ensures the quality of what appears in the literature.

A more radical approach

I made some more radical suggestions in Publish-or-perish: Peer review and the corruption of science.

It seems to me that there would be many advantages if people simply published their own work on the web, and then opened the comments. For a start, it would cost next to nothing. The huge amount of money that goes to publishers could be put to better uses.

Another advantage would be that negative results could be published. And proper full descriptions of methods could be provided because there would be no restrictions on length.

Under that system, I would certainly send a draft paper to a few people I respected for comments before publishing it. Informal consortia might form for that purpose.

The publication bias that results from non-publication of negative results is a serious problem, mainly, but not exclusively, for clinical trials. It is mandatory to register a clinical trial before it starts, but many of the results never appear (see, for example, Deborah Cohen’s report for Index on Censorship). There is no check on whether or not the results of registered trials are ever published, and this publication bias can cost thousands of lives. It is really important to ensure that all results get published.

The ArXiv model

There are many problems that would have to be solved before we could move to self-publication on the web. Some have already been solved by physicists and mathematicians. Their archive, ArXiv.org provides an example of where we should be heading. Papers are published on the web at no cost to either user or reader, and comments can be left. It is an excellent example of post-publication peer review. Flame wars are minimised by requiring users to register, and to show they are bona fide scientists before they can upload papers or comments. You may need endorsement if you haven’t submitted before.

Peer review of grants

The problems for grants are quite different from those for papers. There is no possibility of doing away with peer review for the award of grants, however imperfect the process may be. In fact, candidates were alarmed to find that the short-listing for the new Wellcome Trust Investigator Awards was done without peer review.

The Wellcome Trust has been enormously important for the support of medical and biological research, and never more than now, when the MRC has become rather chaotic (let’s hope the new CEO can sort it out). There was, therefore, real consternation when Wellcome announced a while ago its intention to stop giving project and programme grants altogether. Instead it would give a few Wellcome Trust Investigator Awards to prominent people. That sounds like the Howard Hughes approach, and runs a big risk of “to them that hath shall be given”.

The awards have just been announced, and there is a good account by Colin Macilwain in Science [pdf]. UCL did reasonably well with four awards, but four is not many for a place the size of UCL. Colin Macilwain hits the nail on the head.

"While this is great news for the 27 new Wellcome Investigators who will share £57 million, hundreds of university-based researchers stand to lose Wellcome funds as the trust phases out some existing programs to pay for the new category of investigators".

There were 750 applications but, on the basis of CV alone, they were pared down to a long-list of 173. The panels then cut this down to a short-list of 55. Up to this point no external referees were used, quite unlike the normal process for the award of grants. This seems to me to have been an enormous mistake. No panel, however distinguished, can have the knowledge to distinguish the good from the bad in areas outside its own work. It is only human nature to favour the sort of work you do yourself. The 55 shortlisted people were interviewed, but again by a panel with an even narrower range of expertise. Macilwain again:

"Applications for MRC grants have gone up “markedly” since the Wellcome ones closed, he says: “We still see that as unresolved.” Leszek Borysiewicz, vice-chancellor of the University of Cambridge, which won four awards, believes the impact will be positive: “Universities will adapt to this way of funding research."

It certainly isn’t obvious to most people how Cambridge or UCL will "adapt" to funding of only four people.

The Cancer Research Campaign UK has recently made the same mistake.

One problem is that any scheme of this sort will inevitably favour big groups, most of whom are well-funded already. Since there is some reason to believe that small groups are more productive (see also University Alliance report), it isn’t obvious that this is a good way to go. I was lucky enough to get 45 minutes with the director of the Wellcome Trust, Mark Walport, to put these views. He didn’t agree with all I said, but he did listen.

One of the things that I put to him was a small statistical calculation to illustrate the great danger of a plan that funds very few people. The funding rate was 3.6% of the original applications, and 15.6% of the long-listed applications. Let’s suppose, as a rough approximation, that the 173 long-listed applications were all of roughly equal merit. No doubt that won’t be exactly true, but I suspect it might be more nearly true than the expert panels will admit. A quick calculation in Mathcad gives this, if we assume a 1 in 8 chance of success for each application.

Distribution of the number of successful applications

Suppose $ n $ grant applications are submitted. For example, the same grant submitted $ n $ times to selection boards of equal quality, OR $ n $ different grants of equal merit are submitted to the same board.

Define $ p $ = probability of success at each application

Under these assumptions, it is a simple binomial distribution problem.

According to the binomial distribution, the probability of getting $ r $ successful applications in $ n $ attempts is

\[ P(r)=\frac{n!}{r!\left(n-r\right)! }\; {p}^{r} \left(1-p \right)^{n-r} \]

For a success rate of 1 in 8, $ p = 0.125 $, so if you make $ n = 8 $ applications, the probability that $ r $ of them will succeed is shown in the graph.

Despite equal merit, almost as many people end up with no grant at all as get one grant. And 26% of people will get two or more grants.

mcd1-g

Of course it would take an entire year to write 8 applications. If we take a more realistic case of making four applications we have $ n = 4 $ (and $ p = 0.125 $, as before). In this case the graph comes out as below. You have a nearly 60% chance of getting nothing at all, and only a 1 in 3 chance of getting one grant.

mcd3-g

These results arise regardless of merit, purely as consequence of random chance. They are disastrous, and especially disastrous for the smaller, better-value, groups for which a gap in funding can mean loss of vital expertise. It also has the consequence that scientists have to spend most of their time not doing science, but writing grant applications. The mean number of applications before a success is 8, and a third of people will have to write 9 or more applications before they get funding. This makes very little sense.
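For anyone who wants to check these numbers, here is a minimal sketch in Python of the same calculation (the original was done in Mathcad). It assumes, as above, that every application has the same chance of success (p = 1/8) and that applications are independent.

```python
# Sketch of the calculation described above, assuming every application has
# the same chance of success, p = 1/8, independently of the others.
from math import comb

p = 0.125

def binom_pmf(r, n, p):
    """P(r successes in n independent applications)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

for n in (8, 4):
    probs = [binom_pmf(r, n, p) for r in range(n + 1)]
    print(f"n = {n}: P(no grant) = {probs[0]:.2f}, "
          f"P(exactly one) = {probs[1]:.2f}, "
          f"P(two or more) = {1 - probs[0] - probs[1]:.2f}")

# The number of attempts up to and including the first success is geometric:
# mean = 1/p applications, and the chance of needing 9 or more is (1-p)^8.
print("mean applications per success:", 1 / p)                    # 8.0
print("P(9 or more applications needed):", round((1 - p)**8, 2))  # 0.34
```

With n = 8 it gives roughly 0.34, 0.39 and 0.26 for no grant, exactly one, and two or more; with n = 4 it gives about 0.59, 0.33 and 0.08 — the figures quoted above.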

Grant-awarding panels are faced with the near-impossible task of ranking many similar grants. The peer review system is breaking down, just as it has already broken down for journal publications.

I think these considerations demolish the argument for funding a small number of ‘stars’. The public might expect that the person making the application would take an active part in the research. Too often, now, they spend most of their time writing grant applications. What we need is more responsive-mode smallish programme grants and a maximum on the size of groups.

Conclusions

We should be thinking about the following changes,

  • Limit the number of papers that an individual can publish. This would increase quality, it would reduce the impossible load on peer reviewers and it would reduce costs.
  • Limit the size of labs so that more small groups are encouraged. This would increase both quality and value for money.
  • More (and so smaller) grants are essential for innovation and productivity.
  • Move towards self-publishing on the web so the cost of publishing becomes very low rather than the present extortionate costs. It would also mean that negative results could be published easily and that methods could be described in proper detail.

The entire debate is now on YouTube.

Follow-up

24 January 2012. The eminent mathematician, Tim Gowers, has a rather hard-hitting blog post on open access and scientific publishing, Elsevier – my part in its downfall. I’m right with him. Although his post lacks the detailed numbers of mine, it shows that mathematicians have exactly the same problems as the rest of us.

11 April 2012. Thanks to Twitter, I came across a remarkably prescient article, in the Guardian, in 2001.
Science world in revolt at power of the journal owners, by James Meek. Elsevier have been getting away with murder for quite a while.

19 April 2012.

I got invited to give an after-dinner talk on open access at Cumberland Lodge. It was for the retreat of our GEE Department (that is the catchy brand name we’ve had since 2007: I’m in the equally memorable NPP). I think it stands for Genetics, Evolution and Environment. The talk seemed to stir up a lot of interest: the discussions ran on into the next day.

cumberland

It was clear that younger people are still as infatuated with Nature and Science as ever. And that, of course is the fault of their elders.

The only way that I can see is to abandon the impact factor as a way of judging people. It should have gone years ago, and good people have never used it. They read the papers. Access to research will never be free until we think of a way to break the hegemony of Nature, Science and a handful of others. Stephen Curry has made some suggestions.

Probably it will take action from above. The Wellcome Trust has made a good start. And so has Harvard. We should follow their lead (see also, Stephen Curry’s take on Harvard)

And don’t forget to sign up for the Elsevier boycott. Over 10,000 academics have already signed. Tim Gowers’ initiative took off remarkably.

24 July 2012. I’m reminded by Nature writer, Richard van Noorden (@Richvn) that Nature itself has written at least twice about the iniquity of judging people by impact factors. In 2005 Not-so-deep impact said

"Only 50 out of the roughly 1,800 citable items published in those two years received more than 100 citations in 2004. The great majority of our papers received fewer than 20 citations."

"None of this would really matter very much, were it not for the unhealthy reliance on impact factors by administrators and researchers’ employers worldwide to assess the scientific quality of nations and institutions, and often even to judge individuals."

And, more recently, in “Assessing assessment” (2010).

27 April 2014

The brilliant mathematician, Tim Gowers, started a real revolt against old-fashioned publishers who are desperately trying to maintain extortionate profits in a world that has changed entirely. In his 2012 post, Elsevier: my part in its downfall, he declared that he would no longer publish in, or act as referee for, any journal published by Elsevier. Please follow his lead and sign an undertaking to that effect: 14,614 people have already signed.

Gowers has now gone further. He’s made substantial progress in penetrating the wall of secrecy with which predatory publishers (of which Elsevier is not the only example) seek to prevent anyone knowing about the profitable racket they are operating. Even the confidentiality agreements, which they force universities to sign, are themselves confidential.

In a new post, Tim Gowers has provided more shocking facts about the prices paid by universities. Please look at Elsevier journals — some facts. The jaw-dropping 2011 sum of €1.25 million paid by UCL alone is now already well out of date: it’s now £1,381,380. He gives figures for many other Russell Group universities too. He also publishes some of the obstructive letters that he got in the process of trying to get hold of the numbers. It’s a wonderful aspect of the web that it’s easy to shame those who deserve to be shamed.

I very much hope the matter is taken to the Information Commissioner, and that a precedent is set that it’s totally unacceptable to keep secret what a university pays for services.

Jump to follow-up

The University of Wales is to stop validating all external degrees at home and abroad, and it has a new vice-chancellor, Professor Medwin Hughes. This has happened eventually, after years of pressure, first from bloggers, and then from BBC Wales. That is a tacit admission that their validation procedures were useless, and bordering on the corrupt.

But the news doesn’t appear on the University of Wales front page. It is hidden away in a news item. With incredible spin it is billed as “University of Wales launches bold new academic strategy”. No admission of the bad mistakes appears anywhere. Still worse, the vice-chancellor who was responsible for years of malpractice, Marc Clement, has not vanished, but has been promoted to be president of the University of Wales. It was Professor Clement who claimed, in November 2010, to be taken by “complete and total surprise” when the Welsh education minister accused him of bringing the University of Wales into disrepute. This, sadly, cannot be true. I know that Clement was well aware of my 2008 blog, and of the opinion of Polly Toynbee. He’d corresponded with both of us in 2008.

Until recently the University of Wales validated an astonishing 11,675 degree courses, including fundamentalist bible colleges in Russia, Chinese medicine in Barcelona and courses in quackery at the Scottish School of Herbal Medicine, the Northern College of Acupuncture and the McTimoney College of Chiropractic.

In October 2008 I posted Another worthless validation: the University of Wales and nutritional therapy. With the help of the Freedom of Information Act, it was possible to reveal the mind-boggling incompetence of the validation process used by the University of Wales. The vice-chancellors’ organisation, UUK, did nothing whatsoever, as usual.

The mainstream media eventually caught up with bloggers. In 2010, BBC1 TV (Wales) produced an excellent TV programme that exposed the enormous degree validation scam run by the University of Wales. The programme can be seen on YouTube (Part 1, and Part 2). The programme also exposed, incidentally, the uselessness of the Quality Assurance Agency (QAA) which did nothing until the scam was exposed by TV and blogs.

One of the potty courses "validated" by the University of Wales was a "BSc" in Nutritional Therapy, run by Jacqueline Young at the Northern College of Acupuncture. The problem of Jacqueline Young’s fantasy approach to facts was pointed out at least as far back as 2004, by Ray Girvan, who wrote about it again in May 2005. The problems were brought to wider attention when Ben Goldacre wrote two articles in his Badscience column, Imploding Researchers (September 2005), and the following week, Tangled Webs.

“we were pondering the ethics and wisdom of Jacqueline Young dishing out preposterous, made-up, pseudoscientific nonsense as if it was authoritative BBC fact, with phrases such as: “Implosion researchers have found that if water is put through a spiral its electrical field changes and it then appears to have a potent, restorative effect on cells.” “

I wrote to Clement in 2008 to ask for his opinion, as a one-time electrical engineer, of this statement. Naturally I got no reply. It seems to be a characteristic of people who are very well paid that failure to do their job results in promotion, not firing.

Furthermore, the people who were primarily responsible for validating crackpot degrees are still there. It seems that Professor Nigel Palastanga (a physiotherapist) is still Pro Vice-Chancellor (Learning, Teaching and Enhancement). He was the person who totally failed to notice that Jacqueline Young had been a laughing stock for years before his committee solemnly approved her degree. And as late as 2010, Jacqueline Young was made a senior lecturer.

Palastanga

Among the courses "validated" by Palastanga were the McTimoney College of Chiropractic. They are presumably now in deep trouble. The College was, incidentally, owned by the BPP University, as recently promoted by David Willetts, so this is yet another sign of Willetts’ bad judgment. It’s private so it must be good, OK? The matter was recently given some well-deserved bad publicity in Private Eye.

It is a sorry story of pathetic quality control procedures and a snail-like response from regulatory authorities. But something has been done and we should be grateful for that. Patients will be a bit safer now.

Follow-up

Quackometer also posted on October 3rd: McTimoney Chiropractic College in Deep Trouble.

October 4th More from BBC Wales, The new vice chancellor speaks. He admits nothing about malpractice.

October 5th “University of Wales degree and visa scam exposed by BBC”. More BBC Wales from Ciaran Jenkins.

October 5th (20.31) “Scrap University of Wales call by vice-chancellors” “Five universities want the University of Wales (UoW) title scrapped because they are appalled by claims about the validation of its qualifications.” So they noticed at last, 3 years after the first revelations, on this blog and by the BBC.

October 12. “University of Wales helps 650 stranded Tasmac students”. More than 650 overseas students may lose the £7,850 fees they paid to the Tasmac London School of Business for a degree formerly validated by the University of Wales. Tasmac has gone into liquidation. That’s what happens when validations are handed out like sweeties. The fact that the QAA approved the course means, as this case among others has shown, absolutely nothing. The University of Wales should be sufficiently ashamed of its past behaviour to refund the fees.

October 18. Bob Croydon, Welsh representative of the Prince of Wales’ Foundation for the Built Environment, is rather cross with Leighton Andrews, the Welsh minister of education, who must get much of the credit for winding up the scams. He may also be in trouble with the quacktitioner royal after writing in an email: “Whilst the Minister could doubtless give the Prince of Wales a good run for his money in the Self Importancy Sweepstakes it would be. . .”

October 21. It now seems that the University of Wales will essentially vanish (and BBC report). The chair of Council, Huw Thomas, has resigned. The closure is a pity for those who have real degrees awarded by the university, but it is a fate well-earned. The remaining mystery is the fate of Prof Clement and Prof Palastanga, who presided over the validation fiasco. It is they who are responsible for the students who have now been abandoned (read an account of Tasmac).

October 22 The Daily Telegraph reports University of Wales abolished after visa scandal. This misses the whole back story about validation scams.

Although the chair of Council has resigned, there is no information yet about Professors Clement and Palastanga, who presided over the fiasco. I was told on 24 October that “This information is yet to be determined”.

October 28. The university has produced two statements: one for students, and one for alumni. Both documents say

"Why is the merger happening?
Recent developments in Welsh Higher Education policy have been driven by a desire for there to be fewer universities in Wales. The Welsh Government is putting policies in place to have only six or seven universities (there are currently eleven). Our merger is in response to this challenge from the Government."

This, sadly, is a straightforward lie. The closure is a result of a long-running saga of phony and incompetent validations, sold for money. They have been aware of this at least since my first post in 2008, but nothing happened until the BBC gave it wide publicity. You don’t have to take my word that it’s a lie. Geraint Talfan Davies, a member of the Welsh government-commissioned McCormick review, said:

“I think that everybody has been aware of some of the risks involved in the University of Wales’ overseas activities.

“Validating at distance provides some real challenges, governance challenges, and that goes for everybody.

“The difficulty with the University of Wales was that it was on a pretty big scale and doing it on that scale makes it more difficult.”

It seems the University has lost touch with the idea of telling the truth.

Jump to follow-up

One wonders about the standards of peer review at the British Journal of General Practice. The June issue has a paper, "Acupuncture for ‘frequent attenders’ with medically unexplained symptoms: a randomised controlled trial (CACTUS study)". It has lots of numbers, but the result is very easy to see. Just look at their Figure.

Paterson BJGP

There is no need to wade through all the statistics; it’s perfectly obvious at a glance that acupuncture has at best a tiny and erratic effect on any of the outcomes that were measured.

But this is not what the paper said. On the contrary, the conclusions of the paper said

Conclusion

The addition of 12 sessions of five-element acupuncture to usual care resulted in improved health status and wellbeing that was sustained for 12 months.

How on earth did the authors manage to reach a conclusion like that?

The first thing to note is that many of the authors are people who make their living largely from sticking needles in people, or from advocating alternative medicine. The authors are Charlotte Paterson, Rod S Taylor, Peter Griffiths, Nicky Britten, Sue Rugg, Jackie Bridges, Bruce McCallum and Gerad Kite, on behalf of the CACTUS study team. The senior author, Gerad Kite MAc, is principal of the London Institute of Five-Element Acupuncture, London. The first author, Charlotte Paterson, is a well-known advocate of acupuncture, as is Nicky Britten.

The conflicts of interest are obvious, but nonetheless one should welcome a “randomised controlled trial” done by advocates of alternative medicine. In fact the results shown in the Figure are both interesting and useful. They show that acupuncture does not even produce any substantial placebo effect. It’s the authors’ conclusions that are bizarre and partisan. Peer review is indeed a broken process.

That’s really all that needs to be said, but for nerds, here are some more details.

How was the trial done?

The description "randomised" is fair enough, but there were no proper controls and the trial was not blinded. It was what has come to be called a "pragmatic" trial, which means a trial done without proper controls. They are, of course, much loved by alternative therapists because their therapies usually fail in proper trials. It’s much easier to get an (apparently) positive result if you omit the controls. But the fascinating thing about this study is that, despite the deficiencies in design, the result is essentially negative.

The authors themselves spell out the problems.

“Group allocation was known by trial researchers, practitioners, and patients”

So everybody (apart from the statistician) knew what treatment a patient was getting. This is an arrangement that is guaranteed to maximise bias and placebo effects.

"Patients were randomised on a 1:1 basis to receive 12 sessions of acupuncture starting immediately (acupuncture group) or starting in 6 months’ time (control group), with both groups continuing to receive usual care."

So it is impossible to compare acupuncture and control groups at 12 months, contrary to what’s stated in Conclusions.

"Twelve sessions, on average 60 minutes in length, were provided over a 6-month period at approximately weekly, then fortnightly and monthly intervals"

That sounds like a pretty expensive way of getting next to no effect.

"All aspects of treatment, including discussion and advice, were individualised as per normal five-element acupuncture practice. In this approach, the acupuncturist takes an in-depth account of the patient’s current symptoms and medical history, as well as general health and lifestyle issues. The patient’s condition is explained in terms of an imbalance in one of the five elements, which then causes an imbalance in the whole person. Based on this elemental diagnosis, appropriate points are used to rebalance this element and address not only the presenting conditions, but the person as a whole".

Does this mean that the patients were told a lot of mumbo jumbo about “five elements” (fire, earth, metal, water, wood)? If so, anyone with any sense would probably have run a mile from the trial.

"Hypotheses directed at the effect of the needling component of acupuncture consultations require sham-acupuncture controls which while appropriate for formulaic needling for single well-defined conditions, have been shown to be problematic when dealing with multiple or complex conditions, because they interfere with the participative patient–therapist interaction on which the individualised treatment plan is developed. 37–39 Pragmatic trials, on the other hand, are appropriate for testing hypotheses that are directed at the effect of the complex intervention as a whole, while providing no information about the relative effect of different components."

Put simply that means: we don’t use sham acupuncture controls so we can’t distinguish an effect of the needles from placebo effects, or get-better-anyway effects.

"Strengths and limitations: The ‘black box’ study design precludes assigning the benefits of this complex intervention to any one component of the acupuncture consultations, such as the needling or the amount of time spent with a healthcare professional."

"This design was chosen because, without a promise of accessing the acupuncture treatment, major practical and ethical problems with recruitment and retention of participants were anticipated. This is because these patients have very poor self-reported health (Table 3), have not been helped by conventional treatment, and are particularly desperate for alternative treatment options.". 

It’s interesting that the patients were “desperate for alternative treatment”. Again it seems that every opportunity has been given to maximise non-specific placebo, and get-well-anyway effects.

There is a lot of statistical analysis and, unsurprisingly, many of the differences don’t reach statistical significance. Some do (just) but that is really quite irrelevant. Even if some of the differences are real (not a result of random variability), a glance at the figures shows that their size is trivial.

My conclusions

(1) This paper, though designed to be susceptible to almost every form of bias, shows staggeringly small effects. It is the best evidence I’ve ever seen that not only are needles ineffective, but that placebo effects, if they are there at all, are trivial in size and of no useful benefit to the patient in this case.

(2) The fact that this paper was published with conclusions that appear to contradict directly what the data show, is as good an illustration as any I’ve seen that peer review is utterly ineffective as a method of guaranteeing quality. Of course the editor should have spotted this. It appears that quality control failed on all fronts.

Follow-up

In the first four days of this post, it got over 10,000 hits (almost 6,000 unique visitors).

Margaret McCartney has written about this too, in The British Journal of General Practice does acupuncture badly.

The Daily Mail exceeds itself in an article by Jenny Hope which says “Millions of patients with ‘unexplained symptoms’ could benefit from acupuncture on the NHS, it is claimed”. I presume she didn’t read the paper.

The Daily Telegraph scarcely did better in Acupuncture has significant impact on mystery illnesses. The author of this, very sensibly, remains anonymous.

Many “medical information” sites churn out the press release without engaging the brain, but most of the other newspapers appear, very sensibly, to have ignored the hyped-up press release. Among the worst was Pulse, an online magazine for GPs. At least they’ve published the comments that show their report was nonsense.

The Daily Mash has given this paper a well-deserved spoofing in Made-up medicine works on made-up illnesses.

“Professor Henry Brubaker, of the Institute for Studies, said: “To truly assess the efficacy of acupuncture a widespread double-blind test needs to be conducted over a series of years but to be honest it’s the equivalent of mapping the DNA of pixies or conducting a geological study of Narnia.” ”

There is no truth whatsoever in the rumour being spread on Twitter that I’m Professor Brubaker.

Euan Lawson, also known as Northern Doctor, has done another excellent job on the Paterson paper: BJGP and acupuncture – tabloid medical journalism. Most tellingly, he reproduces the press release from the editor of the BJGP, Professor Roger Jones DM, FRCP, FRCGP, FMedSci.

"Although there are countless reports of the benefits of acupuncture for a range of medical problems, there have been very few well-conducted, randomised controlled trials. Charlotte Paterson’s work considerably strengthens the evidence base for using acupuncture to help patients who are troubled by symptoms that we find difficult both to diagnose and to treat."

Oooh dear. The journal may have a new look, but it would be better if the editor read the papers before writing press releases. Tabloid journalism seems an appropriate description.

Andy Lewis at Quackometer, has written about this paper too, and put it into historical context. In Of the Imagination, as a Cause and as a Cure of Disorders of the Body. “In 1800, John Haygarth warned doctors how we may succumb to belief in illusory cures. Some modern doctors have still not learnt that lesson”. It’s sad that, in 2011, a medical journal should fall into a trap that was pointed out so clearly in 1800. He also points out the disgracefully inaccurate Press release issued by the Peninsula medical school.

Some tweets

Twitter info: 426 clicks on http://bit.ly/mgIQ6e alone at 15.30 on 1 June (and that’s only the hits via Twitter). By July 8th this had risen to 1,655 hits via Twitter, from 62 different countries.

@followthelemur Selina
MASSIVE peer review fail by the British Journal of General Practice http://bit.ly/mgIQ6e (via @david_colquhoun)

@david_colquhoun David Colquhoun
Appalling paper in Brit J Gen Practice: Acupuncturists show that acupuncture doesn’t work, but conclude the opposite http://bit.ly/mgIQ6e
Retweeted by gentley1300 and 36 others

@david_colquhoun David Colquhoun.
I deny the Twitter rumour that I’m Professor Henry Brubaker as in Daily Mash http://bit.ly/mt1xhX (just because of http://bit.ly/mgIQ6e )

@brunopichler Bruno Pichler
http://tinyurl.com/3hmvan4
Made-up medicine works on made-up illnesses (me thinks Henry Brubaker is actually @david_colquhoun)

@david_colquhoun David Colquhoun,
HEHE RT @brunopichler: http://tinyurl.com/3hmvan4 Made-up medicine works on made-up illnesses

@psweetman Pauline Sweetman
Read @david_colquhoun’s take on the recent ‘acupuncture effective for unexplained symptoms’ nonsense: bit.ly/mgIQ6e

@bodyinmind Body In Mind
RT @david_colquhoun: ‘Margaret McCartney (GP) also blogged acupuncture nonsense http://bit.ly/j6yP4j My take http://bit.ly/mgIQ6e’

@abritosa ABS
Br J Gen Practice mete a pata na poça: RT @david_colquhoun […] appalling acupuncture nonsense http://bit.ly/j6yP4j http://bit.ly/mgIQ6e

@jodiemadden Jodie Madden
amusing!RT @david_colquhoun: paper in Brit J Gen Practice shows that acupuncture doesn’t work,but conclude the opposite http://bit.ly/mgIQ6e

@kashfarooq Kash Farooq
Unbelievable: acupuncturists show that acupuncture doesn’t work, but conclude the opposite. http://j.mp/ilUALC by @david_colquhoun

@NeilOConnell Neil O’Connell
Gobsmacking spin RT @david_colquhoun: Acupuncturists show that acupuncture doesn’t work, but conclude the opposite http://bit.ly/mgIQ6e

@euan_lawson Euan Lawson (aka Northern Doctor)
Aye too right RT @david_colquhoun @iansample @BenGoldacre Guardian should cover dreadful acupuncture paper http://bit.ly/mgIQ6e

@noahWG Noah Gray
Acupuncturists show that acupuncture doesn’t work, but conclude the opposite, from @david_colquhoun: http://bit.ly/l9KHLv

 

8 June 2011 I drew the attention of the editor of BJGP to the many comments that have been made on this paper. He assured me that the matter would be discussed at a meeting of the editorial board of the journal. Tonight he sent me the result of this meeting.

Subject: BJGP
From: “Roger Jones” <rjones@rcgp.org.uk>
To: <d.colquhoun@ucl.ac.uk>

Dear Prof Colquhoun

We discussed your emails at yesterday’s meeting of the BJGP Editorial Board, attended by 12 Board members and the Deputy Editor

The Board was unanimous in its support for the integrity of the Journal’s peer review process for the Paterson et al paper – which was accepted after revisions were made in response to two separate rounds of comments from two reviewers and myself – and could find no reason either to retract the paper or to release the reviewers’ comments

Some Board members thought that the results were presented in an overly positive way; because the study raises questions about research methodology and the interpretation of data in pragmatic trials attempting to measure the effects of complex interventions, we will be commissioning a Debate and Analysis article on the topic.

In the meantime we would encourage you to contribute to this debate throught the usual Journal channels

Roger Jones

Professor Roger Jones MA DM FRCP FRCGP FMedSci FHEA FRSA
Editor, British Journal of General Practice

Royal College of General Practitioners
One Bow Churchyard
London EC4M 9DQ
Tel +44 203 188 7400

It is one thing to make a mistake; it is quite another to refuse to admit it. This reply seems to me to be quite disgraceful.

20 July 2011. The proper version of the story got wider publicity when Margaret McCartney wrote about it in the BMJ. The first rapid response to this article was a lengthy denial by the authors of the obvious conclusion to be drawn from the paper. They merely dig themselves deeper into a hole. The second response was much shorter (and more accurate).

Thank you Dr McCartney

Richard Watson, General Practitioner
Glasgow

The fact that none of the authors of the paper or the editor of BJGP have bothered to try and defend themselves speaks volumes.

Like many people I glanced at the report before throwing it away with an incredulous guffaw. You bothered to look into it and refute it – in a real journal. That last comment shows part of the problem with them publishing, and promoting, such drivel. It makes you wonder whether anything they publish is any good, and that should be a worry for all GPs.

 

30 July 2011. The British Journal of General Practice has published nine letters that object to this study. Some of them concentrate on problems with the methods; others point out what I believe to be the main point: there is essentially no effect there to be explained. In the public interest, I am posting the responses here [download pdf file].

There is also a response from the editor and from the authors. Both are unapologetic. It seems that the editor sees nothing wrong with the peer review process.

I don’t recall ever having come across such incompetence in a journal’s editorial process.

Here’s all he has to say.

The BJGP Editorial Board considered this correspondence recently. The Board endorsed the Journal’s peer review process and did not consider that there was a case for retraction of the paper or for releasing the peer reviews. The Board did, however, think that the results of the study were highlighted by the Journal in an overly-positive manner. However,many of the criticisms published above are addressed by the authors themselves in the full paper.

 

If you subscribe to the views of Paterson et al, you may want to buy a T-shirt that has a revised version of the periodic table.

t shirt

 

5 August 2011. A meeting with the editor of BJGP

Yesterday I met a member of the editorial board of BJGP. We agreed that the data are fine and should not be retracted. It’s the conclusions that should be retracted. I was also told that the referees’ reports were "bland". In the circumstances that merely confirmed my feeling that the referees failed to do a good job.

Today I met the editor, Roger Jones, himself. He was clearly upset by my comment and I have now changed it to refer to the whole editorial process rather than to him personally. I was told, much to my surprise, that the referees were not acupuncturists but “statisticians”. That I find baffling. It soon became clear that my differences with Professor Jones turned on interpretations of statistics.

It’s true that there were a few comparisons that got below P = 0.05, but the smallest was P = 0.02. The warning signs are there in the Methods section: "all statistical tests were …. deemed to be statistically significant if P < 0.05". This is simply silly -perhaps they should have read Lectures on Biostatistics. Or for a more recent exposition, the XKCD cartoon in which it’s proved that green jelly beans are linked to acne (P = 0.05). They make lots of comparisons but make no allowance for this in the statistics. Figure 2 alone contains 15 different comparisons: it’s not surprising that a few come out "significant", even if you don’t take into account the likelihood of systematic (non-random) errors when comparing final values with baseline values.
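To put a rough number on the jelly-bean problem, here is a minimal sketch in Python. It assumes, for simplicity, that the 15 comparisons are independent, which real outcome measures are not, so treat it as an illustration rather than a formal correction.

```python
# Sketch of the multiple-comparisons problem, assuming (for simplicity)
# that the 15 comparisons in Figure 2 are independent, each tested at alpha = 0.05.
alpha = 0.05
n_comparisons = 15

# Chance of at least one spurious "significant" result when nothing real is going on
p_false_positive = 1 - (1 - alpha) ** n_comparisons
print(round(p_false_positive, 2))        # ~0.54: more likely than not

# A crude (conservative) fix is the Bonferroni correction: test each
# comparison at alpha / n_comparisons instead.
print(round(alpha / n_comparisons, 4))   # 0.0033
```

Even on that crude reckoning, a stray "significant" result among 15 comparisons is more likely than not, and a Bonferroni threshold of about 0.003 would rule out even the smallest reported P of 0.02.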

Keen though I am on statistics, this is a case where I prefer the eyeball test. It’s so obvious from the Figure that there’s nothing worth talking about happening, it’s a waste of time and money to torture the numbers to get "significant" differences. You have to be a slavish believer in P values to treat a result like that as anything but mildly suggestive. A glance at the Figure shows the effects, if there are any at all, are trivial.

I still maintain that the results don’t come within a million miles of justifying the authors’ stated conclusion “The addition of 12 sessions of five-element acupuncture to usual care resulted in improved health status and wellbeing that was sustained for 12 months.” Therefore I still believe that a proper course would have been to issue a new and more accurate press release. A brief admission that the interpretation was “overly-positive”, in a journal that the public can’t see, simply isn’t enough.

I can’t understand, either, why the editorial board did not insist on this being done. If they had done so, it would have been temporarily embarrassing, certainly, but people make mistakes, and it would have blown over. By not making a proper correction to the public, the episode has become a cause célèbre and the reputation of the journal will suffer permanent harm. This paper is going to be cited for a long time, and not for the reasons the journal would wish.

Misinformation, like that sent to the press, has serious real-life consequences. You can be sure that the paper as it still stands, will be cited by every acupuncturist who’s trying to persuade the Department of Health that he’s a "qualified provider".

There was not much unanimity in the discussion up to this point. Things got better when we talked about what a GP should do when there are no effective options. Roger Jones seemed to think it was acceptable to refer such patients to an alternative practitioner if that is what the patient wanted. I maintained that it’s unethical to explain to a patient how medicine works in terms of pre-scientific myths.

I’d have loved to have heard the "informed consent" during which "The patient’s condition is explained in terms of imbalance in the five elements which then causes an imbalance in the whole person". If anyone had tried to explain my condition in terms of an imbalance in my Wood, Water, Fire, Earth and Metal, I’d think they were nuts. The last author, Gerad Kite, runs a private clinic that sells acupuncture for all manner of conditions. You can find his view of science on his web site. It’s condescending and insulting to talk to patients in these terms. It’s the ultimate sort of paternalism. And paternalism is something that’s supposed to be vanishing in medicine. I maintained that this was ethically unacceptable, and that led to a more amicable discussion about the possibility of more honest placebos.

It was good of the editor to meet me in the circumstances. I don’t cast doubt on the honesty of his opinions. I simply disagree with them, both at the statistical level and the ethical level.

 

30 March 2014

I only just noticed that one of the authors of the paper, Bruce McCallum (who worked as an acupuncturist at Kite’s clinic), appeared in a 2007 Channel 4 News piece. It was a report on the pressure to save money by stopping NHS funding for “unproven and disproved treatments”. McCallum said that scientific evidence was needed to show that acupuncture really worked. Clearly he failed, but to admit that would have affected his income.

Watch the video (McCallum appears near the end).