Queen Mary
There is a widespread belief that science is going through a crisis of reproducibility. A meeting was held to discuss the problem. It was organised by the Academy of Medical Sciences, the Wellcome Trust, the MRC and the BBSRC, and it was chaired by Dorothy Bishop (of whose blog I’m a huge fan). It’s good to see that the scientific establishment is beginning to take notice. Up to now it’s been bloggers who’ve been making the running. I hadn’t intended to write a whole post about it, but some sufficiently interesting points arose that I’ll have a go.
The first point to make is that, as far as I know, the “crisis” is limited to, or at least concentrated in, quite restricted areas of science. In particular, it doesn’t apply to the harder end of the sciences. Nobody in physics, maths or chemistry talks about a crisis of reproducibility. I’ve heard very little about irreproducibility in electrophysiology (unless you include EEG work). I’ve spent most of my life working on single-molecule biophysics and I’ve never encountered serious problems with irreproducibility. It’s a small and specialist field, so I think I would have noticed if it were there. I’ve always posted our analysis programs on the web, and if anyone wants to spend a year re-analysing our results they are very welcome to do so (though I have been asked only once).
The areas that seem to have suffered most from irreproducibility are experimental psychology, some areas of cell biology, imaging studies (fMRI) and genome studies. Clinical medicine and epidemiology have been bad too. Imaging and genome studies seem to be in a slightly different category from the others. They are largely statistical problems that arise from the huge number of comparisons that need to be done. Epidemiology problems stem largely from a casual approach to causality. The rest have no such excuses.
The meeting was biased towards psychology, perhaps because that’s an area that has had many problems. The solutions that were suggested were also biased towards that area. It’s hard to see how some of them could be applied to electrophysiology, for example.
There were, it has to be said, rather more good intentions than hard suggestions. Pre-registration of experiments might help a bit in a few areas. I’m all for open access and open data, but I doubt they will solve the problem either, though I hope they’ll become the norm (they always have been for me).
All the tweets from the meeting have been collected as a Storify. The most retweeted comment was from Liz Wager:
@SideviewLiz: Researchers are incentivised to publish, get grants, get promoted but NOT incentivised to be right! #reprosymp
This, I think, cuts to the heart of the problem. Perverse incentives, if sufficiently harsh, will inevitably lead to bad behaviour. Occasionally it will lead to fraud. It’s even led to (at least) two suicides. If you threaten people in their forties and fifties with being fired, and losing their house, because they don’t meet some silly metric, then of course people will cut corners. Curing that is very much more important than pre-registration, data-sharing and concordats, though the latter occupied far more of the time at the meeting.
The primary source of the problem is that there is not enough money for the number of people who want to do research (a matter that was barely mentioned). That leads to the unpalatable conclusion that the only way to cure the problem is to have fewer people competing for the money. That’s part of the reason that I suggested recently a two-stage university system. That’s unlikely to happen soon. So what else can be done in the meantime?
The responsibility for perverse incentives has to rest squarely on the shoulders of the senior academics and administrators who impose them. It is at this level that the solutions must be found. That was said, but not firmly enough. The problems are mostly created by the older generation. It’s our fault.
Incidentally, I was not impressed by the fact that the Academy of Medical Sciences listed attendees with initials after people’s names. There were eight FRSs, but I find it a bit embarrassing to be identified as one, as though it made any difference to the value of what I said.
It was suggested that courses in research ethics for young scientists would help. I disagree. In my experience, young scientists are honest and idealistic. The problems arise when their idealism is shattered by the bad example set by their elders. I’ve had a stream of young people in my office who want advice and support because they feel they are being pressured by their elders into behaviour that worries them. More than one of them has burst into tears because they feel that they have been bullied by PIs.
One talk that I found impressive was by Ottoline Leyser, who chaired the recent report on The Culture of Scientific Research in the UK, from the Nuffield Council on Bioethics. But I found that report to be bland, and its recommendations, though well-meaning, unlikely to result in much change. The report was based on a relatively small, self-selected sample of 970 responses to a web survey, and on 15 discussion events. Relatively few people seem to have spent time filling in the text boxes. For example:
“Of the survey respondents who provided a negative comment on the effects of competition in science, 24 out of 179 respondents (13 per cent) believe that high levels of competition between individuals discourage research collaboration and the sharing of data and methodologies.”
Such numbers are too small to reach many conclusions, especially since the respondents were self-selected rather than selected at random (poor experimental design!). Nevertheless, the main concerns were all voiced. I was struck by
“Almost twice as many female survey respondents as male respondents raise issues related to career progression and the short term culture within UK research when asked which features of the research environment are having the most negative effect on scientists”
But no conclusions were drawn and no remedies were put forward for this problem. It was all put rather better, and much more frankly, some time ago by Peter Lawrence. I do have the impression that bloggers (including Dorothy Bishop) get to the heart of the problems much more directly than any official reports.
The Nuffield report seemed to me to put excessive trust in paper exercises, such as the “Concordat to Support the Career Development of Researchers”. The word “bullying” does not occur anywhere in the Nuffield document, despite the fact that it’s a problem that has been very widely discussed, and a problem that’s critical for reproducibility. The Concordat (unlike the Nuffield report) does mention bullying.
"All managers of research should ensure that measures exist at every institution through which discrimination, bullying or harassment can be reported and addressed without adversely affecting the careers of innocent parties. "
That sounds good, but it’s very obvious that many places simply ignore it. All universities subscribe to the Concordat, but signing is as far as it goes in too many places. It was signed by Imperial College London, the institution with perhaps the worst record for pressurising its employees, but official reports would not dream of naming names or looking at publicly available documentation concerning bullying tactics. For that, you need bloggers.
On the first day, the (soon-to-depart) Dean of Medicine at Imperial, Dermot Kelleher, was there. He seemed a genial man, but he would say nothing about the death of Stefan Grimm. I find that attitude incomprehensible. He didn’t reappear on the second day of the meeting.
The San Francisco Declaration on Research Assessment (DORA) is a stronger statement than the Concordat, but its aims are more limited. DORA states that the impact factor is not to be used as a surrogate “measure of the quality of individual research articles, or in hiring, promotion, or funding decisions”. That’s something that I wrote about in 2003, in Nature. In 2007 the practice was still rampant, including at Imperial College. It still is in many places. The Nuffield Council report says that DORA has been signed by “over 12,000 individuals and 500 organisations”, but fails to mention that only three UK universities have signed up to DORA (one of them, I’m happy to say, is UCL). That’s a pretty miserable record. And, of course, it remains to be seen whether the signatories really abide by the agreement. Most such worthy agreements are ignored on the shop floor.
The recommendations of the Nuffield Council report are all worthy, but they are bland and we’ll be lucky if they have much effect. For example
“Ensure that the track record of researchers is assessed broadly, without undue reliance on journal impact factors”
What on earth is “undue reliance”? That’s a far weaker statement than DORA. Why?
And
“Ensure researchers, particularly early career researchers, have a thorough grounding in research ethics”
In my opinion, what we should say to early career researchers is “avoid the bad example that’s set by your elders (but not always betters)”. It’s the older generation which has produced the problems and it’s unbecoming to put the blame on the young. It’s the late career researchers who are far more in need of a thorough grounding in research ethics than early-career researchers.
Although every talk was more or less interesting, the one I enjoyed most was the first one, by Marcus Munafo. It assessed the scale of the problem (though with a strong emphasis on psychology, plus some genetics and epidemiology), and it presented good data on under-powered studies. It also made a fleeting mention of the problem of the false discovery rate. Since the meeting was essentially about the publication of results that aren’t true, I would have expected the statistical problem of the false discovery rate to have been given much more prominence than it was. Although Ioannidis’ now-famous paper “Why Most Published Research Findings Are False” got the occasional mention, very little attention (apart from Munafo and Button) was given to the problems that he pointed out.
I’ve recently convinced myself that, if you declare that you’ve made a discovery when you observe P = 0.047 (as is almost universal in the biomedical literature), you’ll be wrong 30–70% of the time (see the full paper, “An investigation of the false discovery rate and the misinterpretation of p-values”, and simplified versions on YouTube and on this blog). If that’s right, then surely an important way to reduce the publication of false results is for journal editors to give better advice about statistics. This is a topic that was almost absent from the meeting. It’s also absent from the Nuffield Council report (the word “statistics” does not occur anywhere).
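For readers who want to see where a figure like that can come from, here is a minimal sketch of the standard false-discovery-rate arithmetic. The prevalence of real effects (10%) and the power (80%) are my own illustrative assumptions, not numbers from the paper, and the sketch uses the simpler criterion “P &lt; 0.05” rather than the exact observation P = 0.047, which the paper shows makes matters worse.

```python
# Illustrative false-discovery-rate arithmetic (assumed numbers, not from the paper).
prevalence = 0.10   # assumed fraction of tested hypotheses that are real effects
power      = 0.80   # assumed chance of detecting a real effect at P < 0.05
alpha      = 0.05   # significance threshold

n = 1000                                         # imagine 1000 independent tests
true_positives  = prevalence * n * power         # real effects correctly declared
false_positives = (1 - prevalence) * n * alpha   # null effects that sneak under P < 0.05

fdr = false_positives / (false_positives + true_positives)
print(f"False discovery rate: {fdr:.0%}")        # about 36% with these assumptions
```

With these assumptions more than a third of “discoveries” are false, and with lower prevalence or lower power the fraction climbs towards the upper end of the 30–70% range.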
In summary, the meeting was very timely, and it was fun. But I ended up thinking it had a bit too much of preaching good intentions to the converted. It failed to grasp some of the nettles firmly enough. There was no mention of what’s happening at Imperial, or Warwick, or Queen Mary, or at King’s College London. Let’s hope that when it’s written up, the conclusions will be a bit less bland than those of most official reports.
It’s overdue that we set our house in order, because the public has noticed what’s going on. The New York Times was scathing in 2006. This week’s Economist said
"Modern scientists are doing too much trusting and not enough verifying -to the detriment of the whole of science, and of humanity.
Too many of the findings that fill the academic ether are the result of shoddy experiments or poor analysis""Careerism also encourages exaggeration and the cherrypicking of results."
This is what the public think of us. It’s time that vice-chancellors did something about it, rather than willy-waving about rankings.
Conclusions
After criticism of the conclusions of official reports, I guess that I have to make an attempt at recommendations myself. Here’s a first attempt.
- The heart of the problem is money. Since the total amount of money is not likely to increase in the short term, the only solution is to decrease the number of applicants. This is a real political hot potato, but unless it’s tackled the problem will persist. The most gentle way that I can think of doing this is to restrict research to a subset of universities. My proposal for a two-stage university system might go some way to achieving this. It would result in better postgraduate education, and it would be more egalitarian for students. But of course universities that became “teaching only” would (wrongly) see that as demotion, and it seems that UUK is unlikely to support any change to the status quo (except, of course, for increasing fees).
- Smaller grants, smaller groups and fewer papers would benefit science.
- Ban completely the use of impact factors and discourage use of all metrics. None has been shown to measure future quality. All increase the temptation to “game the system” (that’s the usual academic euphemism for what’s called cheating if an undergraduate does it).
- “Performance management” is the method of choice for bullying academics. Don’t allow people to be fired because they don’t achieve arbitrary targets for publications or grant income. The criteria used at Queen Mary, University of London, at Imperial, at Warwick and at King’s are public knowledge. They are a recipe for employing spivs and firing Nobel Prize winners: the 1991 Nobel Laureate in Physiology or Medicine would have failed Imperial’s criteria in 6 of the 10 years during which he was doing the work that led to the prize.
- Universities must learn that if you want innovation and creativity you have also to tolerate a lot of failure.
- The ranking of universities by ranking businesses or by the REF encourages bad behaviour, because it pushes vice-chancellors to improve their ranking by whatever means they can. This is one reason for bullying behaviour. The rankings are totally arbitrary and a huge waste of money. I’m not saying that universities should be unaccountable to taxpayers. But all you have to do to show that very few academics are not trying is to produce a list of publications. It’s absurd to try to summarise a whole university in a single number. It’s simply statistical illiteracy.
- Don’t waste money on training courses in research ethics. Everyone already knows what’s honest and what’s dodgy (though a bit more statistics training might help with that). Most people want to do the honest thing, but few have the nerve to stick to their principles if the alternative is to lose your job and your home. Senior university people must stop behaving in that way.
- University procedures for protecting the young are totally inadequate. A young student who reports bad behaviour by their seniors is still more likely to end up being fired than congratulated (see, for example, a particularly bad case at the University of Sheffield). All big organisations close ranks to defend themselves when criticised. Even in extreme cases, as when an employee commits suicide after being bullied, universities issue internal reports that blame nobody.
- Universities must stop papering over the cracks when misbehaviour is discovered. It seems to be beyond the wit of PR people to realise that it’s often best (and always cheapest) to put your hands up and say “sorry, we got that wrong”.
- There is an urgent need to get rid of the sort of statistical illiteracy that allows P = 0.06 to be treated as failure and P = 0.04 as success. This is almost universal in biomedical papers and, given the hazards posed by the false discovery rate, could well be a major contributor to false claims. Journal editors need to offer much better statistical advice than they do at the moment.
Follow-up
The row about “redundancies” (firings, in plain language) at Queen Mary rumbles on.
I’ve already written about it twice, in Is Queen Mary University of London trying to commit scientific suicide? and in Queen Mary, University of London in The Times. Does Simon Gaskell care? But wait, there is more to come. The harm done to teaching at Queen Mary was outlined in a report written at the request of Simon Gaskell. He appears to have ignored it entirely. So let’s concentrate on research.
Some explanation of the bizarre behaviour of the Queen Mary management can be gleaned from Queen Mary’s Frequently Asked Questions:
Restructures and Reviews in Academic Departments 2011-12. This says
“research-related metrics” used in the School of Biological and Chemical Sciences are the result of “extensive consultation with staff and include both the Australian [Research Council] journal classification system as well as impact factor”.
Let’s skip over the fact that the "extensive consultation" was largely sham. That’s standard procedure (and not only in universities).
Australian journal classification
The reference to the Australian journal classification is revealing. Australia has been noted in the past for being one of the worst places for abuse of publication metrics. Its 2010 Journal classification was utterly bizarre [download Excel]. It ranks 20,712 journals as being A*, A, B, or C (did the rankers really look at all of them?).
I’ll take one example from my own area: the Journal of General Physiology is probably one of the most respected journals for electrophysiology with an emphasis on mechanisms. Many of its papers are quite mathematical. In its niche, it is hugely respected for the quality of its papers and for its high integrity [declaration of interest: I’m an editor, and was hugely flattered by that honour]. But the Australian 2010 list ranks the Journal of General Physiology as B, the same as Brain Research, British Journal of Religious Education and Chinese Medicine (a journal of quack medicine).
In contrast, the Nature journals all get A*, as do many review journals which don’t publish original research at all. But if you can’t get a paper into Nature, there are other A* journals that might help you keep your job, for example Television and New Media, or Tourism Management as well as some joke journals like the Journal of Alternative and Complementary Medicine and Complementary Therapies in Medicine.
No doubt you’ll find similar ludicrous anomalies in your own field.
Simon Gaskell has not answered emails from me, and he hasn’t answered those from his own staff either, but he did emerge briefly from his shell last week, in The Times and at greater length in today’s Times Higher Education. The latter, though longer, consists almost entirely of the vapid self-congratulation which all vice-chancellors feel compelled to spout. Only one short paragraph is devoted to answering his critics.
“Where academic performance has been assessed, it has been important to do so on the basis of objective criteria including metrics – any subjective assessment would be quite unacceptable. These objective criteria were based on generally recognised academic expectations . . . “
I find it almost impossible to believe that a vice-chancellor could be so out of touch as to believe that counting papers, and using impact factors to judge people, are “generally recognised”. The extent of the misunderstanding of metrics is illustrated by two statements from Matthew Evans, head of the School of Biological and Chemical Sciences, who is quoted as saying:
“Impact factor reflects the number of times an average paper is cited, [so] is a good indication of how many citations a particular paper is likely to achieve,”
Anyone who understands the difference between mode, median and mean of the highly skewed distribution of citations would not make an elementary mistake like that. It is simply statistical illiteracy.
Evans also
"…described metrics as a “vital tool” in assessing academics’ contributions to research and “the only empirical way of measuring success in science”."
Is Matthew Evans not aware that there isn’t the slightest evidence that any metric predicts the future success of a scientist? It’s an evidence-free zone. Perhaps he should read about how to test social interventions.
It’s certainly evident that Gaskell’s management team can’t use Google. After getting the hint about the use of the Australian rankings, it took me five minutes to discover this statement:
"On 30 May 2011, the decision not to use ranked outlets for ERA 2012 was announced".
It’s more than a year since Kim Carr, at that time the Australian minister for innovation, industry, science and research, announced the result of a review of the way the next Excellence in Research for Australia (ERA) exercise would be conducted by the Australian Research Council (ARC).
“There is clear and consistent evidence that the rankings were being deployed inappropriately within some quarters of the sector, in ways that could produce harmful outcomes, and based on a poor understanding of the actual role of the rankings. One common example was the setting of targets for publication in A and A* journals by institutional research managers.
“their existence was focussing ill-informed, undesirable behaviour in the management of research – I have made the decision to remove the rankings, based on the ARC’s expert advice”.
It seems that Professor Gaskell is unaware of this, since he is enforcing the very “ill-informed, undesirable behaviour in the management of research” that Australia has dropped.
That is what I’d call a major cock-up.
HEFCE also reviewed the role of metrics. Their pilot tests were, almost needless to say, not properly randomised, but the conclusion was much the same as in Australia. Is Gaskell not aware that the REF instructions say:
“No sub-panel will make any use of journal impact factors, rankings, lists or the perceived standing of publishers in assessing the quality of research outputs”
The very people Gaskell seeks to please have strongly condemned his methods.
If that is not “bringing your university into disrepute”, I don’t know what is.
The responses to Simon Gaskell
This week’s Times Higher Education published three letters in response to Gaskell’s article.
The lead letter was signed by thirteen Queen Mary academics. This gives the flavour.
“Here we point to the unintended consequences of restructuring already in evidence. These include undermining morale in the schools and departments concerned; the flight of talented colleagues to other institutions; the consignment of teaching to lecturers in casual employment or those deemed unfit for research; scandalous gender disparity; and the lopsided, counterproductive allocation of resources. When staff are dismissed, replacements can come only from other institutions that have been willing to invest in people, research and scholarship. As a part of normal academic life, mobility is acceptable, even desirable, but when enforced on the scale envisaged at Queen Mary, it is random slaughter offset by poaching.”
My letter described the cock-up over the ARC Journal classification.
The third letter was from Fanis Missirlis, who was co-author of the Lancet letter, Queen Mary: nobody expects the Spanish Inquisition. As a result he was fired at one day’s notice. He is one of the academics whose “careers are destroyed by decimal points in spurious calculations”.
The Queen Mary process was summed up rather well when the University of Sydney went through a similar convulsion. An Australian academic referred to
"retrenchment exercises driven by “crass, bureaucratic, quantifiable simulacra of genuine research”.
Follow-up
August 23 2012.
I see it is now official. One of the great stars of Queen Mary, Lisa Jardine, is leaving Queen Mary (and, as it happens, coming to UCL). Learn more about her in an interview with Laurie Taylor. She discovered lost papers by the great Renaissance scientist Robert Hooke – watch the video.
20 December 2012
It seems that the predicted bad effects are coming true. Times Higher Education carries a letter from a biology undergraduate, Matthew James Erickson, which is highly critical of the effect of Gaskell’s policies on teaching at Queen Mary.
“From my perspective as a first-year undergraduate, the aggressive restructuring has had a profoundly negative effect on my opinion of my university.”
It is rather wonderful when a first-year undergraduate dares to speak truth to power. Well done, Matthew James Erickson.
A lot of people from around the world read my last post, Is Queen Mary University of London trying to commit scientific suicide?. Nonetheless, the mainstream media reach a different audience, and they are still needed. Oddly enough, the Guardian Higher Education section didn’t seem very interested. For a paper that publishes Polly Toynbee and George Monbiot, the education section is surprisingly establishment-orientated. So is Times Higher Education, especially now that the excellent Zoe Corbyn has left.
The Times, however, welcomed it. On Monday 30 July, the following much-shortened version of the blog appeared, in the Thunderer column, no less (again). Two of the most irritating things about writing for papers are the lack of links and the often silly titles chosen by sub-editors rather than the author. [download pdf]
There are moments when the way a university runs its affairs is so boneheaded that it deserves scorn far beyond the world of academia. Queen Mary University of London is selecting which staff to sack from its science departments in a way that I can describe only as insane. The firings, it seems, are nothing to do with hard financial times, but are a ham-fisted attempt to raise Queen Mary’s ranking in the league tables.

A university’s position is directly related to its government research funding. So Queen Mary’s managers hope to do well in the 2014 “Research Excellence Framework” by firing staff who don’t publish a paper every ten minutes. To survive as a professor there you need to have published 11 papers during 2008 to 2011, of which at least two are “high quality”. For lecturers, the target for keeping your job is five papers, of which one is “high quality”. You must also have had at least one PhD student complete their thesis.

What Queen Mary defines as “high quality” is publication in “high-impact journals” (periodicals that get lots of citations). Journals such as Nature and Science get most of their citations from very few articles, so it is utterly brainless to base decisions about the quality of research on such a skewed distribution of citations. But talk of skewed distributions is, no doubt, a bit too technical for innumerate HR people to understand. Which is precisely why they should have nothing to do with assessing scientists.

I have been lucky to know well three Nobel prizewinners. None would have passed the criteria laid down for a professor by QMUL. They would have been fired, and so would Peter Higgs. More offensive still is that you can buy immunity if you have had 26 papers published in 2008-11, with six being “high quality”.

The encouragement to publish reams is daft. If you are publishing a paper every few weeks, you certainly are not writing them, and possibly not even reading them. Most likely you are appending your name to somebody else’s work with little or no checking of the data, let alone contributing real research. It is also deeply unethical for Queen Mary to require all staff to have a PhD student with the aim of raising the university’s ranking rather than of benefitting the student. Like so much managerialism, the rules are an active encouragement to dishonesty.

The dimwitted assessment methods of Queen Mary will guarantee the creation of a generation of second-rate spiv scientists. Who in their right mind would want to work there, now that the way it treats its scientists is public knowledge?

David Colquhoun is Professor of Pharmacology at University College London
Follow-up
August 3 2012 A response from Simon Gaskell appeared in the letter column of The Times.
Publication of research findings is only one criterion in a range of expectations within the realm of academia

Sir, Professor Colquhoun is entitled to question the value of publishing in academic journals and the role this plays in academia (Thunderer, July 30). However, some will wish to understand more of the background of the criticism he levelled at Queen Mary, University of London.

QM is ranked in the top dozen or so research universities in the UK, as judged by the last Research Assessment Exercise. To continue making this contribution and to ensure that our students receive the finest research-led education, we’ve had to address a small number of academic areas where performance doesn’t match expectations. And in a challenging environment for higher education, we need to safeguard QM’s financial stability.

We have applied objective criteria to the assessment of individual academic performance. These criteria are based on generally recognised academic expectations that take account of differences between disciplines and have been applied in a manner that acknowledges the imprecision of any such measures. Publication of research findings was only one criterion.

We are now investing in those areas that have been restructured with a focus on establishing strengths in the medium to long term that will continue to benefit not only our students but broader society and will make best use of our resources, both public and private.

Professor Simon J. Gaskell
I fear that Gaskell has just dug himself deeper into a pit of his own making. Here is the letter I have submitted for publication. I left a similar online comment, which has already appeared.
Sir, Professor Gaskell (Letters, August 3) tries to defend his actions at Queen Mary, University of London by saying “We have applied objective criteria to the assessment of individual academic performance. These criteria are based on generally recognised academic expectations”. Nothing could be further from the truth. It is most certainly not “generally recognised” that you can measure the worth of a scientist by simply counting the number of papers they produce, or by looking at the impact factor of the journal in which they appear. I’m afraid Professor Gaskell appears to be totally out of touch with the literature about such matters. The Research Excellence Framework says explicitly “No sub-panel will make any use of journal impact factors, rankings, lists or the perceived standing of publishers in assessing the quality of research outputs”. Furthermore the REF allows submission of only four papers, and tries to assess their quality. Production of a large number of salami-sliced papers would hinder, not help. His actions appear to harm rather than help his university’s chances in the REF.

Gaskell says also that he wants to “ensure that our students receive the finest research-led education”. But we all know that someone who produces the large number of publications that he demands is unlikely to have either the time or the inclination to teach students too. Then, of course, there is the dubious legality of declaring a person “redundant” while advertising an essentially identical job. The word for his process is firing, not redundancy. I see no reason to change my view (Thunderer, July 30) that Professor Gaskell is bringing his university into disrepute.

David Colquhoun
August 3 2012. A correspondent wrote to me to point out an article that described the management methods that led to a lost decade at Microsoft. They are remarkably similar to those being imposed at Queen Mary.
"It leads to employees focusing on competing with each other rather than competing with other companies.”
It’s unfortunate that university managers so often seem to latch on to ideas that have already been discredited in industry.
August 6 2012. The Times hasn’t published my response, but today they did publish a letter from Professor Gavin Vinson, of Queen Mary. I was too kind to mention the obvious absurdity of Gaskell’s first sentence. Now it has been done.
Published at 12:01AM, August 6 2012
Judging scientists

No one disputes the value of publishing in science and academia. However, all opinions need to have a basis in fact to support them.

Sir, Professor Simon Gaskell (letter, Aug 3) says, “Professor Colquhoun is entitled to question the value of publishing in academic journals and the role this plays in academia” (Thunderer, July 30). Try as I might, I have failed to find the basis for this statement in Professor Colquhoun’s article, or indeed anywhere else come to that. No one disputes the value of publishing in science and academia. What is in dispute is the use of spurious metrics in evaluating scientists.

Professor Gavin P. Vinson
Academic staff are going to be fired at Queen Mary University of London (QMUL). It’s possible that universities may have to contract a bit in hard times, so what’s wrong?
What’s wrong is that the victims are being selected in a way that I can describe only as insane. The criteria they use are guaranteed to produce a generation of second-rate spiv scientists, with a consequent progressive decline in QMUL’s reputation.
The firings, it seems, are nothing to do with hard financial times, but are a result of QMUL’s aim to raise its ranking in university league tables.
In the UK university league table, a university’s position is directly related to its government research funding. So they need to do well in the 2014 ‘Research Excellence Framework’ (REF). To achieve that they plan to recruit new staff with high research profiles, take on more PhD students and post-docs, obtain more research funding from grants, and get rid of staff who are not doing ‘good’ enough research.
So far, that’s exactly what every other university is trying to do. This sort of distortion is one of the harmful side-effects of the REF. But what’s particularly stupid about QMUL’s behaviour is the way they are going about it. You can assess your own chances of survival at QMUL’s School of Biological and Chemical Sciences from the following table, which is taken from an article by Jeremy Garwood (Lab Times Online, July 4, 2012). The numbers refer to the four-year period from 2008 to 2011.
Category of staff | Research Output – quantity | Research Output – quality | Research Income (£) | Research Income (£)
Professor | 11 | 2 | 400,000 | at least 200,000
Reader | 9 | 2 | 320,000 | at least 150,000
Senior Lecturer | 7 | 1 | 260,000 | at least 120,000
Lecturer | 5 | 1 | 200,000 | at least 100,000
In addition to the three criteria, ‘Research Output – quality’, ‘Research Output – quantity’, and ‘Research Income’, there is a minimum threshold of 1 PhD completion for staff at each academic level. All this data is “evidenced by objective metrics; publications cited in Web of Science, plus official QMUL metrics on grant income and PhD completion.” To survive, staff must meet the minimum threshold in three out of the four categories, except as follows: demonstration of activity at an exceptional level in either ‘research outputs’ or ‘research income’, termed an ‘enhanced threshold’, is “sufficient” to justify selection regardless of levels of activity in the other two categories. And what are these enhanced thresholds?
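To make the logic of that rule concrete, here is a hypothetical sketch of how it might be encoded, using the professorial thresholds from the table. The function name and the collapse of the two income columns into a single total are my own simplifications, and the enhanced thresholds are left as parameters because they are not given in the passage quoted above.

```python
# Hypothetical encoding of the QMUL survival rule described above (professorial
# thresholds from the table; an illustration, not an official algorithm).

def professor_survives(n_outputs, n_high_quality, income, phd_completions,
                       enhanced_outputs=None, enhanced_income=None):
    met = [
        n_outputs >= 11,        # Research Output - quantity
        n_high_quality >= 2,    # Research Output - quality
        income >= 400_000,      # Research Income (GBP), over the 2008-2011 period
        phd_completions >= 1,   # at least one PhD completion
    ]
    # "Enhanced threshold": exceptional activity in outputs or income alone is "sufficient".
    if enhanced_outputs is not None and n_outputs >= enhanced_outputs:
        return True
    if enhanced_income is not None and income >= enhanced_income:
        return True
    # Otherwise the minimum threshold must be met in three of the four categories.
    return sum(met) >= 3
```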
The university notes that the above criteria “are useful as entry standards into the new school, but they fall short of the levels of activity that will be expected from staff in the future. These metrics should not, therefore, be regarded as targets for future performance.” This means that those who survived the redundancy criteria will simply have to do better. But what is to reassure them that it won’t be their turn next time, should they fail to match the numbers?

To help them, Queen Mary is proposing to introduce ‘D3’ performance management (www.unions.qmul.ac.uk/ucu/docs/d3-part-one.doc). Based on more ‘administrative physics’, D3 is shorthand for ‘Direction × Delivery × Development’. Apparently “all three are essential to a successful team or organisation. The multiplication indicates that where one is absent/zero, then the sum is zero!” D3 is based on principles of accountability: “A sign of a mature organisation is where its members acknowledge that they face choices, they make commitments and are ready to be held to account for discharging these commitments, accepting the consequences rather than seeking to pass responsibility.” Inspired?
I presume the D3 document must have been written by an HR person. It has all the incoherent use of buzzwords so typical of HR. And it says "sum" when it means "product" (oh dear, innumeracy is rife).
The criteria are utterly brainless. The use of impact factors for assessing people has been discredited at least since Seglen (1997) showed that the number of citations a paper gets is not perceptibly correlated with the impact factor of the journal in which it’s published. The reason for this is that the distribution of the number of citations for papers in a particular journal is enormously skewed. This means that high-impact journals get most of their citations from a few articles.
The distribution for Nature is shown in Fig. 1. Far from being Gaussian, it is even more skewed than a geometric distribution: the mean number of citations is 114, but 69% of papers have fewer than the mean, and 24% have fewer than 30 citations. One paper has 2,364 citations but 35 have 10 or fewer. ISI data for citations in 2001 of the 858 papers published in Nature in 1999 show that the 80 most-cited papers (16% of all papers) account for half of all the citations (from Colquhoun, 2003).
[Figure 1: the distribution of citations in 2001 for the 858 papers published in Nature in 1999.]
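To see why a journal-level mean is such a poor guide to individual papers, here is a small illustrative simulation. It does not use the real Nature data: the log-normal shape and its parameters are my own assumptions, chosen only so that the mean comes out near 114, as in the figures above.

```python
# Illustrative only: a log-normal distribution stands in for a skewed citation count.
# Parameters are assumptions, tuned so the mean is roughly 114 (cf. Nature, 1999 papers).
import numpy as np

rng = np.random.default_rng(0)
citations = rng.lognormal(mean=4.24, sigma=1.0, size=858).round()

print(f"mean   = {citations.mean():.0f}")
print(f"median = {np.median(citations):.0f}")
print(f"papers below the mean = {(citations < citations.mean()).mean():.0%}")
# With this much skew, roughly two thirds of papers sit below the journal's mean,
# so the average (which is what the impact factor reflects) describes almost nobody.
```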
The Institute for Scientific Information (ISI) is guilty of the unsound statistical practice of characterising a distribution by its mean only, with no indication of its shape or even its spread. The School of Biological and Chemical Sciences at QMUL is expecting everyone to be above average in the new regime. Anomalously, the thresholds for psychologists are lower, because it is said that it’s more difficult for them to get grants. This undermines even the twisted logic applied at the outset.
All this stuff about skewed distributions is, no doubt, a bit too technical for HR people to understand. Which, of course, is precisely why they should have nothing to do with assessing people.
At a time when so many PhDs fail to get academic jobs, we should be limiting the numbers. But QMUL requires everyone to have a PhD student, not for the benefit of the student, but to increase its standing in league tables. That is deeply unethical.
The demand to have two papers in journals with an impact factor greater than seven is nonsense. In physiology, for example, there are only four journals with an impact factor greater than seven, and three of them are review journals that don’t publish original research. The two best journals for electrophysiology are the Journal of Physiology (impact factor 4.98 in 2010) and the Journal of General Physiology (IF 4.71). These are the journals that publish the papers that get you into the Royal Society, or even Nobel prizes. But for QMUL, they don’t count.
I have been lucky to know well three Nobel prize winners: Andrew Huxley, Bernard Katz, and Bert Sakmann. I doubt that any of them would pass the criteria laid down for a professor by QMUL. They would have been fired.
The case of Sakmann is analysed in How to Get Good Science [pdf version]. In the 10 years from 1976 to 1985, when Sakmann rose to fame, he published an average of 2.6 papers per year (range 0 to 6). In two of those 10 years he had no publications at all. In the four-year period (1976–1979) that started with the paper that brought him to fame (Neher & Sakmann, 1976) he published 9 papers, just enough for the Reader grade, but in the four years from 1979 to 1982 he had 6 papers, in 2 of which he was neither first nor last author. His job would have been in danger if he’d worked at QMUL. In 1991 Sakmann, with Erwin Neher, won the Nobel Prize in Physiology or Medicine.
The most offensive thing of the lot is the way you can buy yourself out if you publish 26 papers in the 4 year period. Sakmann came nowhere near this. And my own total, for the entire time from my first paper (1963) until I was elected to the Royal Society (May 1985) was 27 papers (and 7 book chapters). I would have been fired.
Peter Higgs had no papers at all from the time he moved to Edinburgh in 1960 until 1964, when his two papers on what’s now called the Higgs boson were published in Physics Letters. That journal now has an impact factor of less than 7, so Queen Mary would not have counted them as “high quality” papers, and he would not have been returnable for the REF. He too would have been fired.
The encouragement to publish large numbers of papers is daft. I have seen people rejected from the Royal Society for publishing too much. If you are publishing a paper every six weeks, you certainly aren’t writing them, and possibly not even reading them. Most likely you are appending your name to somebody else’s work with little or no checking of the data. Such numbers can be reached only by unethical behaviour, as described by Peter Lawrence in The Mismeasurement of Science. Like so much managerialism, the rules provide an active encouragement to dishonesty.
In the face of such a boneheaded approach to assessment of your worth, it’s the duty of any responsible academic to point out the harm that’s being done to the College. Richard Horton, in the Lancet, did so in Bullying at Barts. There followed quickly letters from Stuart McDonald and Nick Wright, who used the Nuremberg defence, pointing out that the Dean (Tom Macdonald) was just obeying orders from above. That has never been an acceptable defence. If Macdonald agreed with the procedure, he should be fired for incompetence. If he did not agree with it, he should have resigned.
It’s a pity, because Tom Macdonald was one of the people with whom I corresponded in support of Barts’ students who, very reasonably, objected to having course work marked by homeopaths (see St Bartholomew’s teaches antiscience, but students revolt, and, later, Bad medicine. Barts sinks further into the endarkenment). In that case he was not unreasonable and, a mere two years later, I heard that he’d taken action.
To cap it all, two academics did their job by applying a critical eye to what’s going on at Queen Mary. They wrote to the Lancet under the title Queen Mary: nobody expects the Spanish Inquisition
"For example, one of the “metrics” for research output at professorial level is to have published at least two papers in journals with impact factors of 7 or more. This is ludicrous, of course—a triumph of vanity as sensible as selecting athletes on the basis of their brand of track suit. But let us follow this “metric” for a moment. How does the Head of School fair? Zero, actually. He fails. Just consult Web of Science. Take care though, the result is classified information. HR’s “data” are marked Private and Confidential. Some things must be believed. To question them is heresy."
Astoundingly, the people who wrote this piece are now under investigation for “gross misconduct”. This is behaviour worthy of the University of Poppleton, as pointed out by the inimitable Laurie Taylor, in Times Higher Education (June 7)
The rustle of censorship

It appears that last week’s edition of our sister paper, The Poppleton Evening News, carried a letter from Dr Gene Ohm of our Biology Department criticising this university’s metrics-based redundancy programme. We now learn that, following the precedent set by Queen Mary, University of London, Dr Ohm could be found guilty of “gross misconduct” and face “disciplinary proceedings leading to dismissal” for having the effrontery to raise such issues in a public place. Louise Bimpson, the corporate director of our ever-expanding human resources team, admitted that this response might appear “severe” but pointed out that Poppleton was eager to follow the disciplinary practices set by such soon-to-be members of the prestigious Russell Group as Queen Mary. Thus it was only to be expected that we would seek to emulate its espousal of draconian censorship. She hoped this clarified the situation.
David Bignell, emeritus professor of zoology at Queen Mary, hit the nail on the head:
"These managers worry me. Too many are modest achievers, retired from their own studies, intoxicated with jargon, delusional about corporate status and forever banging the metrics gong. Crucially, they don’t lead by example."
What the managers at Queen Mary have failed to notice is that the best academics can choose where to go.
People are being told to pack their bags and move out with one day’s notice. Access to journals stopped, email address removed, and you may need to be accompanied to your (ex)-office. Good scientists are being treated like criminals.
What scientist in their right mind would want to work at QMUL, now that their dimwitted assessment methods, and their bullying tactics, are public knowledge?
The responsibility must lie with the principal, Simon Gaskell. And we know what the punishment is for bringing your university into disrepute.
Follow-up
Send an email. You may want to join the many people who have already written to QMUL’s principal, Simon Gaskell (principal@qmul.ac.uk), and/or to Sir Nicholas Montagu, Chairman of Council, n.montagu@qmul.ac.uk.
Sunday 1 July 2012. Since this blog was posted after lunch on Friday 29th June, it has had around 9000 visits from 72 countries. Here is one of 17 maps showing the origins of 200 of the hits in the last two days
The tweets about QMUL are collected in a Storify timeline.
I’m reminded of a 2008 comment, on a post about the problems imposed by HR, In-human resources, science and pizza.
Thanks for that – I LOVED IT. It’s fantastic that the truth of HR (I truly hate that phrase) has been so ruthlessly exposed. Should be part of the School Handbook. Any VC who stripped out all the BS would immediately retain and attract good people and see their productivity soar.
That’s advice that Queen Mary should heed.
Part of the reason for that popularity was Ben Goldacre’s tweet, to his 201,000 followers
“destructive, unethical and crude metric incentives in academia (spotlight QMUL) bit.ly/MFHk2H by @david_colquhoun”
3 July 2012. I have come by a copy of this email, which was sent to Queen Mary by a senior professor from the USA (word travels fast on the web). It shows just how easy it is to destroy the reputation of an institution.
Sir Nicholas Montagu, Chairman of Council, and Principal Gaskell,

I was appalled to read the criteria devised by your University to evaluate its faculty. They are so flawed it is hard to know where to begin.

Your criteria are antithetical to good scientific research. The journals are littered with weak publications, which are generated mainly by scientists who feel the pressure to publish, no matter whether the results are interesting, valid, or meaningful. The literature is flooded by the sheer volume of these publications. Your attempt to require “quality” research is provided by the requirement for publications in “high Impact Factor” journals. IF has been discredited among scientists for many reasons: it is inaccurate in not actually reflecting the merit of the specific paper, it is biased toward fields with lots of scientists, etc.

The demand for publications in absurdly high IF journals encourages, and practically enforces, scientific fraud. I have personally experienced those reviews from Nature demanding one or two more “final” experiments that will clinch the publication. The authors KNOW how these experiments MUST turn out. If they want their Nature paper (and their very academic survival if they are at a brutal, anti-scientific university like QMUL), they must get the “right” answer. The temptation to fudge the data to get this answer is extreme. Some scientists may even be able to convince themselves that each contrary piece of data that they discard to ensure the “correct” answer is being discarded for a valid reason. But the result is that scientific misconduct occurs. I did not see in your criteria for “success” at QMUL whether you discount retracted papers from the tally of high IF publications, or perhaps the retraction itself counts as yet another high IF publication!

Your requirement for each faculty member to have one or more postdocs or students promotes the abusive exploitation of these individuals for their cheap labor, and ignores the fact that they are being “trained” for jobs that do not exist. The “standards” you set are fantastically unrealistic. For example, funding is not graded, but a sharp step function – we have 1 or 2 or 0 grants, and even if the average is above your limits, no one could sustain this continuously.

Once you have fired every one of your faculty, which will almost certainly happen within 1-2 rounds of pogroms, where will you find legitimate scientists who are willing to join such a ludicrous University?
4 July 2012.
Professor John F. Allen is Professor of Biochemistry at Queen Mary, University of London, and distinguished in the fields of photosynthesis, chloroplasts, mitochondria, genome function and evolution, and redox signalling. He, with a younger colleague, wrote a letter to the Lancet, Queen Mary: nobody expects the Spanish Inquisition. It is an admirable letter, the sort of thing any self-respecting academic should write. But not according to HR. On 14 May, Allen got a letter from HR, which starts thus:
14th May 2012

Dear Professor Allen

I am writing to inform you that the College had decided to commence a factfinding investigation into the below allegation: That in writing and/or signing your name to a letter entitled "Queen Mary: nobody expects the Spanish Inquisition," (enclosed) which was published in the Lancet online on 4th May 2012, you sought to bring the Head of School of Biological and Chemical Sciences and the Dean for Research in the School of Medicine and Dentistry into disrepute. . . . .

Sam Holborn
Download the entire letter. It is utterly disgraceful bullying. If anyone is bringing Queen Mary into disrepute, it is Sam Holborn and the principal, Simon Gaskell.
Here’s another letter, from the many that have been sent. This is from a researcher in the Netherlands.
Dear Sir Nicholas,
I am addressing this to you in the hope that you were not directly involved in creating this extremely stupid set of measures that have been thought up, not to improve the conduct of science at QMUL, but to cheat QMUL’s way up the league tables over the heads of the existing academic staff. Others have written more succinctly about the crass stupidity of your Human Resources department than I could, and their apparent ignorance of how science actually works. As your principal must bear full responsibility for the introduction of these measures, I am not sending him a copy of this mail. I am pretty sure that his “principal” mail address will no longer be operative.

We have had a recent scandal in the Netherlands where a social psychology professor, who even won a national “Man of the Year” award, as well as a very large amount of research money, was recently exposed as having faked all the data that went into a total number of articles running into three figures. This is not the sort of thing one wants to happen to one’s own university. He would have done well according to your REF … before he was found out.

Human Resources departments have gained too much power, and are completely incompetent when it comes to judging academic standards. Let them get on with the old, dull, gobbledigook-free tasks that personnel departments should be carrying out.
5 July 2012.
Here’s another letter. It’s from a member of academic staff at QMUL, someone who is not himself threatened with being fired. It certainly shows that I’m not making a fuss about nothing. Rather, I’m the only person old enough to say what needs to be said without fear of losing my job and my house.
Dear Prof. Colquhoun,
I am an academic staff member in SBCS, QMUL. I am writing from my personal email account because the risks of using my work account to send this email are too great. I would like to thank you for highlighting our problems and how we have been treated by our employer (Queen Mary University of London), in your blog. I would please urge you to continue to tweet and blog about our plight, and staff in other universities experiencing similarly horrific working conditions. I am not threatened with redundancy by QMUL, and in fact my research is quite successful. Nevertheless, the last nine months have been the most stressful of all my years of academic life. The best of my colleagues in SBCS, QMUL are leaving already and I hope to leave, if I can find another job in London. Staff do indeed feel very unfairly treated, intimidated and bullied. I never thought a job at a university could come to this.
Thank you again for your support. It really does matter to the many of us who cannot really speak out openly at present.
Best regards,
In a later letter, the same person pointed out
"There are many of us who would like to speak more openly, but we simply cannot."
"I have mortgage . . . . Losing my job would probably mean losing my home too at this point."
"The plight of our female staff has not even been mentioned. We already had very few female staff. And with restructuring, female staff are more likely to be forced into teaching-only contracts or indeed fired"."
"total madness in the current climate – who would want to join us unless desperate for a job!"
“fuss about nothing” – absolutely not. It is potentially a perfect storm leading to teaching and research disaster for a university! Already the reputation of our university has been greatly damaged. And senior staff keep blaming and targeting the “messengers”.
6 July 2012.
Through the miracle of WiFi, this is coming from Newton, MA. The Lancet today has another editorial on the Queen Mary scandal.
"As hopeful scientists prepare their applications to QMUL, they should be aware that, behind the glossy advertising, a sometimes harsh, at times repressive, and disturbingly unforgiving culture awaits them."
That sums it up nicely.
24 July 2012. I’m reminded by the Nature writer Richard van Noorden (@Richvn) that Nature itself has written at least twice about the iniquity of judging people by impact factors. In 2005, Not-so-deep impact said:
"Only 50 out of the roughly 1,800 citable items published in those two years received more than 100 citations in 2004. The great majority of our papers received fewer than 20 citations."
"None of this would really matter very much, were it not for the unhealthy reliance on impact factors by administrators and researchers’ employers worldwide to assess the scientific quality of nations and institutions, and often even to judge individuals."
And, more recently, in “Assessing assessment” (2010).
29 July 2012. Jonathan L. Rees, of the University of Edinburgh, ends his blog:
"I wonder what career advice I should offer to a young doctor circa 2012. Apart from not taking a job at Queen Mary of course. "
How to select candidates
I have, at various times, been asked how I would select candidates for a job, if not by counting papers and impact factors. This is a slightly modified version of a comment that I left on a blog, and it describes roughly what I’d advocate.
After a pilot study, the Research Excellence Framework (which attempts to assess the quality of research in every UK university) made the following statement:
“No sub-panel will make any use of journal impact factors, rankings, lists or the perceived standing of publishers in assessing the quality of research outputs”
It seems that the REF is paying attention to the science, not to the bibliometricians.
It has been the practice at UCL to ask candidates to nominate their best papers (2–4 papers, depending on age). We then read the papers and ask the candidates hard questions about them (not least about the methods section). It’s a method that I learned a long time ago from Stephen Heinemann, a senior scientist at the Salk Institute. It’s often been surprising to learn how little some candidates know about the contents of papers that they themselves select as their best. One aim of this is to find out how much the candidate understands the principles of what they are doing, as opposed to following a recipe.
Of course we also seek the opinions of people who know the work, and preferably know the person. Written references have suffered so much from ‘grade inflation’ that they are often worthless, but a talk on the telephone with someone who knows both the work and the candidate can be useful. That, however, is now banned by HR, who seem to feel that any knowledge of the candidate’s ability would lead to bias.
It is not true that use of metrics is universal and thank heavens for that. There are alternatives and we use them.
Incidentally, the reason that I have described the Queen Mary procedures as insane, brainless and dimwitted is because their aim to increase their ratings is likely to be frustrated. No person in their right mind would want to work for a place that treats its employees like that, if they had any other option. And it is very odd that their attempt to improve their REF rating uses criteria that have been explicitly ruled out by the REF. You can’t get more brainless than that.
This discussion has been interesting to me, if only because it shows how little bibliometricians understand how to get good science.