Friday, 28 October 2016

The allure of autism for researchers

Data on NIH spending (in $K) on research into neurodevelopmental disorders: from Bishop, D. V. M. (2010). Which neurodevelopmental disorders get researched and why? PLOS One, 5(11), e15112. doi: 10.1371/journal.pone.0015112

Every year I hear from students interested in doing postgraduate study with me at Oxford. Most of them express a strong research interest in autism spectrum disorder (ASD). At one level, this is not surprising: if you want to work on autism and you look at the University website, you will find me listed as one of the people affiliated with the Oxford Autism Research Centre. But if you look at my publication list, you will find that autism research is a rather minor part of what I do: 13% of my papers have autism as a keyword, and only 6% have autism or ASD in the title. And where I have published on autism, it is usually in the context of comparing language in ASD with developmental language disorder (DLD, aka specific language impairment, SLI). Indeed, in the publication referenced in the graph above, I concluded that a disproportionate amount of research, and of research funding, was going to ASD relative to other neurodevelopmental disorders.

Now, I don't want to knock autism research. ASD is an intriguing condition which can have major effects on the lives of affected individuals and their families. It was great to see the recent publication of a study by Jonathan Green and his colleagues showing that a parent-based treatment with autistic toddlers could produce a long-lasting reduction in symptom severity. Conducting a rigorous study of this size is hugely difficult and only possible with substantial research funding.

But I do wonder why there is such a skew in interest towards autism, when many children have other developmental disorders with long-term impacts. Where are all the enthusiastic young researchers who want to work on developmental language disorders? Why is it that children with general learning disabilities (intellectual disability) are so often excluded from research, or relegated to being a control group against which ASD is assessed?

Together with colleagues Becky Clark, Gina Conti-Ramsden, Maggie Snowling, and Courtenay Norbury, I started the RALLI campaign in 2012 to raise awareness of children’s language impairments, mainly focused on a YouTube channel where we post videos providing brief summaries of key information, with links to more detailed evidence. This year we also completed a study that brought together a multidisciplinary, multinational panel of experts with the goal of producing consensus statements on criteria and terminology for children’s language disorders – leading to one published paper and another currently in preprint stage. We hope that increased consistency in how we define and refer to developmental language disorders will lead to improved recognition.

We still have a long way to go in raising awareness. I doubt we will ever achieve a level of interest to parallel that of autism, and I suspect this is because autism fascinates: it appears to involve not just cognitive deficits, but a qualitatively different way of thinking and interacting with the world. But I would urge those considering pursuing research in this field to think more broadly and recognise that there are many fascinating conditions about which we still know very little. Finding ways to understand and eventually ameliorate language problems or learning disabilities could help a huge number of children, and we need more of our brightest and best students to recognise this potential.

Sunday, 16 October 2016

Bishopblog catalogue (updated 16 Oct 2016)

Source: http://www.weblogcartoons.com/2008/11/23/ideas/

Those of you who follow this blog may have noticed a lack of thematic coherence. I write about whatever is exercising my mind at the time, which can range from technical aspects of statistics to the design of bathroom taps. I decided it might be helpful to introduce a bit of order into this chaotic melange, so here is a catalogue of posts by topic.

Language impairment, dyslexia and related disorders
The common childhood disorders that have been left out in the cold (1 Dec 2010) What's in a name? (18 Dec 2010) Neuroprognosis in dyslexia (22 Dec 2010) Where commercial and clinical interests collide: Auditory processing disorder (6 Mar 2011) Auditory processing disorder (30 Mar 2011) Special educational needs: will they be met by the Green paper proposals? (9 Apr 2011) Is poor parenting really to blame for children's school problems? (3 Jun 2011) Early intervention: what's not to like? (1 Sep 2011) Lies, damned lies and spin (15 Oct 2011) A message to the world (31 Oct 2011) Vitamins, genes and language (13 Nov 2011) Neuroscientific interventions for dyslexia: red flags (24 Feb 2012) Phonics screening: sense and sensibility (3 Apr 2012) What Chomsky doesn't get about child language (3 Sept 2012) Data from the phonics screen (1 Oct 2012) Auditory processing disorder: schisms and skirmishes (27 Oct 2012) High-impact journals (Action video games and dyslexia: critique) (10 Mar 2013) Overhyped genetic findings: the case of dyslexia (16 Jun 2013) The arcuate fasciculus and word learning (11 Aug 2013) Changing children's brains (17 Aug 2013) Raising awareness of language learning impairments (26 Sep 2013) Good and bad news on the phonics screen (5 Oct 2013) What is educational neuroscience? (25 Jan 2014) Parent talk and child language (17 Feb 2014) My thoughts on the dyslexia debate (20 Mar 2014) Labels for unexplained language difficulties in children (23 Aug 2014) International reading comparisons: Is England really doing so poorly? (14 Sep 2014) Our early assessments of schoolchildren are misleading and damaging (4 May 2015) Opportunity cost: a new red flag for evaluating interventions (30 Aug 2015)

Autism
Autism diagnosis in cultural context (16 May 2011) Are our ‘gold standard’ autism diagnostic instruments fit for purpose? (30 May 2011) How common is autism? (7 Jun 2011) Autism and hypersystematising parents (21 Jun 2011) An open letter to Baroness Susan Greenfield (4 Aug 2011) Susan Greenfield and autistic spectrum disorder: was she misrepresented? (12 Aug 2011) Psychoanalytic treatment for autism: Interviews with French analysts (23 Jan 2012) The ‘autism epidemic’ and diagnostic substitution (4 Jun 2012) How wishful thinking is damaging Peta's cause (9 June 2014)

Developmental disorders/paediatrics
The hidden cost of neglected tropical diseases (25 Nov 2010) The National Children's Study: a view from across the pond (25 Jun 2011) The kids are all right in daycare (14 Sep 2011) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Changing the landscape of psychiatric research (11 May 2014)

Genetics
Where does the myth of a gene for things like intelligence come from? (9 Sep 2010) Genes for optimism, dyslexia and obesity and other mythical beasts (10 Sep 2010) The X and Y of sex differences (11 May 2011) Review of How Genes Influence Behaviour (5 Jun 2011) Getting genetic effect sizes in perspective (20 Apr 2012) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Genes, brains and lateralisation (22 Dec 2012) Genetic variation and neuroimaging (11 Jan 2013) Have we become slower and dumber? (15 May 2013) Overhyped genetic findings: the case of dyslexia (16 Jun 2013) Incomprehensibility of much neurogenetics research ( 1 Oct 2016)

Neuroscience
Neuroprognosis in dyslexia (22 Dec 2010) Brain scans show that… (11 Jun 2011) Time for neuroimaging (and PNAS) to clean up its act (5 Mar 2012) Neuronal migration in language learning impairments (2 May 2012) Sharing of MRI datasets (6 May 2012) Genetic variation and neuroimaging (11 Jan 2013) The arcuate fasciculus and word learning (11 Aug 2013) Changing children's brains (17 Aug 2013) What is educational neuroscience? (25 Jan 2014) Changing the landscape of psychiatric research (11 May 2014) Incomprehensibility of much neurogenetics research (1 Oct 2016)

Reproducibility
Accentuate the negative (26 Oct 2011) Novelty, interest and replicability (19 Jan 2012) High-impact journals: where newsworthiness trumps methodology (10 Mar 2013) Who's afraid of open data? (15 Nov 2015) Blogging as post-publication peer review (21 Mar 2013) Research fraud: More scrutiny by administrators is not the answer (17 Jun 2013) Pressures against cumulative research (9 Jan 2014) Why does so much research go unpublished? (12 Jan 2014) Replication and reputation: Whose career matters? (29 Aug 2014) Open code: not just data and publications (6 Dec 2015) Why researchers need to understand poker (26 Jan 2016) Reproducibility crisis in psychology (5 Mar 2016) Further benefit of registered reports (22 Mar 2016) Would paying by results improve reproducibility? (7 May 2016) Serendipitous findings in psychology (29 May 2016) Thoughts on the Statcheck project (3 Sep 2016)

Statistics
Book review: biography of Richard Doll (5 Jun 2010) Book review: the Invisible Gorilla (30 Jun 2010) The difference between p < .05 and a screening test (23 Jul 2010) Three ways to improve cognitive test scores without intervention (14 Aug 2010) A short nerdy post about the use of percentiles (13 Apr 2011) The joys of inventing data (5 Oct 2011) Getting genetic effect sizes in perspective (20 Apr 2012) Causal models of developmental disorders: the perils of correlational data (24 Jun 2012) Data from the phonics screen (1 Oct 2012) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Flaky chocolate and the New England Journal of Medicine (13 Nov 2012) Interpreting unexpected significant results (7 June 2013) Data analysis: Ten tips I wish I'd known earlier (18 Apr 2014) Data sharing: exciting but scary (26 May 2014) Percentages, quasi-statistics and bad arguments (21 July 2014) Why I still use Excel (1 Sep 2016)

Journalism/science communication
Orwellian prize for scientific misrepresentation (1 Jun 2010) Journalists and the 'scientific breakthrough' (13 Jun 2010) Science journal editors: a taxonomy (28 Sep 2010) Orwellian prize for journalistic misrepresentation: an update (29 Jan 2011) Academic publishing: why isn't psychology like physics? (26 Feb 2011) Scientific communication: the Comment option (25 May 2011)  Publishers, psychological tests and greed (30 Dec 2011) Time for academics to withdraw free labour (7 Jan 2012) 2011 Orwellian Prize for Journalistic Misrepresentation (29 Jan 2012) Time for neuroimaging (and PNAS) to clean up its act (5 Mar 2012) Communicating science in the age of the internet (13 Jul 2012) How to bury your academic writing (26 Aug 2012) High-impact journals: where newsworthiness trumps methodology (10 Mar 2013)  A short rant about numbered journal references (5 Apr 2013) Schizophrenia and child abuse in the media (26 May 2013) Why we need pre-registration (6 Jul 2013) On the need for responsible reporting of research (10 Oct 2013) A New Year's letter to academic publishers (4 Jan 2014) Journals without editors: What is going on? (1 Feb 2015) Editors behaving badly? (24 Feb 2015) Will Elsevier say sorry? (21 Mar 2015) How long does a scientific paper need to be? (20 Apr 2015) Will traditional science journals disappear? (17 May 2015) My collapse of confidence in Frontiers journals (7 Jun 2015) Publishing replication failures (11 Jul 2015) Psychology research: hopeless case or pioneering field? (28 Aug 2015) Desperate marketing from J. Neuroscience ( 18 Feb 2016) Editorial integrity: publishers on the front line ( 11 Jun 2016)

Social Media
A gentle introduction to Twitter for the apprehensive academic (14 Jun 2011) Your Twitter Profile: The Importance of Not Being Earnest (19 Nov 2011) Will I still be tweeting in 2013? (2 Jan 2012) Blogging in the service of science (10 Mar 2012) Blogging as post-publication peer review (21 Mar 2013) The impact of blogging on reputation ( 27 Dec 2013) WeSpeechies: A meeting point on Twitter (12 Apr 2014) Email overload ( 12 Apr 2016)

Academic life
An exciting day in the life of a scientist (24 Jun 2010) How our current reward structures have distorted and damaged science (6 Aug 2010) The challenge for science: speech by Colin Blakemore (14 Oct 2010) When ethics regulations have unethical consequences (14 Dec 2010) A day working from home (23 Dec 2010) Should we ration research grant applications? (8 Jan 2011) The one hour lecture (11 Mar 2011) The expansion of research regulators (20 Mar 2011) Should we ever fight lies with lies? (19 Jun 2011) How to survive in psychological research (13 Jul 2011) So you want to be a research assistant? (25 Aug 2011) NHS research ethics procedures: a modern-day Circumlocution Office (18 Dec 2011) The REF: a monster that sucks time and money from academic institutions (20 Mar 2012) The ultimate email auto-response (12 Apr 2012) Well, this should be easy…. (21 May 2012) Journal impact factors and REF2014 (19 Jan 2013) An alternative to REF2014 (26 Jan 2013) Postgraduate education: time for a rethink (9 Feb 2013) Ten things that can sink a grant proposal (19 Mar 2013) Blogging as post-publication peer review (21 Mar 2013) The academic backlog (9 May 2013) Discussion meeting vs conference: in praise of slower science (21 Jun 2013) Why we need pre-registration (6 Jul 2013) Evaluate, evaluate, evaluate (12 Sep 2013) High time to revise the PhD thesis format (9 Oct 2013) The Matthew effect and REF2014 (15 Oct 2013) The University as big business: the case of King's College London (18 June 2014) Should vice-chancellors earn more than the prime minister? (12 July 2014) Some thoughts on use of metrics in university research assessment (12 Oct 2014) Tuition fees must be high on the agenda before the next election (22 Oct 2014) Blaming universities for our nation's woes (24 Oct 2014) Staff satisfaction is as important as student satisfaction (13 Nov 2014) Metricophobia among academics (28 Nov 2014) Why evaluating scientists by grant income is stupid (8 Dec 2014) Dividing up the pie in relation to REF2014 (18 Dec 2014) Shaky foundations of the TEF (7 Dec 2015) A lamentable performance by Jo Johnson (12 Dec 2015) More misrepresentation in the Green Paper (17 Dec 2015) The Green Paper's level playing field risks becoming a morass (24 Dec 2015) NSS and teaching excellence: wrong measure, wrongly analysed (4 Jan 2016) Lack of clarity of purpose in REF and TEF (2 Mar 2016) Who wants the TEF? (24 May 2016) Cost benefit analysis of the TEF (17 Jul 2016) Alternative providers and alternative medicine (6 Aug 2016)

Celebrity scientists/quackery
Three ways to improve cognitive test scores without intervention (14 Aug 2010) What does it take to become a Fellow of the RSM? (24 Jul 2011) An open letter to Baroness Susan Greenfield (4 Aug 2011) Susan Greenfield and autistic spectrum disorder: was she misrepresented? (12 Aug 2011) How to become a celebrity scientific expert (12 Sep 2011) The kids are all right in daycare (14 Sep 2011)  The weird world of US ethics regulation (25 Nov 2011) Pioneering treatment or quackery? How to decide (4 Dec 2011) Psychoanalytic treatment for autism: Interviews with French analysts (23 Jan 2012) Neuroscientific interventions for dyslexia: red flags (24 Feb 2012) Why most scientists don't take Susan Greenfield seriously (26 Sept 2014)

Women
Academic mobbing in cyberspace (30 May 2010) What works for women: some useful links (12 Jan 2011) The burqua ban: what's a liberal response (21 Apr 2011) C'mon sisters! Speak out! (28 Mar 2012) Psychology: where are all the men? (5 Nov 2012) Should Rennard be reinstated? (1 June 2014) How the media spun the Tim Hunt story (24 Jun 2015)

Politics and Religion
Lies, damned lies and spin (15 Oct 2011) A letter to Nick Clegg from an ex liberal democrat (11 Mar 2012) BBC's 'extensive coverage' of the NHS bill (9 Apr 2012) Schoolgirls' health put at risk by Catholic view on vaccination (30 Jun 2012) A letter to Boris Johnson (30 Nov 2013) How the government spins a crisis (floods) (1 Jan 2014)

Humour and miscellaneous
Orwellian prize for scientific misrepresentation (1 Jun 2010) An exciting day in the life of a scientist (24 Jun 2010) Science journal editors: a taxonomy (28 Sep 2010) Parasites, pangolins and peer review (26 Nov 2010) A day working from home (23 Dec 2010) The one hour lecture (11 Mar 2011) The expansion of research regulators (20 Mar 2011) Scientific communication: the Comment option (25 May 2011) How to survive in psychological research (13 Jul 2011) Your Twitter Profile: The Importance of Not Being Earnest (19 Nov 2011) 2011 Orwellian Prize for Journalistic Misrepresentation (29 Jan 2012) The ultimate email auto-response (12 Apr 2012) Well, this should be easy…. (21 May 2012) The bewildering bathroom challenge (19 Jul 2012) Are Starbucks hiding their profits on the planet Vulcan? (15 Nov 2012) Forget the Tower of Hanoi (11 Apr 2013) How do you communicate with a communications company? (30 Mar 2014) Noah: A film review from 32,000 ft (28 July 2014) The rationalist spa (11 Sep 2015) Talking about tax: weasel words (19 Apr 2016)

Saturday, 1 October 2016

On the incomprehensibility of much neurogenetics research


Together with some colleagues, I am carrying out an analysis of methodological issues such as statistical power in papers in top neuroscience journals. Our focus is on papers that compare brain and/or behaviour measures in people who vary on common genetic variants.

I'm learning a lot by being forced to read research outside my area, but I'm struck by how difficult many of these papers are to follow. I'm neither a statistician nor a geneticist, but I have a nodding acquaintance with both disciplines, as well as with neuroscience, yet in many cases I find myself struggling to make sense of what researchers did and what they found. Some papers have taken hours of reading and re-reading just to get at the key information we are seeking for our analysis, i.e. the largest association that was reported.

This is worrying for the field, because the number of people competent to review such papers will be extremely small. Good editors will, of course, try to cover all bases by finding reviewers with complementary skill sets, but this can be hard, and people will be understandably reluctant to review a highly complex paper that contains a lot of material beyond their expertise.  I remember a top geneticist on Twitter a while ago lamenting that when reviewing papers they often had to just take the statistics on trust, because they had gone beyond the comprehension of all but a small set of people. The same is true, I suspect, for neuroscience. Put the two disciplines together and you have a big problem.

I'm not sure what the solution is. Making raw data available may help, in that it allows people to check analyses using more familiar methods, but that is very time-consuming and only for the most dedicated reviewer.

Do others agree we have a problem, or is it inevitable that as things get more complex the number of people who can understand scientific papers will contract to a very small set?

Saturday, 3 September 2016

Some thoughts on the Statcheck project



Yesterday, a piece in Retraction Watch covered a new study, in which results of automated statistics checks on 50,000 psychology papers are to be made public on the PubPeer website.
I had advance warning, because a study of mine had been included in what was presumably a dry run, and this led to me receiving an email on 26th August telling me that the paper had been discussed on PubPeer.
Assuming someone had a critical comment on the paper, I duly clicked on the link, and had a moment of double-take when I read the comment: the automated check had found zero errors in the paper's statistics.
Now, this seemed like overkill to me, and I posted a rather grumpy tweet about it. There was a bit of to and fro on Twitter with Chris Hartgerink, one of the researchers on the Statcheck project, and with the folks at PubPeer, where I explained why I was grumpy and they defended their approach; as far as I was concerned it was not a big deal, and if nobody else found this odd, I was prepared to let it go.
But then a couple of journalists got interested, and I sent them some more detailed thoughts.
I was quoted in the Retraction Watch piece, but I thought it worth reporting my response in full here, because the quotes could be interpreted as indicating I disapprove of the Statcheck project and am defensive about errors in my work. Neither of those is true. I think the project is an interesting piece of work; my concern is solely with the way in which feedback to authors is being implemented. So here is the email I sent to journalists in full:
I am in general a strong supporter of the reproducibility movement and I agree it could be useful to document the extent to which the existing psychology literature contains statistical errors.
However, I think there are 2 problems with how this is being done in the PubPeer study.
1. The tone of the PubPeer comments will, I suspect, alienate many people. As I argued on Twitter, I found it irritating to get an email saying a paper of mine had been discussed on PubPeer, only to find that this referred to a comment stating that zero errors had been found in the statistics of that paper.
I don't think we need to be told that - by all means report somewhere a list of the papers that were checked and found to be error-free, but you don't need to personally contact all the authors and clog up PubPeer with comments of this kind.
My main concern was that during an exceptionally busy period, this was just another distraction from other things. Chris Hartgerink replied that I was free to ignore the email, but that would be extremely rash because a comment on PubPeer usually means that someone has a criticism of your paper.
As someone who works on language, I also found the pragmatics of the communication non-optimal. If you write and tell someone that you've found zero errors in their paper, the implication is that this is surprising, because you don't go around stating the obvious*. And indeed, the final part of the comment basically said that my work might well contain errors even though none had been found, and that it could not be taken on trust.
Now, at the same time as having that reaction, I appreciate that this was a computer-generated message, written by non-native English speakers, that I should not take it personally, and that no slur on my work was intended. And I would like to know if errors were found in my stats, and it is entirely possible that there are some, since none of us is perfect. So I don't want to over-react, but I think that if I, as someone basically sympathetic to this agenda, was irritated by the style of the communication, then the odds are it will stoke real hostility among those who are already dubious about what has been termed 'bullying' and so on by people interested in reproducibility.
2. I'll be interested to see how this pans out for people where errors are found.
My personal view is that the focus should be on errors that do change the conclusions of the paper.
I think at least a sample of these should be hand-checked so we have some idea of the error rate - I'm not sure if this has been done, but the PubPeer comment certainly gave no indication of that - it just basically said there's probably an error in your stats but we can't guarantee that there is, putting the onus on the author to then check it out.
If it's known that on 99% of occasions the automated check is accurate, then fine. If the accuracy is only 90% I'd be really unhappy about the current process as it would be leading to lots of people putting time into checking their papers on the basis of an insufficiently sensitive diagnostic. It would make the authors of the comments look frankly lazy in stirring up doubts about someone's work and then leaving them to check it out.
In epidemiology the terms sensitivity and specificity are used to refer to the accuracy of a diagnostic test. Minimally if the sensitivity and specificity of the automated stats check is known, then those figures should be provided with the automated message.
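To make those terms concrete for readers who don't work in epidemiology, here is a toy calculation in R with entirely invented numbers (nothing here comes from the Statcheck project itself): given a hand-checked sample of papers, the sensitivity and specificity of the automated check could be estimated from the resulting counts.

```r
# Toy numbers, invented for illustration (not data from the Statcheck project):
# suppose a hand-checked sample of papers gave these results for the automated check.
true_positives  <- 90    # tool flagged an error; hand-checking confirmed one
false_negatives <- 10    # tool missed a genuine error
true_negatives  <- 880   # tool reported no error; hand-checking agreed
false_positives <- 20    # tool flagged an error that was not really there

sensitivity <- true_positives / (true_positives + false_negatives)   # 0.9
specificity <- true_negatives / (true_negatives + false_positives)   # about 0.98
c(sensitivity = sensitivity, specificity = specificity)
```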

The email above was written before Dalmeet drew my attention to the second paper, in which errors had been found. Here's how I responded to that:

I hadn't seen the 2nd paper - presumably because I was not the corresponding author on that one. It's immediately apparent that the problem is that F ratios have been reported with one degree of freedom, when there should be two. In fact, it's not clear how the automated program could assign any p-value in this situation.

I'll communicate with the first author, Thalia Eley, about this, as it does need fixing for the scientific record, but, given the sample size (on which the second, missing, degree of freedom is based), the reported p-values would appear to be accurate.
  I have added a comment to this effect on the PubPeer site.
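For readers less familiar with the statistics, here is a minimal sketch in R of the point about degrees of freedom, using made-up values rather than anything from the paper in question: the p-value for an F ratio depends on both degrees of freedom, and the missing denominator degrees of freedom can be recovered from the sample size.

```r
# Hypothetical values, not taken from the paper discussed above.
# An F ratio reported as F(1) = 9.6 is missing its denominator degrees of freedom;
# in a simple two-group design with N = 200, that would be df2 = N - 2 = 198.
f_value <- 9.6
df1 <- 1       # numerator df, as reported
df2 <- 198     # denominator df, recoverable from the sample size

pf(f_value, df1, df2, lower.tail = FALSE)   # roughly 0.002 with these made-up numbers
```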


* I was thinking here of Gricean maxims, especially the maxim of relation.

Thursday, 1 September 2016

Why I still use Excel



The Microsoft application Excel was in the news for all the wrong reasons last week. A paper in Genome Biology documented how numerous scientific papers had errors in their data because they had used default settings in Excel, which had unhelpfully converted gene names to dates or floating-point numbers. It was hard to spot because it didn't happen to all gene names, but, for instance, the gene Septin 2, with acronym SEPT2, would be turned into 2006/09/02. This is not new: this paper in 2004 documented the problem, but it seems many people weren't aware of it, and it is now estimated that the genetics literature is riddled with errors as a consequence.
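Once a file has been through Excel, the damage can at least be screened for downstream. Here is a minimal sketch in R, with a hypothetical file name and column name, that reads every column as text (so nothing gets reinterpreted on import) and flags values that look like the result of a gene-name-to-date conversion.

```r
# Hypothetical file and column names, for illustration only.
genes <- read.csv("gene_table.csv", colClasses = "character")

# Entries such as "2006/09/02" or "2-Sep" suggest a symbol like SEPT2 was
# converted to a date somewhere upstream.
looks_like_date <- grepl("^[0-9]{4}/[0-9]{2}/[0-9]{2}$|^[0-9]{1,2}-[A-Za-z]{3}$",
                         genes$gene_symbol)
genes$gene_symbol[looks_like_date]   # list the suspect entries for checking
```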
This isn't the only way Excel can mess up your data. If you want to enter a date, you need to be very careful to ensure you have the correct setting. If you are in the UK and you enter a date like 23/4/16, then it will be correctly entered as 23rd April, regardless of the setting. But if you enter 12/4/16, it will be treated as 4th December if you are on US settings and as 12th April if you are on UK settings.
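The same ambiguity is easy to demonstrate outside Excel. A small sketch in R, with made-up date strings, shows how the declared format determines what 12/4/16 means, and why an unambiguous format such as ISO 8601 (yyyy-mm-dd) is the safer way to store dates.

```r
# The same string gives two different dates depending on the declared format:
as.Date("12/4/16", format = "%d/%m/%y")   # 2016-04-12 (UK reading: 12th April)
as.Date("12/4/16", format = "%m/%d/%y")   # 2016-12-04 (US reading: 4th December)

# Storing dates as ISO 8601 (yyyy-mm-dd) removes the guesswork entirely:
as.Date("2016-04-12")
```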
Then there is the dreaded autocomplete function. This can really screw things up by assuming that if you start typing text into a cell, you want it to be the same as a previous entry in that column that begins with the same sequence of letters. It can be a boon and a time-saver in some circumstances, but a way to introduce major errors in others.
I've also experienced odd bugs in Excel's autofill function, which makes it easy to copy a formula across columns or rows. It's possible for a file to become corrupted so that the cells referenced in the formula are wrong. Such errors are also often introduced by users, but I've experienced corrupted files containing formulae, which is pretty scary.
The response to this by many people is to say serious scientists shouldn't use Excel.  It's just too risky having software that can actively introduce errors into your data entry or computations. But people, including me, persist in using it, and we have to consider why.
So what are the advantages of keeping going with Excel?
Well, first, it usually comes with Microsoft Office on a new computer, so it is widely available free of charge*. This means most people will have some familiarity with it – though few bother to learn how to use it properly.
Second, you can scan a whole dataset easily: it's very direct scrolling through rows or columns. You can use Freeze Panes to keep column and row headers static, and you can hide columns or rows that you don't want getting in the way.
Third, you can format a worksheet to facilitate data entry. A lot of people dismiss colour coding of columns as prettification, but it can help ensure you keep the right data in the right place. Data validation is easily added and can ensure that only valid values are entered.
Fourth, you can add textual comments – either as a row in their own right, or using the Comment function.
Fifth, you can very easily plot data. Better still, you can do so dynamically, as it is easy to create a plot and then change the data range it refers to.
Sixth, you can use lookup functions. In my line of work we need to convert raw scores to standard scores based on normative data. This is typically done using tables of numbers in a manual, which makes it very easy to introduce human error. I have found it is worth investing time to get the large table of numbers entered as a separate worksheet, so we can then automate the lookup functions.
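The same lookup logic is easy to reproduce outside Excel when cross-checking conversions. Here is a minimal sketch in R with an invented normative table; the real tables are, of course, much larger.

```r
# Invented normative table, for illustration only: raw scores 0-5 mapped to
# standard scores.
norms <- data.frame(raw      = 0:5,
                    standard = c(55, 70, 85, 100, 115, 130))

raw_scores      <- c(2, 5, 0, 3)                               # hypothetical data
standard_scores <- norms$standard[match(raw_scores, norms$raw)]
standard_scores                                                # 85 130 55 100
```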
Many of my datasets are generated slowly over a period of years: we gather large amounts of data on individuals, record responses on paper, and then enter the data as it comes in. The people doing the data entry are mostly research assistants who are relatively inexperienced. So having a very transparent method of data entry, which can include clear instructions on the worksheet, and data validation, is important. I'm not sure there is other software that would suit my needs as well.
But I'm concerned about errors and need strategies to avoid them. So here are the working rules I have developed so far.
1. Before you begin, turn off any fancy Excel defaults you don't need. And if entering gene names, ensure they are entered as text.
2. Double data entry is crucial: have the data re-entered from scratch when the whole dataset is in, and cross-check the two data files (a sketch of such a cross-check appears after this list). This costs money but is important for data quality. There are always errors.
3. Once you have the key data entered and checked, export it to a simple, robust format such as tab-separated text. It can then be read and re-used by people working with other packages.
4. The main analysis should be done using software that records the analysis as a script, so that the whole analysis can be reproduced. Excel is therefore not suitable. I increasingly use R, though SPSS is another option, provided you keep a syntax file.
5. I still like to cross-check analyses using Excel – even if it is just to do a quick plot to ensure that the pattern of results is consistent with an analysis done in R.  
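As promised in rule 2, here is a minimal sketch in R of a double-entry cross-check, with hypothetical file names and an assumed participant identifier column called id; it lists every cell where the two versions disagree, so the discrepancies can be resolved against the paper records.

```r
# Hypothetical file and column names, for illustration only.
entry1 <- read.delim("dataset_entry1.txt", colClasses = "character")
entry2 <- read.delim("dataset_entry2.txt", colClasses = "character")

# Align both versions on participant ID, then compare cell by cell.
m1 <- as.matrix(entry1[order(entry1$id), ])
m2 <- as.matrix(entry2[order(entry2$id), ])

mismatches <- which(m1 != m2, arr.ind = TRUE)
data.frame(id     = m1[mismatches[, "row"], "id"],
           column = colnames(m1)[mismatches[, "col"]],
           first  = m1[mismatches],
           second = m2[mismatches])   # every disagreement to resolve by hand
```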
Now, I am not an expert data scientist – far from it. I'm just someone who has been analysing data for many years and learned a few things along the way. Like most people, I tend to stick with what I know, as there are costs in mastering new skills, but I will change if I can see benefits. I've become convinced that R is the way to go for data analysis, but I do think Excel still has its uses, as a complement to other methods for storing, checking and analysing data. But, given the recent crisis in genetics, I'd be interested to hear what others think about optimal, affordable approaches to data entry and data analysis – with or without Excel.

*P.S.  I have been corrected on Twitter by people who have told me it is NOT free; the price for Microsoft products may be bundled in with the cost of the machine, but someone somewhere is paying for it!

Update: 2nd September 2016
There was a surprising amount of support for this post on Twitter, mixed in with anticipated criticism from those who just told me Excel is rubbish. What's interesting is that very few of the latter group could suggest a usable alternative for data entry (and some had clearly not read my post and thought I was advocating using Excel for data analysis). And yes, I don't regard Access as a usable alternative: been there, tried that, and it just induced a lot of swearing.
There was, however, one suggestion that looks very promising and which I will chase up: @stephenelane suggested I look at REDCap.
Website here: https://projectredcap.org/

Meanwhile, here's a very useful link, which came in via @tjmahr on Twitter, on setting up Excel worksheets to avoid later problems:
http://kbroman.org/dataorg/

Update 4th October 2016
Just to say we have trialled REDCap, and I love it. Very friendly interface. It is extremely limited for any data manipulation/computation, but that doesn't matter, as you can readily import/export information into other applications for processing. It's free, but your institution needs to be signed up for it: Oxford is not yet fully functional with it, but we were able to use it via a colleague's server for a pilot.