Type I and Type II error concerns in fMRI research: re-balancing the scale

MD Lieberman, WA Cunningham - Social cognitive and affective neuroscience, 2009 - academic.oup.com

Abstract

Statistical thresholding (i.e. P-values) in fMRI research has become increasingly conservative over the past decade in an attempt to diminish Type I errors (i.e. false alarms) to a level traditionally allowed in behavioral science research. In this article, we examine the unintended negative consequences of this single-minded devotion to Type I errors: increased Type II errors (i.e. missing true effects), a bias toward studying large rather than small effects, a bias toward observing sensory and motor processes rather than complex cognitive and affective processes, and deficient meta-analyses. Power analyses indicate that the reductions in acceptable P-values over time are producing dramatic increases in the Type II error rate. Moreover, the push for a mapwide false discovery rate (FDR) of 0.05 is based on the assumption that this is the FDR in most behavioral research; however, this is an inaccurate assessment of the conventions in actual behavioral research. We report simulations demonstrating that combined intensity and cluster size thresholds such as P < 0.005 with a 10 voxel extent produce a desirable balance between Type I and Type II error rates. This joint threshold produces high but acceptable Type II error rates and produces an FDR that is comparable to the effective FDR in typical behavioral science articles (while a 20 voxel extent threshold produces an actual FDR of 0.05 with relatively common imaging parameters). We recommend a greater focus on replication and meta-analysis rather than emphasizing single studies as the unit of analysis for establishing scientific truth. From this perspective, Type I errors are self-erasing because they will not replicate, thus allowing for more lenient thresholding to avoid Type II errors.
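
The trade-off the abstract describes, that tightening the P-value threshold raises the Type II error rate, can be illustrated with a minimal sketch. This is not the authors' power analysis or simulation: it uses a one-sided, one-sample z-test under a normal approximation, and the effect size (0.5) and sample size (20) are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def type2_error(alpha, effect_size, n):
    """Type II error rate of a one-sided, one-sample z-test.

    Normal approximation: the test statistic is N(effect_size * sqrt(n), 1)
    when the effect is real, and the test rejects above the (1 - alpha)
    quantile of the standard normal distribution.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # Probability that the statistic falls below the critical value,
    # i.e. that a true effect is missed.
    return std_normal.cdf(z_crit - effect_size * n ** 0.5)


# Hypothetical values, not taken from the article.
EFFECT_SIZE = 0.5   # standardized effect (Cohen's d)
N_SUBJECTS = 20     # number of subjects

for alpha in (0.05, 0.005, 0.001, 0.0001):
    beta = type2_error(alpha, EFFECT_SIZE, N_SUBJECTS)
    print(f"alpha = {alpha:<7} Type II error rate = {beta:.2f}")
```

With the effect size and sample size held fixed, each stricter threshold yields a larger Type II error rate, which is the qualitative pattern the abstract attributes to increasingly conservative thresholding. The sketch ignores cluster extent and spatial correlation, both of which matter for the joint thresholds the article evaluates.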

Oxford University Press


Cited by 1437
