How Often Is p[subscript rep] Close to the True Replication Probability?

Trafimow, David; MacDonald, Justin A.; Rice, Stephen; Clason, Dennis L.

Psychological Methods, v15 n3 p300-307 Sep 2010

Largely because of dissatisfaction with the standard null hypothesis significance testing procedure, researchers have begun to consider alternatives. For example, Killeen (2005a) has argued that researchers should calculate p[subscript rep], which is purported to indicate the probability that, if the experiment in question were replicated, the obtained finding would be in the same direction as the original finding. However, Killeen also seems to indicate that, rather than being the probability of replication, p[subscript rep] is actually the probability of obtaining a finding in which the experimental group mean exceeds the control group mean. Our goal was to determine the relative frequency with which obtained p[subscript rep] statistics are close to true replication probabilities. Regardless of which way p[subscript rep] is defined, our simulations show that it is unlikely to be close to the true value unless both the population effect magnitude and the sample size are uncommonly large. The definitional problem, in combination with the inaccuracy under either interpretation, constitutes an important challenge for those who espouse the routine computation of p[subscript rep] statistics. (Contains 4 figures and 5 footnotes.)

Descriptors: Probability, Replication (Evaluation), Statistics, Comparative Analysis, Statistical Bias, Effect Size, Sample Size

