Henninger, Mirka; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023

To detect differential item functioning (DIF), Rasch trees search for optimal split-points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF…

Descriptors: Item Response Theory, Test Items, Effect Size, Statistical Significance

Elliott, Mark; Buttery, Paula – Educational and Psychological Measurement, 2022

We investigate two non-iterative estimation procedures for Rasch models, the pair-wise estimation procedure (PAIR) and the Eigenvector method (EVM), and identify theoretical issues with EVM for rating scale model (RSM) threshold estimation. We develop a new procedure to resolve these issues--the conditional pairwise adjacent thresholds procedure…

Descriptors: Item Response Theory, Rating Scales, Computation, Simulation

Haig, Brian D. – Educational and Psychological Measurement, 2017

This article considers the nature and place of tests of statistical significance (ToSS) in science, with particular reference to psychology. Despite the enormous amount of attention given to this topic, psychology's understanding of ToSS remains deficient. The major problem stems from a widespread and uncritical acceptance of null hypothesis…

Descriptors: Statistical Significance, Statistical Analysis, Hypothesis Testing, Psychology

Walker, Cindy M.; Gocer Sahin, Sakine – Educational and Psychological Measurement, 2017

The theoretical reason for the presence of differential item functioning (DIF) is that data are multidimensional and two groups of examinees differ in their underlying ability distribution for the secondary dimension(s). Therefore, the purpose of this study was to determine how much the secondary ability distributions must differ before DIF is…

Descriptors: Item Response Theory, Test Bias, Correlation, Statistical Significance

García-Pérez, Miguel A. – Educational and Psychological Measurement, 2017

Null hypothesis significance testing (NHST) has been the subject of debate for decades and alternative approaches to data analysis have been proposed. This article addresses this debate from the perspective of scientific inquiry and inference. Inference is an inverse problem and application of statistical methods cannot reveal whether effects…

Descriptors: Hypothesis Testing, Statistical Inference, Effect Size, Bayesian Statistics

Campitelli, Guillermo; Macbeth, Guillermo; Ospina, Raydonal; Marmolejo-Ramos, Fernando – Educational and Psychological Measurement, 2017

We present three strategies to replace the null hypothesis statistical significance testing approach in psychological research: (1) visual representation of cognitive processes and predictions, (2) visual representation of data distributions and choice of the appropriate distribution for analysis, and (3) model comparison. The three strategies…

Descriptors: Research Methodology, Hypothesis Testing, Psychology, Social Science Research

Gwet, Kilem L. – Educational and Psychological Measurement, 2016

This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…

Descriptors: Differences, Correlation, Statistical Significance, Statistical Analysis

Raykov, Tenko; Marcoulides, George A.; Tong, Bing – Educational and Psychological Measurement, 2016

A latent variable modeling procedure is discussed that can be used to test if two or more homogeneous multicomponent instruments with distinct components are measuring the same underlying construct. The method is widely applicable in scale construction and development research and can also be of special interest in construct validation studies.…

Descriptors: Models, Statistical Analysis, Measurement Techniques, Factor Analysis

Leth-Steensen, Craig; Gallitto, Elena – Educational and Psychological Measurement, 2016

A large number of approaches have been proposed for estimating and testing the significance of indirect effects in mediation models. In this study, four sets of Monte Carlo simulations involving full latent variable structural equation models were run in order to contrast the effectiveness of the currently popular bias-corrected bootstrapping…

Descriptors: Mediation Theory, Structural Equation Models, Monte Carlo Methods, Simulation

Wollack, James A.; Cohen, Allan S.; Eckerly, Carol A. – Educational and Psychological Measurement, 2015

Test tampering, especially on tests for educational accountability, is an unfortunate reality, necessitating that the state (or its testing vendor) perform data forensic analyses, such as erasure analyses, to look for signs of possible malfeasance. Few statistical approaches exist for detecting fraudulent erasures, and those that do largely do not…

Descriptors: Tests, Cheating, Item Response Theory, Accountability

Raykov, Tenko; Marcoulides, George A.; Millsap, Roger E. – Educational and Psychological Measurement, 2013

A multiple testing method for examining factorial invariance for latent constructs evaluated by multiple indicators in distinct populations is outlined. The procedure is based on the false discovery rate concept and multiple individual restriction tests and resolves general limitations of a popular factorial invariance testing approach. The…

Descriptors: Testing, Statistical Analysis, Factor Analysis, Statistical Significance

Hoekstra, Rink; Johnson, Addie; Kiers, Henk A. L. – Educational and Psychological Measurement, 2012

The use of confidence intervals (CIs) as an addition or as an alternative to null hypothesis significance testing (NHST) has been promoted as a means to make researchers more aware of the uncertainty that is inherent in statistical inference. Little is known, however, about whether presenting results via CIs affects how readers judge the…

Descriptors: Computation, Statistical Analysis, Hypothesis Testing, Statistical Significance

Carvajal, Jorge; Skorupski, William P. – Educational and Psychological Measurement, 2010

This study is an evaluation of the behavior of the Liu-Agresti estimator of the cumulative common odds ratio when identifying differential item functioning (DIF) with polytomously scored test items using small samples. The Liu-Agresti estimator has been proposed by Penfield and Algina as a promising approach for the study of polytomous DIF but no…

Descriptors: Test Bias, Sample Size, Test Items, Computation

Kim, Eun Sook; Willson, Victor L. – Educational and Psychological Measurement, 2010

This article presents a method to evaluate pretest effects on posttest scores in the absence of an un-pretested control group using published results of pretesting effects due to Willson and Putnam. Confidence intervals around the expected theoretical gain due to pretesting are computed, and observed gains or differential gains are compared with…

Descriptors: Control Groups, Intervals, Educational Research, Educational Psychology

Alhija, Fadia Nasser-Abu; Levy, Adi – Educational and Psychological Measurement, 2009

Effect size (ES) reporting practices in a sample of 10 educational research journals are examined in this study. Five of these journals explicitly require reporting ES and the other 5 have no such policy. Data were obtained from 99 articles published in the years 2003 and 2004, in which 183 statistical analyses were conducted. Findings indicate no…

Descriptors: Effect Size, Periodicals, Educational Research, Policy