ERIC Number: ED408324
Record Type: Non-Journal
Publication Date: 1997-Mar
Reference Count: N/A
Detecting Rater Effects with a Multi-Faceted Rating Scale Model.
Wolfe, Edward W.; Chiu, Chris W. T.
This paper discusses how common patterns of rater error may be detected in a large-scale performance assessment setting. Common rater effects are identified, and a scaling method that can be used to detect them in operational data sets is presented. Simulated data sets are generated to exhibit each of these rater effects. The three continua that depict the most commonly cited rater effects are: (1) accuracy/randomness; (2) harshness/leniency; and (3) centrality/extremism. Rasch measurement theory provides one way of examining these rater effects within a normative framework. Rasch measurement places each facet of the measurement context on a common underlying linear scale, yielding measures that can be subjected to traditional statistical analyses while allowing unambiguous substantive interpretation of examinee performance as it relates to rater performance and task functioning. In addition, Rasch calibrations of examinees, tasks, and raters are sample-free in that they remove the influence of sample variability. The Multi-Faceted Rating Scale Model (MFRSM) of J. M. Linacre (1989) was applied to simulated data sets illustrating these rater effects. Rater effects could be detected in the normative framework through the MFRSM, and these effects appeared to operate along several continua. Further research is needed to determine how large a rater's departure from the pool must be before it can be detected in a normative framework. (Contains 5 figures, 9 tables, and 13 references.) (SLD)
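The harshness/leniency continuum described in the abstract can be illustrated with a minimal simulation sketch, assuming a rating scale model of the general Andrich/Linacre form log(P_k / P_{k-1}) = B_n - C_j - F_k, with examinee ability B_n, rater severity C_j, and category threshold F_k. This is not the paper's actual simulation design; all function names, parameter values, and the choice of thresholds below are illustrative assumptions.

```python
import math
import random

def rating_probs(theta, severity, thresholds):
    """Category probabilities under an assumed rating scale model:
    log(P_k / P_{k-1}) = theta - severity - tau_k."""
    # Cumulative logits for categories 0..K (category 0 anchored at 0).
    logits = [0.0]
    total = 0.0
    for tau in thresholds:
        total += theta - severity - tau
        logits.append(total)
    # Softmax with max-subtraction for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def simulate_rating(theta, severity, thresholds, rng):
    """Draw one observed rating category from the model probabilities."""
    probs = rating_probs(theta, severity, thresholds)
    r = rng.random()
    cum = 0.0
    for k, p in enumerate(probs):
        cum += p
        if r < cum:
            return k
    return len(probs) - 1

rng = random.Random(42)
thresholds = [-1.0, 0.0, 1.0]          # three thresholds -> four categories (0..3)
examinees = [rng.gauss(0.0, 1.0) for _ in range(500)]

# A harsh rater (severity +1 logit) should award lower average ratings
# than a lenient rater (severity -1 logit) on the same examinee sample.
mean_harsh = sum(simulate_rating(t, +1.0, thresholds, rng) for t in examinees) / len(examinees)
mean_lenient = sum(simulate_rating(t, -1.0, thresholds, rng) for t in examinees) / len(examinees)
```

Centrality/extremism could be sketched in the same framework by shrinking or stretching the spread of the thresholds for a given rater, and accuracy/randomness by adding noise to each rating draw; the MFRSM analysis in the paper works in the reverse direction, estimating severity and threshold parameters from observed ratings.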
Publication Type: Reports - Evaluative; Speeches/Meeting Papers
Education Level: N/A
Authoring Institution: N/A