Peer reviewed
ERIC Number: EJ1110924
Record Type: Journal
Publication Date: 2004-Apr
Pages: 55
Abstractor: As Provided
ISSN: EISSN-2330-8516
Model Diagnostics for Bayesian Networks. Research Report. ETS RR-04-17
Sinharay, Sandip
ETS Research Report Series, Apr 2004
Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and skills, are no exception. This paper employs the "posterior predictive model checking method" (Guttman, 1967; Rubin, 1984), a popular Bayesian model checking tool, to assess fit of simple Bayesian networks. A number of aspects of model fit, those of usual interest to practitioners, are assessed in this paper using various diagnostic tools. The first diagnostic used is direct data display--a visual comparison of the observed data set and a number of the posterior predictive data sets (that are predicted by the model). The second aspect examined here is item fit. Examinees are grouped into a number of equivalence classes, based on the generated values of their skill variables, and the observed and expected proportion correct scores on an item for the classes are combined to provide a χ²-type and a G²-type test statistic for each item. Another (similar) set of χ²-type and G²-type test statistics is obtained by grouping the examinees by their raw scores and then comparing their observed and expected proportion correct scores on an item. This paper also suggests how to obtain posterior predictive p-values, natural candidate p-values from a Bayesian viewpoint, for the χ²-type and G²-type test statistics. The paper further examines the association among the items, especially whether the model can explain the odds ratios corresponding to the responses to pairs of items.
Finally, in an effort to examine the issue of differential item functioning (DIF), this paper suggests a version of the Mantel-Haenszel statistic (Holland, 1985), which uses "matched groups" based on equivalence classes, as a discrepancy measure with posterior predictive model checking. Limited simulation studies and a real data application examine the effectiveness of the suggested model diagnostics.
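As an illustration of the item fit check described in the abstract, the following is a minimal Python sketch of a posterior predictive p-value for a χ²-type discrepancy computed over equivalence classes. It is not taken from the report: the function names (chi2_type, posterior_predictive_pvalue, demo), the Pearson-type form of the statistic, and the stand-in "posterior draws" are all assumptions made for illustration, and the report's exact definitions may differ.

```python
import numpy as np


def chi2_type(y, p, groups, n_groups):
    """Pearson-type discrepancy for one item (an assumed form, for illustration).

    y      : 0/1 responses, shape (n_examinees,)
    p      : model success probability per examinee, shape (n_examinees,)
    groups : equivalence-class index per examinee, shape (n_examinees,)
    """
    total = 0.0
    for k in range(n_groups):
        mask = groups == k
        n_k = int(mask.sum())
        if n_k == 0:
            continue  # an empty class contributes nothing
        observed = y[mask].mean()   # observed proportion correct in class k
        expected = p[mask].mean()   # expected proportion correct in class k
        total += n_k * (observed - expected) ** 2 / (expected * (1.0 - expected))
    return total


def posterior_predictive_pvalue(y_obs, class_draws, prob_draws, rng):
    """Share of posterior draws where the replicated discrepancy is at least the realized one.

    class_draws : class membership per draw, shape (n_draws, n_examinees)
    prob_draws  : success probability per class and draw, shape (n_draws, n_classes)
    """
    n_draws, n_classes = prob_draws.shape
    exceed = 0
    for m in range(n_draws):
        groups = class_draws[m]
        p = prob_draws[m, groups]      # probability for each examinee under draw m
        y_rep = rng.binomial(1, p)     # replicated responses predicted by the model
        d_obs = chi2_type(y_obs, p, groups, n_classes)
        d_rep = chi2_type(y_rep, p, groups, n_classes)
        exceed += int(d_rep >= d_obs)
    return exceed / n_draws


def demo(seed=0):
    """Toy run with one binary skill. The draws below are hypothetical stand-ins,
    not output from a fitted Bayesian network."""
    rng = np.random.default_rng(seed)
    n_examinees, n_draws = 200, 500
    true_class = rng.integers(0, 2, size=n_examinees)
    y_obs = rng.binomial(1, np.where(true_class == 1, 0.8, 0.3))
    class_draws = np.tile(true_class, (n_draws, 1))
    prob_draws = np.column_stack(
        [rng.beta(30, 70, n_draws), rng.beta(80, 20, n_draws)]
    )
    return posterior_predictive_pvalue(y_obs, class_draws, prob_draws, rng)
```

Calling demo() returns a value between 0 and 1. Under this convention, a value close to 0 means the observed discrepancy is larger than what the model typically predicts, which would point to misfit for that item. In a real application, class_draws and prob_draws would come from the fitted model's posterior sample rather than from the stand-ins used here.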
Educational Testing Service. Rosedale Road, MS19-R Princeton, NJ 08541. Tel: 609-921-9000; Fax: 609-734-5410.
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A