50 Years of ERIC
The Education Resources Information Center (ERIC) is celebrating its 50th birthday! First opened on May 15, 1964, ERIC continues a long tradition of ongoing innovation and enhancement.

Learn more about the history of ERIC here (PDF).

Showing 166 to 180 of 1,152 results
Peer reviewed
Direct link
Korobko, Oksana B.; Glas, Cees A. W.; Bosker, Roel J.; Luyten, Johan W. – Journal of Educational Measurement, 2008
Methods are presented for comparing grades obtained in a situation where students can choose between different subjects. The comparison between grades is expected to be complicated by the interaction between the students' pattern and level of proficiency on the one hand, and the choice of subjects on the other. Three methods…
Descriptors: Item Response Theory, Test Items, Comparative Analysis, Grades (Scholastic)
Peer reviewed
Direct link
van der Ark, L. Andries; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational Measurement, 2008
Two types of answer-copying statistics for detecting copiers in small-scale examinations are proposed. One statistic identifies the "copier-source" pair, and the other in addition suggests who is copier and who is source. Both types of statistics can be used when the examination has alternate test forms. A simulation study shows that the…
Descriptors: Cheating, Statistics, Test Format, Measures (Individuals)
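The raw signal behind such copier-source statistics is the number of identical wrong answers shared by a pair of examinees. A minimal sketch of that count (not the authors' proposed statistics, whose null distributions handle small samples and alternate test forms):

```python
import numpy as np

def matching_wrong_answers(resp_a, resp_b, key):
    """Count items where two examinees chose the same wrong option.

    resp_a, resp_b: chosen options for two examinees; key: correct options.
    Identical wrong answers are the classic raw signal used by
    answer-copying indices; operational statistics standardize this
    count against its distribution under independent responding.
    """
    a, b, k = map(np.asarray, (resp_a, resp_b, key))
    return int(np.sum((a == b) & (a != k)))
```

Matching *correct* answers are deliberately excluded, since able examinees agree on correct options without copying.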
Peer reviewed
Direct link
de La Torre, Jimmy; Deng, Weiling – Journal of Educational Measurement, 2008
The standardized log-likelihood of a response vector (l[subscript z]) is a popular IRT-based person-fit test statistic for identifying model-misfitting response patterns. Traditional use of l[subscript z] is overly conservative in detecting aberrance due to its incorrect assumption regarding its theoretical null distribution. This study proposes a…
Descriptors: Goodness of Fit, Measures (Individuals), Test Reliability, Responses
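The traditional l_z statistic the study builds on can be sketched as follows. This is the standard textbook form: the probabilities p would come from a fitted IRT model, and the study's point is precisely that referring this value to a standard normal null distribution is too conservative.

```python
import numpy as np

def lz_person_fit(u, p):
    """Standardized log-likelihood person-fit statistic l_z.

    u: 0/1 response vector for one examinee;
    p: model-implied success probabilities from a fitted IRT model.
    l_z standardizes the response log-likelihood by its mean and
    variance under the model; large negative values flag aberrance.
    """
    u = np.asarray(u, float)
    p = np.asarray(p, float)
    l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))
    mean = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    var = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (l0 - mean) / np.sqrt(var)
```

A response pattern consistent with the model gives l_z near or above zero; unexpectedly many errors on easy items (or successes on hard ones) drive it negative.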
Peer reviewed
Direct link
Betebenner, Damian W.; Shang, Yi; Xiang, Yun; Zhao, Yan; Yue, Xiaohui – Journal of Educational Measurement, 2008
No Child Left Behind (NCLB) performance mandates, embedded within state accountability systems, focus school AYP (adequate yearly progress) compliance squarely on the percentage of students at or above proficient. The singular importance of this quantity for decision-making purposes has initiated extensive research into percent proficient as a…
Descriptors: Classification, Error of Measurement, Statistics, Reliability
Peer reviewed
Direct link
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
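The lognormal response-time model behind this kind of analysis decomposes log time into an item time-intensity parameter and a person speed parameter, log T_ij = beta_i - tau_j + error. A minimal moment-based sketch (the function name and the sum-to-zero identification of speed are illustrative assumptions, not the article's estimation method):

```python
import numpy as np

def lognormal_rt_moments(log_times):
    """Moment estimates for a lognormal response-time model,
    log T_ij = beta_i - tau_j + error, with person speeds assumed
    to average zero for identification.

    log_times: (persons, items) array of log response times.
    Returns (time intensity beta per item, speed tau per person).
    """
    beta = log_times.mean(axis=0)          # item time intensities
    tau = (beta - log_times).mean(axis=1)  # person speeds
    return beta, tau
```

Comparing the estimated beta values across subtests is what reveals whether some routes through a multistage test are more time intensive than others.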
Peer reviewed
Direct link
Kim, Seock-Ho; Cohen, Allan S.; Alagoz, Cigdem; Kim, Sukwoo – Journal of Educational Measurement, 2007
Data from a large-scale performance assessment (N = 105,731) were analyzed with five differential item functioning (DIF) detection methods for polytomous items to examine the congruence among the DIF detection methods. Two different versions of the item response theory (IRT) model-based likelihood ratio test, the logistic regression likelihood…
Descriptors: Performance Based Assessment, Performance Tests, Item Response Theory, Test Bias
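The logistic-regression likelihood-ratio approach among the compared methods can be sketched for the dichotomous case (the article treats polytomous items; this simplified version conditions on total score and tests the group term):

```python
import numpy as np

def logistic_fit(X, y, iters=50):
    """Plain Newton-Raphson logistic regression.
    Returns coefficients and the maximized log-likelihood."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        w = p * (1 - p)
        b = b + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ b))
    return b, np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def lr_dif_statistic(total, group, item):
    """Likelihood-ratio DIF test for one dichotomous item: compare a
    model with total score only against one adding group membership;
    2 * (ll1 - ll0) is referred to chi-square with 1 df."""
    n = len(item)
    X0 = np.column_stack([np.ones(n), total])
    X1 = np.column_stack([np.ones(n), total, group])
    _, ll0 = logistic_fit(X0, item)
    _, ll1 = logistic_fit(X1, item)
    return 2.0 * (ll1 - ll0)
```

A large statistic means group membership predicts item success beyond overall proficiency, the operational definition of DIF.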
Peer reviewed
Direct link
Briggs, Derek C.; Wilson, Mark – Journal of Educational Measurement, 2007
An approach called generalizability in item response modeling (GIRM) is introduced in this article. The GIRM approach essentially incorporates the sampling model of generalizability theory (GT) into the scaling model of item response theory (IRT) by making distributional assumptions about the relevant measurement facets. By specifying a random…
Descriptors: Markov Processes, Generalizability Theory, Item Response Theory, Computation
Peer reviewed
Direct link
Moses, Tim; Yang, Wen-Ling; Wilson, Christine – Journal of Educational Measurement, 2007
This study explored the use of kernel equating for integrating and extending two procedures proposed for assessing item order effects in test forms that have been administered to randomly equivalent groups. When these procedures are used together, they can provide complementary information about the extent to which item order effects impact test…
Descriptors: Advanced Placement, Equated Scores, Test Items, Item Analysis
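Kernel equating's first step replaces each discrete score distribution with a Gaussian-smoothed continuous one; a simplified sketch of that continuization (the operational method also rescales so the smoothed distribution preserves the discrete mean and variance, and the bandwidth h here is illustrative):

```python
from math import erf, sqrt

def kernel_cdf(x, scores, probs, h=0.6):
    """Gaussian-kernel continuization of a discrete score distribution:
    F(x) = sum_j p_j * Phi((x - s_j) / h).

    scores: possible score points; probs: their probabilities.
    The resulting continuous, strictly increasing CDF is what makes
    equipercentile equating well defined at every score level.
    """
    phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
    return sum(p * phi((x - s) / h) for s, p in zip(scores, probs))
```

Equating then maps a score x on one form through this CDF and back through the inverse CDF of the other form.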
Peer reviewed
Direct link
Jang, Eunice Eunhee; Roussos, Louis – Journal of Educational Measurement, 2007
This article reports two studies to illustrate methodologies for conducting a conditional covariance-based nonparametric dimensionality assessment using data from two forms of the Test of English as a Foreign Language (TOEFL). Study 1 illustrates how to assess overall dimensionality of the TOEFL including all three subtests. Study 2 is aimed at…
Descriptors: Reading Comprehension, Nonparametric Statistics, Listening Comprehension, Hypothesis Testing
Peer reviewed
Direct link
Prowker, Adam; Camilli, Gregory – Journal of Educational Measurement, 2007
The central idea of differential item functioning (DIF) is to examine differences between two groups at the item level while controlling for overall proficiency. This approach is useful for examining hypotheses at a finer-grain level than are permitted by a total test score. The methodology proposed in this paper is also aimed at estimating…
Descriptors: Scores, Test Bias, Difficulty Level, Test Items
Peer reviewed
Direct link
Zwick, Rebecca; Greif Green, Jennifer – Journal of Educational Measurement, 2007
In studies of the SAT, correlations of SAT scores, high school grades, and socioeconomic factors (SES) are usually obtained using a university as the unit of analysis. This approach obscures an important structural aspect of the data: The high school grades received by a given institution come from a large number of high schools, all of which have…
Descriptors: Organizations (Groups), High School Students, Grades (Scholastic), Grading
Peer reviewed
Direct link
Lockwood, J. R.; McCaffrey, Daniel F.; Hamilton, Laura S.; Stecher, Brian; Le, Vi-Nhuan; Martinez, Jose Felipe – Journal of Educational Measurement, 2007
Using longitudinal data from a cohort of middle school students from a large school district, we estimate separate "value-added" teacher effects for two subscales of a mathematics assessment under a variety of statistical models varying in form and degree of control for student background characteristics. We find that the variation in estimated…
Descriptors: Mathematics Achievement, Academic Achievement, Middle School Students, School District Size
Peer reviewed
Direct link
Monahan, Patrick O.; Lee, Won-Chan; Ankenmann, Robert D. – Journal of Educational Measurement, 2007
A Monte Carlo simulation technique for generating dichotomous item scores is presented that implements (a) a psychometric model with different explicit assumptions than traditional parametric item response theory (IRT) models, and (b) item characteristic curves without restrictive assumptions concerning mathematical form. The four-parameter beta…
Descriptors: True Scores, Psychometrics, Monte Carlo Methods, Correlation
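The central idea, generating dichotomous scores from item characteristic curves of arbitrary shape, can be sketched as follows (the study's four-parameter beta true-score distribution and specific curves are not reproduced here; any callables can be plugged in):

```python
import numpy as np

def simulate_item_scores(true_scores, iccs, seed=None):
    """Monte Carlo generation of dichotomous item scores.

    true_scores: (N,) array of latent true scores in [0, 1];
    iccs: one callable per item mapping a true-score array to
    P(correct). No parametric form is imposed on the curves, which
    is the flexibility this simulation approach is built around.
    """
    rng = np.random.default_rng(seed)
    n, k = len(true_scores), len(iccs)
    scores = np.empty((n, k), dtype=int)
    for j, icc in enumerate(iccs):
        p = np.clip(icc(np.asarray(true_scores)), 0.0, 1.0)
        scores[:, j] = rng.random(n) < p  # Bernoulli draw per examinee
    return scores
```

Because each item's curve is just a function, step functions, splines, or empirically estimated curves work as easily as logistic ones.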
Peer reviewed
Direct link
Sinharay, Sandip; Holland, Paul W. – Journal of Educational Measurement, 2007
It is a widely held belief that anchor tests should be miniature versions (i.e., "minitests"), with respect to content and statistical characteristics, of the tests being equated. This article examines the foundations for this belief regarding statistical characteristics. It examines the requirement of statistical representativeness of anchor…
Descriptors: Test Items, Comparative Testing
Peer reviewed
Direct link
Petridou, Alexandra; Williams, Julian – Journal of Educational Measurement, 2007
Hypotheses about aberrant test-response behavior and hence invalid person-measurement have hitherto included factors like ability, gender, language, test-anxiety, and motivation, but these have not previously been collectively investigated with real data, or with multilevel models. This study analyzes the effect of these factors on person…
Descriptors: Data Analysis, Models, Students, English (Second Language)