Showing 1 to 15 of 24 results
Peer reviewed
von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023
Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We compare the classification accuracy of convolutional and feed-forward approaches. Our…
Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education
Peer reviewed
Kim, Nana; Bolt, Daniel M. – Educational and Psychological Measurement, 2021
This paper presents a mixture item response tree (IRTree) model for extreme response style. Unlike traditional applications of single IRTree models, a mixture approach provides a way of representing the mixture of respondents following different underlying response processes (between individuals), as well as the uncertainty present at the…
Descriptors: Item Response Theory, Response Style (Tests), Models, Test Items
Peer reviewed
Wyse, Adam E. – Educational and Psychological Measurement, 2021
An essential question when computing test-retest and alternate forms reliability coefficients is how many days there should be between tests. This article uses data from reading and math computerized adaptive tests to explore how the number of days between tests impacts alternate forms reliability coefficients. Results suggest that the highest…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Reliability, Reading Tests
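The alternate-forms reliability coefficient the abstract refers to is, at its core, a correlation between scores from two administrations. A minimal sketch with simulated data (the score scale, sample size, and noise model here are invented for illustration; the article analyzes real computerized adaptive test data):

```python
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation, used here as an alternate-forms reliability coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Simulated scores: form B tracks form A, plus noise that grows with the
# number of days between the two test administrations.
random.seed(1)
true_scores = [random.gauss(200, 15) for _ in range(500)]
reliability = {}
for days, noise_sd in [(7, 5), (60, 12)]:
    form_a = [t + random.gauss(0, 5) for t in true_scores]
    form_b = [t + random.gauss(0, noise_sd) for t in true_scores]
    reliability[days] = pearson_r(form_a, form_b)
print(reliability)
```

In this toy setup the coefficient drops as the interval grows because more score noise accumulates between administrations; the article investigates the actual pattern empirically.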
Peer reviewed
McGrath, Kathleen V.; Leighton, Elizabeth A.; Ene, Mihaela; DiStefano, Christine; Monrad, Diane M. – Educational and Psychological Measurement, 2020
Survey research frequently involves the collection of data from multiple informants. Results, however, are usually analyzed by informant group, potentially ignoring important relationships across groups. When the same construct(s) are measured, integrative data analysis (IDA) allows pooling of data from multiple sources into one data set to…
Descriptors: Educational Environment, Meta Analysis, Student Attitudes, Teacher Attitudes
Peer reviewed
Marland, Joshua; Harrick, Matthew; Sireci, Stephen G. – Educational and Psychological Measurement, 2020
Student assessment nonparticipation (or opt out) has increased substantially in K-12 schools in states across the country. This increase in opt out has the potential to impact achievement and growth (or value-added) measures used for educator and institutional accountability. In this simulation study, we investigated the extent to which…
Descriptors: Value Added Models, Teacher Effectiveness, Teacher Evaluation, Elementary Secondary Education
Peer reviewed
Briggs, Derek C.; Alzen, Jessica L. – Educational and Psychological Measurement, 2019
Observation protocol scores are commonly used as status measures to support inferences about teacher practices. When multiple observations are collected for the same teacher over the course of a year, some portion of a teacher's score on each occasion may be attributable to the rater, lesson, and the time of year of the observation. All three of…
Descriptors: Observation, Inferences, Generalizability Theory, Scores
Konstantopoulos, Spyros; Li, Wei; Miller, Shazia; van der Ploeg, Arie – Educational and Psychological Measurement, 2019
This study discusses quantile regression methodology and its usefulness in education and social science research. First, quantile regression is defined and its advantages vis-à-vis ordinary least squares regression are illustrated. Second, specific comparisons are made between ordinary least squares and quantile regression methods. Third, the…
Descriptors: Regression (Statistics), Statistical Analysis, Educational Research, Social Science Research
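The contrast this abstract draws can be illustrated through the loss functions the two methods minimize: ordinary least squares minimizes squared error, whose constant minimizer is the mean, while quantile regression minimizes the asymmetric check (pinball) loss, whose constant minimizer at τ = 0.5 is the median. A minimal sketch with made-up scores (not data from the article):

```python
import statistics

def pinball_loss(residual, tau):
    """Check (pinball) loss: asymmetric absolute error minimized by the tau-quantile."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def mean_pinball(data, c, tau):
    """Average pinball loss of fitting the constant c at quantile level tau."""
    return statistics.fmean(pinball_loss(y - c, tau) for y in data)

# A right-skewed outcome: the mean (what an OLS intercept-only fit returns)
# and the median (what tau = 0.5 quantile regression returns) differ.
scores = [40, 45, 50, 52, 55, 58, 60, 95, 98, 100]
mean_c = statistics.fmean(scores)      # 65.3
median_c = statistics.median(scores)   # 56.5
print(mean_c, median_c)
print(mean_pinball(scores, median_c, 0.5), mean_pinball(scores, mean_c, 0.5))
```

The median attains a strictly lower median-level pinball loss than the mean here, which is the sense in which quantile regression targets the conditional quantile rather than the conditional mean.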
Peer reviewed
Hauser, Carl; Thum, Yeow Meng; He, Wei; Ma, Lingling – Educational and Psychological Measurement, 2015
When conducting item reviews, analysts evaluate an array of statistical and graphical information to assess the fit of a field test (FT) item to an item response theory model. The process can be tedious, particularly when the number of human reviews (HR) to be completed is large. Furthermore, such a process leads to decisions that are susceptible…
Descriptors: Test Items, Item Response Theory, Research Methodology, Decision Making
Peer reviewed
Wyse, Adam E.; Bunch, Michael B.; Deville, Craig; Viger, Steven G. – Educational and Psychological Measurement, 2014
This article describes a novel variation of the Body of Work method that uses construct maps to overcome problems of transparency, rater inconsistency, and scores gaps commonly occurring with the Body of Work method. The Body of Work method with construct maps was implemented to set cut-scores for two separate K-12 assessment programs in a large…
Descriptors: Standard Setting (Scoring), Educational Assessment, Elementary Secondary Education, Measurement
Peer reviewed
Ye, Meng; Xin, Tao – Educational and Psychological Measurement, 2014
The authors explored the effects of drifting common items on vertical scaling within the higher order framework of item parameter drift (IPD). The results showed that if IPD occurred between a pair of test levels, the scaling performance started to deviate from the ideal state, as indicated by bias of scaling. When there were two items drifting…
Descriptors: Scaling, Test Items, Equated Scores, Achievement Gains
Peer reviewed
Timmermans, Anneke C.; Snijders, Tom A. B.; Bosker, Roel J. – Educational and Psychological Measurement, 2013
In traditional studies on value-added indicators of educational effectiveness, students are usually treated as belonging to those schools where they took their final examination. However, in practice, students sometimes attend multiple schools and therefore it is questionable whether this assumption of belonging to the last school they attended…
Descriptors: School Effectiveness, Student Mobility, Elementary Schools, Secondary Schools
Peer reviewed
Schroeders, Ulrich; Wilhelm, Oliver – Educational and Psychological Measurement, 2011
Whether an ability test delivered on either paper or computer provides the same information is an important question in applied psychometrics. Besides the validity, it is also the fairness of a measure that is at stake if the test medium affects performance. This study provides a comprehensive review of existing equivalence research in the field…
Descriptors: Reading Comprehension, Listening Comprehension, English (Second Language), Language Tests
Peer reviewed
Stone, Gregory Ethan; Koskey, Kristin L. K.; Sondergeld, Toni A. – Educational and Psychological Measurement, 2011
Typical validation studies on standard setting models, most notably the Angoff and modified Angoff models, have ignored construct development, a critical aspect associated with all conceptualizations of measurement processes. Stone compared the Angoff and objective standard setting (OSS) models and found that Angoff failed to define a legitimate…
Descriptors: Cutting Scores, Standard Setting (Scoring), Models, Construct Validity
Peer reviewed
Furgol, Katherine E.; Ho, Andrew D.; Zimmerman, Dale L. – Educational and Psychological Measurement, 2010
Under the No Child Left Behind Act, large-scale test score trend analyses are widespread. These analyses often gloss over interesting changes in test score distributions and involve unrealistic assumptions. Further complications arise from analyses of unanchored, censored assessment data, or proportions of students lying within performance levels…
Descriptors: Trend Analysis, Sample Size, Federal Legislation, Simulation
Peer reviewed
Humphry, Stephen M. – Educational and Psychological Measurement, 2010
Discrimination has traditionally been parameterized for items but not other empirical factors. Consequently, if person factors affect discrimination they cause misfit. However, by explicitly formulating the relationship between discrimination and the unit of a metric, it is possible to parameterize discrimination for person groups. This article…
Descriptors: Discriminant Analysis, Models, Simulation, Reading Tests
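A hedged sketch of the general idea, not this article's actual formulation: in a 2PL-style item response function, the discrimination parameter can be indexed by person group rather than only by item, so two groups answering the same item can have response curves of different steepness. All numbers below are invented for illustration:

```python
import math

def p_correct(theta, b, a_group):
    """2PL-style item response function with a group-specific discrimination a_group."""
    return 1.0 / (1.0 + math.exp(-a_group * (theta - b)))

# Two hypothetical person groups answering the same item (difficulty b = 0):
# the higher-discrimination group has a steeper curve around theta = b.
for theta in (-1.0, 0.0, 1.0):
    print(theta,
          round(p_correct(theta, 0.0, 0.8), 3),   # group 1, a = 0.8
          round(p_correct(theta, 0.0, 1.6), 3))   # group 2, a = 1.6
```

If a single common discrimination were forced on both groups, the steeper group's responses would register as misfit, which is the problem the abstract describes.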