Showing 151 to 165 of 725 results
Peer reviewed
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2014
Brennan (2012) noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman (2008) suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. According to this…
Descriptors: Scores, Test Theory, Test Interpretation
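A minimal sketch (in Python) of the kind of CTT-based added-value check that Haberman (2008) proposed and the entry above discusses: a subscore is worth reporting only if it predicts its own true score better than the total score does. The function name, inputs, and numbers below are illustrative assumptions, not values from the article.

```python
# Hypothetical illustration of a Haberman-style (2008) added-value check for a
# subscore, using classical test theory quantities supplied by the user.

def subscore_has_added_value(rel_subscore: float,
                             rel_total: float,
                             true_score_corr: float) -> bool:
    """Compare proportional reduction in mean squared error (PRMSE).

    rel_subscore    : CTT reliability of the observed subscore
    rel_total       : CTT reliability of the observed total score
    true_score_corr : disattenuated correlation between the true subscore
                      and the true total score
    """
    # PRMSE when the true subscore is predicted from the observed subscore
    prmse_from_subscore = rel_subscore
    # PRMSE when the true subscore is predicted from the observed total score
    prmse_from_total = rel_total * true_score_corr ** 2
    # The subscore adds value only if it predicts its own true score
    # better than the total score does.
    return prmse_from_subscore > prmse_from_total

# Example: a modestly reliable subscore that is highly correlated with the total
print(subscore_has_added_value(rel_subscore=0.70,
                               rel_total=0.92,
                               true_score_corr=0.95))   # False: report the total only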
Peer reviewed
Liu, Ou Lydia; Brew, Chris; Blackmore, John; Gerard, Libby; Madhok, Jacquie; Linn, Marcia C. – Educational Measurement: Issues and Practice, 2014
Content-based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept-based scoring tool for content-based scoring, c-rater™, for four science items with rubrics…
Descriptors: Science Tests, Test Items, Scoring, Automation
Peer reviewed
Gotch, Chad M.; French, Brian F. – Educational Measurement: Issues and Practice, 2014
This work systematically reviews teacher assessment literacy measures within the context of contemporary teacher evaluation policy. The researchers collected objective tests of assessment knowledge, teacher self-reports, and rubrics for evaluating teachers' work, as published in assessment literacy studies from 1991 to 2012. They then evaluated…
Descriptors: Measures (Individuals), Objective Tests, Measurement Techniques, Scoring Rubrics
Peer reviewed
Murphy, Daniel L.; Gaertner, Matthew N. – Educational Measurement: Issues and Practice, 2014
This study evaluates four growth prediction models--projection, student growth percentile, trajectory, and transition table--commonly used to forecast (and give schools credit for) middle school students' future proficiency. Analyses focused on vertically scaled summative mathematics assessments, and two performance standards conditions (high…
Descriptors: Prediction, Models, Achievement Gains, Middle School Students
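As a concrete illustration of one of the four growth prediction models named in the entry above, the sketch below implements a toy transition table: credit for predicted future proficiency depends on movement between performance levels across two years. The level labels and table entries are hypothetical, not those used in the study.

```python
# A hypothetical transition-table growth model: forecast future proficiency
# from the change in performance level between two test administrations.

TRANSITION_TABLE = {
    # (level in year 1, level in year 2) -> predicted to reach proficiency?
    ("Below Basic", "Below Basic"): False,
    ("Below Basic", "Basic"):       True,   # large gain: on track
    ("Basic", "Basic"):             False,
    ("Basic", "Proficient"):        True,
    ("Proficient", "Proficient"):   True,
}

def predicted_proficient(level_year1: str, level_year2: str) -> bool:
    """Look up the forecast for a student's observed level-to-level movement."""
    return TRANSITION_TABLE.get((level_year1, level_year2), False)

print(predicted_proficient("Below Basic", "Basic"))   # True
```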
Peer reviewed
Moses, Tim – Educational Measurement: Issues and Practice, 2014
This module describes and extends X-to-Y regression measures that have been proposed for use in the assessment of X-to-Y scaling and equating results. Measures are developed that are similar to those based on prediction error in regression analyses but that are directly suited to interests in scaling and equating evaluations. The regression and…
Descriptors: Scaling, Regression (Statistics), Equated Scores, Comparative Analysis
Peer reviewed
Li, Hongli – Educational Measurement: Issues and Practice, 2014
Read-aloud accommodations have been proposed as a way to help remove barriers faced by students with disabilities in reading comprehension. Many empirical studies have examined the effects of read-aloud accommodations; however, the results are mixed. With a variance-known hierarchical linear modeling approach, based on 114 effect sizes from 23…
Descriptors: Reading Instruction, Reading Strategies, Reading Comprehension, Barriers
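As a rough illustration of the machinery behind a variance-known (V-known) synthesis of effect sizes like the one described above, the sketch below pools hypothetical effect sizes by inverse-variance weighting, which is the fixed-effect special case; the full V-known hierarchical model additionally estimates between-study variance and moderator effects. All numbers are made up.

```python
# Inverse-variance pooling of effect sizes with known sampling variances.

def pooled_effect(effect_sizes, variances):
    """Fixed-effect pooled estimate: weight each effect by 1 / variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effect_sizes)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se

effects = [0.25, 0.10, -0.05, 0.40]      # standardized mean differences (hypothetical)
variances = [0.02, 0.05, 0.04, 0.08]     # known sampling variances (hypothetical)
print(pooled_effect(effects, variances))
```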
Peer reviewed
Feinberg, Richard A.; Wainer, Howard – Educational Measurement: Issues and Practice, 2014
Subscores are often used to indicate test-takers' relative strengths and weaknesses and so help focus remediation. But a subscore is not worth reporting if it is too unreliable to believe or if it contains no information that is not already contained in the total score. It is possible, through the use of a simple linear equation provided in…
Descriptors: Scores, Equations (Mathematics), Prediction, Reliability
Peer reviewed
Higgins, Derrick; Heilman, Michael – Educational Measurement: Issues and Practice, 2014
As methods for automated scoring of constructed-response items become more widely adopted in state assessments, and are used in more consequential operational configurations, it is critical that their susceptibility to gaming behavior be investigated and managed. This article provides a review of research relevant to how construct-irrelevant…
Descriptors: Automation, Scoring, Responses, Test Wiseness
Peer reviewed
Feinberg, Richard A.; Wainer, Howard – Educational Measurement: Issues and Practice, 2014
Subscores can be of diagnostic value for tests that cover multiple underlying traits. Some items require knowledge or ability that spans more than a single trait. It is thus natural for such items to be included on more than a single subscore. Subscores only have value if they are reliable enough to justify conclusions drawn from them and if they…
Descriptors: Scores, Test Items, Reliability
Peer reviewed
Buzick, Heather; Stone, Elizabeth – Educational Measurement: Issues and Practice, 2014
Read aloud is a testing accommodation that has been studied by many researchers, and its use on K-12 assessments continues to be debated because of its potential to change the measured construct or unfairly increase test scores. This study is a summary of quantitative research on the read aloud accommodation. Previous studies contributed…
Descriptors: Meta Analysis, Reading Aloud to Others, Educational Research, Statistical Analysis
Peer reviewed
Koch, Martha J. – Educational Measurement: Issues and Practice, 2014
Implications of the multiple-use of accountability assessments for the process of validation are examined. Multiple-use refers to the simultaneous use of results from a single administration of an assessment for its intended use and for one or more additional uses. A theoretical discussion of the issues for validation which emerge from…
Descriptors: Foreign Countries, Test Use, Accountability, Validity
Peer reviewed
Templin, Jonathan; Hoffman, Lesa – Educational Measurement: Issues and Practice, 2013
Diagnostic classification models (aka cognitive or skills diagnosis models) have shown great promise for evaluating mastery on a multidimensional profile of skills as assessed through examinee responses, but continued development and application of these models has been hindered by a lack of readily available software. In this article we…
Descriptors: Classification, Models, Language Tests, English (Second Language)
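To make the idea of a diagnostic classification model concrete, the sketch below implements the response function of the DINA model, one of the simplest members of this model family (not necessarily the model estimated in the article above). The skill profile, Q-matrix row, and guess/slip rates are hypothetical.

```python
# Response probability under the DINA (deterministic inputs, noisy "and") model.

def dina_prob_correct(skill_profile, q_row, guess, slip):
    """P(correct) is high (1 - slip) only if the examinee has mastered every
    skill the item requires per its Q-matrix row; otherwise it is the guess rate."""
    has_all_required = all(
        mastered for mastered, required in zip(skill_profile, q_row) if required
    )
    return 1.0 - slip if has_all_required else guess

# Examinee has mastered skills 1 and 3 but not 2; the item requires skills 1 and 2.
print(dina_prob_correct(skill_profile=[1, 0, 1], q_row=[1, 1, 0],
                        guess=0.2, slip=0.1))   # 0.2
```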
Peer reviewed
Lakin, Joni M.; Young, John W. – Educational Measurement: Issues and Practice, 2013
In recent years, many U.S. states have introduced growth models as part of their educational accountability systems. Although the validity of growth-based accountability models has been evaluated for the general population, the impact of those models for English language learner (ELL) students, a growing segment of the student population, has not…
Descriptors: English Language Learners, Accountability, Educational Policy, Models
Peer reviewed
Mee, Janet; Clauser, Brian E.; Margolis, Melissa J. – Educational Measurement: Issues and Practice, 2013
Despite being widely used and frequently studied, the Angoff standard setting procedure has received little attention with respect to an integral part of the process: how judges incorporate examinee performance data in the decision-making process. Without performance data, subject matter experts have considerable difficulty accurately making the…
Descriptors: Standard Setting (Scoring), Judges, Data, Decision Making
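For readers unfamiliar with the procedure discussed above, the sketch below shows how Angoff ratings are commonly aggregated: each judge estimates the probability that a minimally competent examinee answers each item correctly, and the cut score is the sum of the item-level mean ratings. The ratings are hypothetical.

```python
# Aggregate Angoff judgments into a raw-score cut score.

def angoff_cut_score(ratings):
    """ratings[j][i] = judge j's probability estimate for item i."""
    n_judges = len(ratings)
    n_items = len(ratings[0])
    item_means = [sum(judge[i] for judge in ratings) / n_judges
                  for i in range(n_items)]
    # Expected raw score of the minimally competent examinee.
    return sum(item_means)

ratings = [
    [0.60, 0.75, 0.40],   # judge 1
    [0.55, 0.80, 0.50],   # judge 2
    [0.65, 0.70, 0.45],   # judge 3
]
print(round(angoff_cut_score(ratings), 2))   # 1.8 out of 3 raw-score points
```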
Peer reviewed
Gierl, Mark J.; Lai, Hollis – Educational Measurement: Issues and Practice, 2013
Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…
Descriptors: Educational Assessment, Test Items, Automation, Computer Assisted Testing
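A minimal sketch of template-based automatic item generation as described in the entry above: an item model with variable slots is instantiated by a program, producing many item stems along with their keys. The template and value ranges are invented for illustration and are not taken from the article.

```python
# Generate items by filling the variable slots of a hypothetical item model.

import random

TEMPLATE = "A train travels {speed} km/h for {hours} hours. How far does it travel?"

def generate_items(n, seed=0):
    rng = random.Random(seed)
    items = []
    for _ in range(n):
        speed = rng.choice(range(40, 121, 10))
        hours = rng.choice(range(2, 6))
        stem = TEMPLATE.format(speed=speed, hours=hours)
        key = speed * hours            # correct answer computed alongside the stem
        items.append((stem, key))
    return items

for stem, key in generate_items(3):
    print(stem, "->", key, "km")
```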