Webb, Noreen M.; Herman, Joan L.; Webb, Norman L. – Educational Measurement: Issues and Practice, 2007
This article examines the role of reviewer agreement in judgments about alignment between tests and standards. We used case data from three state alignment studies to explore how different approaches to incorporating reviewer agreement change alignment conclusions. The three case studies showed varying degrees of reviewer agreement about…
Descriptors: Test Items, Case Studies, Mathematics, Interrater Reliability
Hendrickson, Amy – Educational Measurement: Issues and Practice, 2007
Multistage tests are those in which sets of items are administered adaptively and are scored as a unit. These tests have all of the advantages of adaptive testing, with more efficient and precise measurement across the proficiency scale as well as time savings, without many of the disadvantages of an item-level adaptive test. As a seemingly…
Descriptors: Adaptive Testing, Test Construction, Measurement Techniques, Evaluation Methods
Elliott, Stephen N.; Compton, Elizabeth; Roach, Andrew T. – Educational Measurement: Issues and Practice, 2007
The relationships between ratings on the Idaho Alternate Assessment (IAA) for 116 students with significant disabilities and corresponding ratings for the same students on two norm-referenced teacher rating scales were examined to gain evidence about the validity of resulting IAA scores. To contextualize these findings, another group of 54…
Descriptors: Inferences, Disabilities, Rating Scales, Eligibility
Leighton, Jacqueline P.; Gierl, Mark J. – Educational Measurement: Issues and Practice, 2007
The purpose of this paper is to define and evaluate the categories of cognitive models underlying at least three types of educational tests. We argue that while all educational tests may be based--explicitly or implicitly--on a cognitive model, the categories of cognitive models underlying tests often range in their development and in the…
Descriptors: Identification (Psychology), Misconceptions, Measurement, Inferences
Lu, Ying; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2007
Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe…
Descriptors: Test Items, Timed Tests, Standardized Tests, Test Validity
Sinharay, Sandip; Haberman, Shelby; Puhan, Gautam – Educational Measurement: Issues and Practice, 2007
There is an increasing interest in reporting subscores, both at examinee level and at aggregate levels. However, it is important to ensure reasonable subscore performance in terms of high reliability and validity to minimize incorrect instructional and remediation decisions. This article employs a statistical measure based on classical test theory…
Descriptors: Test Reliability, Test Theory, Test Validity, Statistical Analysis
Parkes, Jay – Educational Measurement: Issues and Practice, 2007
Reliability consists of both important social and scientific values and methods for evidencing those values, though in practice methods are often conflated with the values. With the two distinctly understood, a reliability argument can be made that articulates the particular reliability values most relevant to the particular measurement situation…
Descriptors: Validity, Reliability, Evaluation Methods, Measurement
Ho, Andrew D. – Educational Measurement: Issues and Practice, 2007
State test score trends are widely interpreted as indicators of educational improvement. To validate these interpretations, state test score trends are often compared to trends on other tests such as the National Assessment of Educational Progress (NAEP). These comparisons raise serious technical and substantive concerns. Technically, the most…
Descriptors: Test Results, Educational Improvement, National Competency Tests, Measures (Individuals)
Kim, Jee-Seon; Bolt, Daniel M. – Educational Measurement: Issues and Practice, 2007
The purpose of this ITEMS module is to provide an introduction to Markov chain Monte Carlo (MCMC) estimation for item response models. A brief description of Bayesian inference is followed by an overview of the various facets of MCMC algorithms, including discussion of prior specification, sampling procedures, and methods for evaluating chain…
Descriptors: Placement, Monte Carlo Methods, Markov Processes, Measurement
Choi, Kilchan; Seltzer, Michael; Herman, Joan; Yamashiro, Kyo – Educational Measurement: Issues and Practice, 2007
The No Child Left Behind Act (NCLB, 2002) establishes ambitious goals for increasing student learning and attaining equity in the distribution of student performance. Schools must assure that all students, including all significant subgroups, show adequate yearly progress (AYP) toward the goal of 100% proficiency by the year 2014. In this paper,…
Descriptors: Federal Legislation, Educational Improvement, Urban Schools, Academic Achievement
Lei, Pui-Wa; Wu, Qiong – Educational Measurement: Issues and Practice, 2007
Structural equation modeling (SEM) is a versatile statistical modeling tool. Its estimation techniques, modeling capacities, and breadth of applications are expanding rapidly. This module introduces some common terminologies. General steps of SEM are discussed along with important considerations in each step. Simple examples are provided to…
Descriptors: Structural Equation Models, Guidelines, Definitions, Computer Software
De Champlain, Andre F.; Cuddy, Monica M.; LaDuca, Tony – Educational Measurement: Issues and Practice, 2007
Practice analyses are routinely used in support of the development of occupational and professional certification and licensure examinations. These analyses usually survey incumbents to obtain importance ratings of (1) specific tasks and (2) knowledge, skill, and ability (KSA) statements deemed by subject matter experts as essential to safe and…
Descriptors: Scaling, Licensing Examinations (Professions), Context Effect, Rating Scales
Kopriva, Rebecca J.; Emick, Jessica E.; Hipolito-Delgado, Carlos Porfirio; Cameron, Catherine A. – Educational Measurement: Issues and Practice, 2007
Does it matter if students are appropriately assigned to test accommodations? Using a randomized method, this study found that accommodations keyed to individual students' particular needs were significantly more efficacious for English language learners (ELLs) and that little difference was reported between students receiving…
Descriptors: Second Language Learning, Student Needs, Testing Accommodations, English (Second Language)
Karantonis, Ana; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2006
The Bookmark method for setting standards on educational tests is currently one of the most popular standard-setting methods. However, research to support the method is scarce. In this report, we review the published and unpublished literature on this method as well as some seminal work in the area of evaluating standard-setting studies. Our…
Descriptors: Academic Standards, Educational Testing, Literature Reviews, Validity
Solano-Flores, Guillermo; Li, Min – Educational Measurement: Issues and Practice, 2006
We contend that generalizability (G) theory allows the design of psychometric approaches to testing English-language learners (ELLs) that are consistent with current thinking in linguistics. We used G theory to estimate the amount of measurement error due to code (language or dialect). Fourth- and fifth-grade ELLs, native speakers of…
Descriptors: Foreign Countries, Grade 4, Grade 5, English (Second Language)

