Norfolk, Philip A. – ProQuest LLC, 2017
Assessment of specific learning disabilities (SLD) in educational settings is one important function of a school psychologist. The federal definition of SLD describes "underlying cognitive processing deficits" as part of the assessment criteria, a component that is also incorporated in several states' SLD eligibility criteria and that is difficult to…
Descriptors: Cognitive Tests, Cognitive Ability, Achievement Tests, Reading Achievement
Sinharay, Sandip – Journal of Educational Measurement, 2014
Brennan noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. One way to interpret the method is that a subscore has added value…
Descriptors: Scores, Test Theory, Classification, Cutting Scores
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity (how well each item of an instrument represents the construct being measured) is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Dogan, Enis – Practical Assessment, Research & Evaluation, 2018
Several large-scale assessments include student, teacher, and school background questionnaires. Results from such questionnaires can be reported for each item separately, or as indices based on aggregation of multiple items into a scale. Interpreting scale scores is not always an easy task, though. In disseminating results of achievement tests, one…
Descriptors: Rating Scales, Benchmarking, Questionnaires, Achievement Tests
Mayes, Susan D.; Lockridge, Robin – Journal of Autism and Developmental Disorders, 2018
The Checklist for Autism Spectrum Disorder (CASD) completed by a psychologist (following standardized procedures integrating parent interview data, teacher report, and clinical observations) was compared with the CASD completed independently by mothers and teachers in 168 children with ASD and 40 with ADHD (1-12 years). The 30 CASD autism symptoms…
Descriptors: Check Lists, Autism, Pervasive Developmental Disorders, Parent Attitudes
Hansen, Mark; Monroe, Scott – Measurement: Interdisciplinary Research and Perspectives, 2018
This research concerns the application of multidimensional item response theory (e.g., Reckase, 2009) to link not-quite-vertical scales across age/grade levels. The approach allows scores to be related across levels without requiring the assumption of a vertical scale. The multidimensional framework is applied to an English language proficiency…
Descriptors: Item Response Theory, Instructional Program Divisions, Language Tests, Elementary Secondary Education
Winter, Phoebe C.; Hansen, Mark; McCoy, Michelle – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2019
In order to accurately assess the English language proficiency of special populations of English learners, student assessment programs must maintain the comparability of standard and modified assessment formats, allowing for equivalent inferences to be made across student classifications. However, given the typically small size of special…
Descriptors: English Language Learners, Language Proficiency, Student Evaluation, Evaluation Methods
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics
Liu, Xiaolu; Keating, Xiaofen D.; Shangguan, Rulan – ICHPER-SD Journal of Research, 2017
This study examined changes in China's college student fitness test batteries since its inception in 1954. Using the constant content comparison method, the testing components, testing items and related cut-off values, testing methods, testing results utility, and testing material distribution were examined to identify the salient trends. The…
Descriptors: Foreign Countries, Physical Fitness, Tests, College Students
Bhat, Bilal A.; Siddiqui, Mujibul Hasan – Online Submission, 2017
The paper is based on the construction and evaluation of a scientific creativity test devised for senior secondary school students. It attempts to evaluate the test's validity and reliability and to determine appropriate standards for interpreting the scores obtained from the scientific creativity test devised for science…
Descriptors: Foreign Countries, Test Construction, Science Tests, Creativity Tests
Wyse, Adam E. – Applied Measurement in Education, 2018
This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…
Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)
Kilgus, Stephen P.; Taylor, Crystal N.; von der Embse, Nathaniel P. – School Psychology Quarterly, 2018
The purpose of this study was to support the identification of Social, Academic, and Emotional Behavior Risk Screener (SAEBRS) cut scores that could be used to detect high-risk students. Teachers rated students across two time points (Time 1 n = 1,242 students; Time 2 n = 704) using the SAEBRS and the Behavioral and Emotional Screening System…
Descriptors: Screening Tests, Behavior Problems, Risk, Cutting Scores
Bakhtiar, Mehdi; Wong, Min Ney; Tsui, Emily Ka Yin; McNeil, Malcolm R. – Journal of Speech, Language, and Hearing Research, 2020
Purpose: This study reports the psychometric development of the Cantonese versions of the English Computerized Revised Token Test (CRTT) for persons with aphasia (PWAs) and healthy controls (HCs). Method: The English CRTT was translated into standard Chinese for the Reading–Word Fade version (CRTT-R-WF-Cantonese) and into formal…
Descriptors: Psychometrics, Sino Tibetan Languages, Computer Assisted Testing, Aphasia
Raza, Sarah; Sacrey, Lori-Ann R.; Zwaigenbaum, Lonnie; Bryson, Susan; Brian, Jessica; Smith, Isabel M.; Roberts, Wendy; Szatmari, Peter; Vaillancourt, Tracy; Roncadin, Caroline; Garon, Nancy – Journal of Autism and Developmental Disorders, 2020
Social-emotional behavior in autism spectrum disorder (ASD) was examined among high-risk (HR; siblings of children diagnosed with ASD) and low-risk (LR; no family history of ASD) toddlers. Caregivers completed the Infant-Toddler Social Emotional Assessment (ITSEA) at 18 months, and blind diagnostic assessment for ASD was conducted at 36 months.…
Descriptors: Autism, Pervasive Developmental Disorders, Genetics, Clinical Diagnosis
Stipek, Deborah – Policy Analysis for California Education, PACE, 2020
The use of the Quality Rating and Improvement System (QRIS) to improve early childhood education program quality is based in part on assumptions that the quality of programs can be measured and that quality ratings are associated with meaningful differences in learning outcomes for children. This report reviews all of the state QRIS validation…
Descriptors: Rating Scales, Educational Improvement, Early Childhood Education, Educational Quality

