Andrews-Todd, Jessica; Jackson, G. Tanner; Kurzum, Christopher – ETS Research Report Series, 2019
Collaborative problem solving (CPS) is an important 21st-century skill for academic and career success, and as a result, there is increased interest among businesses and educational institutions in the assessment and development of CPS skills. CPS skills are difficult to measure using traditional forms of assessment, and that difficulty has led to…
Descriptors: Problem Solving, 21st Century Skills, Academic Achievement, Cooperation
Guzman-Orth, Danielle; Song, Yi; Sparks, Jesse R. – ETS Research Report Series, 2019
In this study, we investigated the challenges and opportunities in developing a computer-delivered English language arts (ELA) task intended to improve the accessibility of the task for middle school English learners (ELs). Data from cognitive labs with 8 ELs with varying language proficiency levels provided rich insight to student-task…
Descriptors: Formative Evaluation, Test Construction, Test Items, Persuasive Discourse
Haberman, Shelby J. – ETS Research Report Series, 2019
Measures of agreement are compared to measures of prediction accuracy within a general context. Differences in appropriate use are emphasized, and approaches are examined for both numerical and nominal variables. General estimation methods are developed, and their large-sample properties are compared.
Descriptors: Measurement Techniques, Classification, Prediction, Accuracy
Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019
We derive formulas for the differential item functioning (DIF) measures that two routinely used DIF statistics are designed to estimate. The DIF measures that match on observed scores are compared to DIF measures based on an unobserved ability (theta or true score) for items that are described by either the one-parameter logistic (1PL) or…
Descriptors: Scores, Test Bias, Statistical Analysis, Item Response Theory
Haberman, Shelby J.; Liu, Yang; Lee, Yi-Hsuan – ETS Research Report Series, 2019
Distractor analyses are routinely conducted in educational assessments with multiple-choice items. In this research report, we focus on three item response models for distractors: (a) the traditional nominal response (NR) model, (b) a combination of a two-parameter logistic model for item scores and a NR model for selections of incorrect…
Descriptors: Multiple Choice Tests, Scores, Test Reliability, High Stakes Tests
Goe, Laura; Roth, Amanda – ETS Research Report Series, 2019
As America's prekindergarten through 12th grade population becomes increasingly diverse, educator preparation programs (EPPs) are tasked with attracting, admitting, supporting, and graduating more teacher candidates from underrepresented groups. The research reported here was designed to improve our understanding of strategies that some EPPs have…
Descriptors: Diversity (Faculty), Teacher Education Programs, Preservice Teacher Education, Preservice Teachers
Wang, Lin – ETS Research Report Series, 2019
Rearranging response options in different versions of a test of multiple-choice items can be an effective strategy against cheating on the test. This study investigated if rearranging response options would affect item performance and test score comparability. A study test was assembled as the base version from which 3 variant versions were…
Descriptors: Multiple Choice Tests, Test Items, Test Format, Scores
Jewsbury, Paul A. – ETS Research Report Series, 2019
When an assessment undergoes changes to the administration or instrument, bridge studies are typically used to try to ensure comparability of scores before and after the change. Among the most common and powerful is the common population linking design, with the use of a linear transformation to link scores to the metric of the original…
Descriptors: Evaluation Research, Scores, Error Patterns, Error of Measurement
Seybert, Jacob; Becker, Dovid – ETS Research Report Series, 2019
Forced-choice (FC) measures are becoming increasingly common in the assessment of personality for high-stakes testing purposes in both educational and organizational settings. Despite this, there has been relatively little research into the reliability of scores obtained from these measures, particularly when administered as a computerized…
Descriptors: Test Reliability, Personality Measures, Measurement Techniques, Computer Assisted Testing
Ling, Guangming; Gu, Lin – ETS Research Report Series, 2019
While many researchers have studied the relationship of socioeconomic status (SES) to adult learners' English language proficiency levels, little is known about this relationship for young learners (i.e., teenagers). In this study, we investigated the degree to which access to English language learning, as reflected by learners' SES, is associated…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Language Proficiency
Wendler, Cathy; Glazer, Nancy; Cline, Frederick – ETS Research Report Series, 2019
One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as…
Descriptors: College Entrance Examinations, Graduate Study, Accuracy, Test Reliability
Schmidgall, Jonathan; Oliveri, Maria Elena; Duke, Trina; Grissom, Elizabeth Carter – ETS Research Report Series, 2019
One of the most critical steps in the test development process is defining the construct, or the knowledge, skills, or abilities, to be assessed. This foundational step provides the basis for initial assumptions about the meaning of test scores and serves as a reference for subsequent validity research. In this paper, we describe the purpose of…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Language Proficiency
Yao, Lili; Haberman, Shelby J.; Zhang, Mo – ETS Research Report Series, 2019
Many assessments of writing proficiency that aid in making high-stakes decisions consist of several essay tasks evaluated by a combination of human holistic scores and computer-generated scores for essay features such as the rate of grammatical errors per word. Under typical conditions, a summary writing score is provided by a linear combination…
Descriptors: Prediction, True Scores, Computer Assisted Testing, Scoring
Galikyan, Irena; Madyarov, Irshat; Gasparyan, Rubina – ETS Research Report Series, 2019
The broad range of English language teaching and learning contexts present in the world today necessitates high quality assessment instruments that can provide reliable and meaningful information about learners' English proficiency levels to relevant stakeholders. The TOEFL Junior® tests were recently introduced by Educational Testing…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Student Attitudes
Buzick, Heather M.; Rhoad-Drogalis, Anna; Laitusis, Cara C.; King, Teresa C. – ETS Research Report Series, 2019
A fundamental claim for Common Core State Standards (CCSS)-aligned assessments is that they will lead to better teaching practices. The purpose of this study is to seek evidence in support of this claim by surveying teachers about their instructional practices, test preparation strategies, and test score use both before and after the introduction…
Descriptors: Common Core State Standards, Teaching Methods, Teacher Attitudes, Alignment (Education)