Showing 1 to 15 of 101 results
Peer reviewed
Bunderson, C. Victor – Journal of Applied Measurement, 2011
This article defines the concept of Domain Theory, or, when educational measurement is the goal, one might call it a "Learning Theory of Progressive Attainments in X Domain". The concept of Domain Theory is first shown to be rooted in validity theory, then the concept of domain theory is expanded to amplify its necessary but long neglected…
Descriptors: Measurement, Learning Theories, Oral Reading, Measurement Techniques
Peer reviewed
Andrich, David; Styles, Irene – Journal of Applied Measurement, 2011
There is a substantial literature on attempts to obtain information on the proficiency of respondents from distractors in multiple choice items. Information in a distractor implies that a person who chooses that distractor has greater proficiency than if the person chose another distractor with no information. A further implication is that the…
Descriptors: Multiple Choice Tests, Testing, Item Response Theory
Peer reviewed
Weaver, Christopher – Journal of Applied Measurement, 2011
This study presents a systematic investigation concerning the performance of different rating scales used in the English section of a university entrance examination to assess 1,287 Japanese test takers' ability to write a third-person introduction speech. Although the rating scales did not conform to all of the expectations of the Rasch model,…
Descriptors: Rating Scales, English (Second Language), Language Tests, College Entrance Examinations
Peer reviewed
Bassiri, Dina; Schulz, E. Matthew – Journal of Applied Measurement, 2011
In this study, the Rasch rating scale model (Andrich, 1978) was applied to college grades of four freshman cohorts from a large public university. After editing, the data represented approximately 34,000 students, 1,700 courses and 119 departments. The rating scale model analysis yielded measures of student achievement and course difficulty.…
Descriptors: Grade Point Average, Courses, Difficulty Level, Academic Achievement
Peer reviewed
Lunz, Mary; Suanthong, Surintorn – Journal of Applied Measurement, 2011
The desirability of test equating to maintain the same criterion standard from test administration to test administration has long been accepted for multiple choice tests. The same consistency of expectations is desirable for performance tests, especially if they are part of a licensure or certification process or used for other high stakes…
Descriptors: Testing, Equated Scores, Performance Based Assessment
Peer reviewed
Reeve, Suzanne; Kitchen, Elizabeth; Sudweeks, Richard R.; Bell, John D.; Bradshaw, William S. – Journal of Applied Measurement, 2011
This article describes the development of a ten-item scale to assess biology majors' self-efficacy towards the critical thinking and data analysis skills taught in an upper-division cell biology course. The original seven-item scale was expanded to include three additional items based on the results of item analysis. Evidence of reliability and…
Descriptors: Majors (Students), Self Efficacy, Measures (Individuals), Biology
Peer reviewed
Sanchez, Juan D. – Journal of Applied Measurement, 2011
The San Francisco Unified School District (SFUSD) uses the Language and Literacy Assessment Rubric (LALAR) as the secondary measurement required by the No Child Left Behind (NCLB) Act to measure English proficiency of English language learners (ELLs). In this analysis, the Rasch model is used to identify whether the LALAR is a valid measurement…
Descriptors: Validity, English (Second Language), Language Proficiency, English Language Learners
Peer reviewed
Mat Daud, Nuraihan; Abu Kassim, Noor Lide – Journal of Applied Measurement, 2011
Students' evaluations of teaching staff can be considered high-stakes, as they are often used to determine promotion, reappointment, and merit pay for academics. Using Facets, the reliability and validity of one student rating questionnaire are analysed. A total of 13,940 respondents of the Human Science Division of International Islamic University…
Descriptors: Student Evaluation of Teacher Performance, Questionnaires, Validity, Reliability
Peer reviewed
Schulz, E. Matthew; Mitzel, Howard C. – Journal of Applied Measurement, 2011
This article describes a Mapmark standard setting procedure, developed under contract with the National Assessment Governing Board (NAGB). The procedure enhances the bookmark method with spatially representative item maps, holistic feedback, and an emphasis on independent judgment. A rationale for these enhancements, and the bookmark method, is…
Descriptors: Standard Setting, Methods, National Competency Tests, Grade 12
Peer reviewed
Babiar, Tasha Calvert – Journal of Applied Measurement, 2011
Traditionally, women and minorities have not been fully represented in science and engineering. Numerous studies have attributed these differences to gaps in science achievement as measured by various standardized tests. Rather than describe mean group differences in science achievement across multiple cultures, this study focused on an in-depth…
Descriptors: Test Bias, Science Achievement, Standardized Tests, Grade 8
Peer reviewed
Huynh, Huynh; Rawls, Anita – Journal of Applied Measurement, 2011
There are at least two procedures to assess item difficulty stability in the Rasch model: the robust z procedure and the "0.3 Logit Difference" procedure. The robust z procedure is a variation of the z statistic that reduces dependency on outliers. The "0.3 Logit Difference" procedure is based on experiences in Rasch linking for tests…
Descriptors: Comparative Analysis, Item Response Theory, Test Items, Difficulty Level
Peer reviewed
Draney, Karen; Wilson, Mark – Journal of Applied Measurement, 2011
In this paper, we describe a new method we have developed for setting cut scores between levels of a test. We outline the wide variety of potential methods that have been used for such a process, and emphasize the need for a coherent conceptual framework under which the variety of methods could be understood. We then describe our particular…
Descriptors: Item Response Theory, Probability, Computer Software, Cutting Scores
Peer reviewed
Elbaum, Batya; Fisher, William P., Jr.; Coulter, W. Alan – Journal of Applied Measurement, 2011
Indicator 8 of the State Performance Plan (SPP), developed under the 2004 reauthorization of the Individuals with Disabilities Education Act (IDEA 2004, Public Law 108-446) requires states to collect data and report findings related to schools' facilitation of parent involvement. The Schools' Efforts to Partner with Parents Scale (SEPPS) was…
Descriptors: Disabilities, Accountability, Stakeholders, Scaling
Peer reviewed
Lick, David J.; Schmidt, Karen M.; Patterson, Charlotte J. – Journal of Applied Measurement, 2011
According to two decades of research, parental sexual orientation does not affect overall child development. Researchers have not found significant differences between offspring of heterosexual parents and those of lesbian and gay parents in terms of their cognitive, psychological, or emotional adjustment. Still, there are gaps in the literature…
Descriptors: Parent Child Relationship, Measures (Individuals), Emotional Adjustment, Homosexuality
Peer reviewed
Nielsen, Tine; Kreiner, Svend – Journal of Applied Measurement, 2011
The Revised Danish Learning Styles Inventory (R-D-LSI) (Nielsen, 2005), which is an adaptation of the Sternberg-Wagner Thinking Styles Inventory (Sternberg, 1997), comprises 14 subscales, each measuring a separate learning style. Of these 14 subscales, 9 are eight items long and 5 are seven items long. For self-assessment, self-scoring and…
Descriptors: Cognitive Style, Foreign Countries, Test Items, Test Construction