Showing 1 to 15 of 17 results
Peer reviewed
Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…
Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring
Peer reviewed
Balogh, Jennifer; Bernstein, Jared; Cheng, Jian; Van Moere, Alistair; Townshend, Brent; Suzuki, Masanori – Educational and Psychological Measurement, 2012
A two-part experiment is presented that validates a new measurement tool for scoring oral reading ability. Data collected by the U.S. government in a large-scale literacy assessment of adults were analyzed by a system called VersaReader that uses automatic speech recognition and speech processing technologies to score oral reading fluency. In the…
Descriptors: Reading Fluency, Measures (Individuals), Scoring, Reading Ability
Peer reviewed
Pomplun, Mark; Omar, Md Hafidz – Educational and Psychological Measurement, 1997
Four threats to validity of an alternative objective test item format, the multiple-mark format, were studied with data from a state-mandated assessment with about 30,000 students at each of three grade levels. Reliability and validity coefficients show that the format has promise as an objective format that can be aligned with new curriculum…
Descriptors: Curriculum Development, Elementary School Students, Elementary Secondary Education, Objective Tests
Peer reviewed
Luecht, Richard M. – Educational and Psychological Measurement, 1987
Test Pac, a test scoring and analysis computer program for moderate-sized sample designs using dichotomous response items, performs comprehensive item analyses and multiple reliability estimates. It also performs single-facet generalizability analysis of variance, single-parameter item response theory analyses, test score reporting, and computer…
Descriptors: Computer Assisted Testing, Computer Software, Computer Software Reviews, Item Analysis
Peer reviewed
Kingma, Johannes; TenVergert, Elisabeth M. – Educational and Psychological Measurement, 1987
Two studies investigated the functional equivalence of three different scoring systems used to assess the child's ability to understand and carry out multiplicative classification tasks. All three scoring criteria produced reliable and homogeneous tests. Their factor matrices were similar, and the corresponding factor structures were invariant…
Descriptors: Classification, Cognitive Measurement, Comparative Analysis, Developmental Tasks
Peer reviewed
Jaradat, Derar; Tollefson, Nona – Educational and Psychological Measurement, 1988
This study compared the reliability and validity indexes of randomly parallel tests administered under inclusion, exclusion, and correction for guessing directions, using 54 graduate students. It also compared the criterion-referenced grading decisions based on the different scoring methods. (TJH)
Descriptors: Criterion Referenced Tests, Grading, Graduate Students, Guessing (Tests)
Peer reviewed
Cuenot, Randall G.; Darbes, Alex – Educational and Psychological Measurement, 1982
Thirty-one clinical psychologists scored Comprehension, Similarities, and Vocabulary subtest items common to the Wechsler Intelligence Scale for Children (WISC) and the Wechsler Intelligence Scale for Children, Revised (WISC-R). The results on interrater scoring agreement suggest that the scoring of these subtests may be less subjective than…
Descriptors: Clinical Psychology, Intelligence Tests, Psychologists, Scoring
Peer reviewed
DeShields, Shirley M.; And Others – Educational and Psychological Measurement, 1984
The Standardized Test of Essential Writing Skills (STEWS) has been developed as a valid and reliable means of assessing expository writing skill. Studies concerning the reliability and validity of the STEWS are presented and their implications for use of the STEWS are discussed. (Author/DWH)
Descriptors: Expository Writing, Higher Education, Scoring, Standardized Tests
Peer reviewed
Wilcox, Rand R. – Educational and Psychological Measurement, 1981
This paper describes and compares procedures for estimating the reliability of proficiency tests that are scored with latent structure models. Results suggest that the predictive estimate is the most accurate of the procedures. (Author/BW)
Descriptors: Criterion Referenced Tests, Scoring, Test Reliability
Peer reviewed
Tinsley, Howard E. A.; And Others – Educational and Psychological Measurement, 1981
Two procedures for scoring the Recreation Experience Preference scales were investigated using data obtained from respondents engaged in outdoor recreational activities. Both procedures yielded acceptable levels of reliability and concurrent validity. When time is unimportant, the scale score strategy is preferred over the domain score strategy.…
Descriptors: Methods, Outdoor Activities, Participant Satisfaction, Recreational Activities
Peer reviewed
Burton, Nancy W. – Educational and Psychological Measurement, 1981
This study was concerned with selecting a measure of scorer agreement for use with the National Assessment of Educational Progress. The simple percent of agreement and Cohen's kappa were compared. It was concluded that Cohen's kappa does not add sufficient information to make its calculation worthwhile. (Author/BW)
Descriptors: Educational Assessment, Elementary Secondary Education, Quality Control, Scoring
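The Burton (1981) abstract above compares simple percent agreement with Cohen's kappa as measures of scorer agreement. A minimal sketch of that comparison, using hypothetical scores from two scorers (not NAEP data), shows how kappa discounts the agreement expected by chance:

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Simple proportion of items on which two scorers assign the same score."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    kappa = (p_o - p_e) / (1 - p_e), where p_e is the agreement expected
    if each scorer assigned categories independently at their marginal rates.
    """
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum((c1[c] / n) * (c2[c] / n) for c in set(r1) | set(r2))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical dichotomous scores (0/1) from two scorers on ten responses
rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1]
print(percent_agreement(rater_a, rater_b))  # 0.8
print(cohens_kappa(rater_a, rater_b))       # about 0.52
```

With 80% observed agreement and chance agreement of 0.58 (both scorers mark "1" about 70% of the time), kappa comes out near 0.52, illustrating Burton's question of whether the chance correction adds information worth the extra calculation.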
Peer reviewed
Michael, William B.; And Others – Educational and Psychological Measurement, 1980
Ratings of student performance for two essay questions rendered by professors of English and by professors in other disciplines were compared for reliability and concurrent validity. It was concluded that the reliability and validity of the ratings of the two groups were nearly comparable. (Author/BW)
Descriptors: College Faculty, English Instruction, Essay Tests, Higher Education
Peer reviewed
Veldman, Donald J.; Sheffield, John R. – Educational and Psychological Measurement, 1979
A sociometric nominations instrument called Guess Who was administered to 13,045 elementary school children and then subjected to an image analysis. Four factors were extracted--disruptive, bright, dull, and quiet/well-behaved--and related to teacher ratings, self-reports and other measures. (Author/JKS)
Descriptors: Elementary Education, Factor Structure, Peer Evaluation, Rating Scales
Peer reviewed
Mitchell, Karen; Anderson, Judy – Educational and Psychological Measurement, 1986
This study examined the reliability of holistic scoring for a sample of essays written during the Spring 1985 MCAT administration. Analysis of variance techniques were used to estimate the reliability of scoring and to partition score variance into that due to level differences between papers and to context-specific factors. (Author/LMO)
Descriptors: Analysis of Variance, Essay Tests, Holistic Evaluation, Medical Education
Peer reviewed
Educational and Psychological Measurement, 1986
The issue addressed by this study was whether the Jesness Inventory Classification System as revised for hand-scoring can provide useful information about the personality and behavior of nonoffenders. Data on the System's construct validity showed that youths of the nine subtypes differed on many background, personality, attitude, and behavioral…
Descriptors: Achievement Tests, Attitude Measures, Behavior Rating Scales, Construct Validity