Peer reviewed
ERIC Number: EJ1154998
Record Type: Journal
Publication Date: 2017-Oct
Pages: 17
Abstractor: As Provided
ISSN: 0265-5322
Grounding Lexical Diversity in Human Judgments
Jarvis, Scott
Language Testing, v34 n4 p537-553 Oct 2017
The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct validity. The proposed solution draws from Zipf's (1935) emic perspective of language, which views LD as a matter of perception, but which also assumes that competent speakers of a common language share similar perceptions. The present study tests whether this is true and specifically whether untrained human raters will show high levels of inter-rater reliability in their judgments of the levels of LD found in 60 texts extracted from a corpus of narratives written in English by a mix of language learners and native speakers. The results confirm Zipf's assertion, but also indicate that a relatively large number of motivated raters are needed to demonstrate this tendency. The remainder of the study discusses the implications these results have for the development of an automated measure of LD to be used with learner corpora. The proposed method begins with human judgments of a representative subsample of a corpus, proceeds to a statistical model of objective measures that accurately predicts the human judgments, and ends with a multidimensional, corpus-specific automated measure that outputs reliable estimates of how a reliable group of human judges would rate the levels of LD in the texts of that corpus.
SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665.
Publication Type: Journal Articles; Reports - Research
Education Level: Higher Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: Ohio
Grant or Contract Numbers: N/A