ERIC Number: ED546044
Record Type: Non-Journal
Publication Date: 2012
Pages: 176
Abstractor: As Provided
Reference Count: N/A
ISBN: 978-1-2675-8752-7
The Challenge of Constructing a Reliable Word List: An Exploratory Corpus-Based Analysis of Introductory Psychology Textbooks
Miller, Donald Patrick
ProQuest LLC, Ph.D. Dissertation, Northern Arizona University
Acknowledging the important role of reading in the university curriculum, and the important role that vocabulary plays in successful reading comprehension, researchers have directed a great deal of effort toward identifying ways that teachers and learners of English for Academic Purposes (EAP) can maximize vocabulary development efforts. Due to the exceptionally large stock of vocabulary in active use in English, these efforts include several important areas of inquiry. Two notable areas pose the following questions: (a) How many words do learners need in order to accomplish target language use tasks? and (b) What words will learners most likely and most frequently encounter in the course of interacting in their target language use domain? The answer to the first question reveals the nature of the vocabulary learning challenge and can help educators set curricular goals. Answers to the second question can help identify the words that merit instructional and learning focus. The broad goal of this dissertation is to highlight the methodological challenges inherent in identifying the scope of the lexical challenge and in reliably capturing meaningful sets of words for instructional focus. This goal is accomplished through a corpus-based analysis of lexical distributions in one narrow academic domain: introductory psychology textbooks. A 3.1 million-word corpus of 10 complete introductory psychology textbooks was compiled with the goal of representing lexical distributions in my target domain: required readings in introductory psychology coursework at the undergraduate level. The corpus was then analyzed to determine the extent to which it captures the lexical diversity that students might encounter in introductory psychology textbooks.
Additionally, samples of different sizes from the corpus were analyzed to determine the extent to which they could capture a reliable set of "important" words--words that students will most likely and most frequently encounter while reading these textbooks. Findings from this study suggest that, although the corpus is comparatively large and thus presumably representative of the lexical variability in introductory psychology textbooks, it does not in fact completely represent the lexical variability of the target domain. There is likely lexical diversity in this domain that is not captured by this corpus. Additionally, no sample from the corpus, regardless of size, was able to represent the lexical distribution of the whole corpus, as demonstrated by the samples' inability to reliably capture the "important" words identified in the whole corpus. Implications of these findings are discussed in relation to previous corpus-based vocabulary research. Specifically, the findings raise questions regarding the representativeness of previously designed academic corpora and the reliability of previously proposed word lists based on these academic corpora. Important implications for corpus-based vocabulary research as well as for academic vocabulary instruction and learning are also discussed. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page:]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A