ERIC Number: EJ1061317
Record Type: Journal
Publication Date: 2015-Feb
Abstractor: As Provided
Reference Count: N/A
Is There a Core General Vocabulary? Introducing the "New General Service List"
Brezina, Vaclav; Gablasova, Dana
Applied Linguistics, v36 n1 p1-22 Feb 2015
The current study presents a "New General Service List (new-GSL)", the result of a robust comparison of four language corpora ("LOB," "BNC," "BE06," and "EnTenTen12") with a combined size of over 12 billion running words. The four corpora were selected to represent a variety of corpus sizes and approaches to representativeness and sampling. In particular, the study investigates the lexical overlap among the corpora in the top 3,000 words ranked by "average reduced frequency (ARF)", a measure that takes into account both the frequency and the dispersion of lexical items. The results show that a stable vocabulary core of 2,122 items (70.7% of the top 3,000) is shared by the four corpora. Moreover, these vocabulary items occur with comparable ranks in the individual wordlists. In producing the "new-GSL", the core vocabulary items were combined with new items that occur frequently in the corpora representing current language use ("BE06" and "EnTenTen12"). The final product of the study, the "new-GSL", consists of 2,494 lemmas and covers between 80.1 and 81.7 per cent of the text in the source corpora.
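The abstract does not spell out how ARF is computed or how the overlap was measured, so the following is only a minimal Python sketch under stated assumptions. It uses the commonly cited definition of ARF: split the corpus into f segments of length v = N / f, where f is the word's raw frequency and N the corpus size, then sum min(d, v) over the cyclic distances d between consecutive occurrences and divide by v. The function names `average_reduced_frequency` and `shared_core` are hypothetical, the overlap is modelled as a plain set intersection of the top-ranked items, and the toy inputs are invented for illustration; none of this is taken from the study itself.

```python
def average_reduced_frequency(positions, corpus_size):
    """ARF of one word, given the token positions where it occurs.

    Assumed definition (not stated in the abstract):
    ARF = (1 / v) * sum(min(d_i, v)), with v = corpus_size / frequency
    and d_i the distances between consecutive occurrences, where the
    last occurrence wraps around to the first.
    """
    f = len(positions)
    if f == 0:
        return 0.0
    v = corpus_size / f
    pos = sorted(positions)
    # Distances between neighbouring occurrences.
    distances = [pos[i] - pos[i - 1] for i in range(1, f)]
    # Cyclic distance from the last occurrence back to the first.
    distances.append(pos[0] + corpus_size - pos[-1])
    return sum(min(d, v) for d in distances) / v


def shared_core(ranked_lists, top_n):
    """Items that appear in the top_n of every ranked wordlist."""
    tops = [set(words[:top_n]) for words in ranked_lists]
    return set.intersection(*tops)


# Toy data: the same raw frequency (4) in a 100-token corpus.
even = average_reduced_frequency([0, 25, 50, 75], 100)   # evenly spread
clumped = average_reduced_frequency([0, 1, 2, 3], 100)   # one cluster

# Toy wordlists, already sorted by rank.
core = shared_core([["the", "of", "shall"], ["the", "of", "web"]], top_n=3)

# Share of the top 3,000 reported in the abstract: 2,122 / 3,000.
core_share = round(2122 / 3000 * 100, 1)
```

With this definition, an evenly dispersed word keeps an ARF equal to its raw frequency, while a word concentrated in one stretch of text is pushed towards 1, which is why ranking by ARF favours words that are both frequent and widely spread.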
Descriptors: Comparative Analysis, Computational Linguistics, Vocabulary, Language Usage, Word Frequency, Word Lists, Language Research
Oxford University Press. Great Clarendon Street, Oxford, OX2 6DP, UK. Tel: +44-1865-353907; Fax: +44-1865-353485; e-mail: firstname.lastname@example.org; Web site: http://applij.oxfordjournals.org/
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Authoring Institution: N/A