ERIC Number: ED343133
Record Type: Non-Journal
Publication Date: 1991
Pages: 18
Abstractor: N/A
Reference Count: N/A
Swedish TEFL Meets Reality.
Ljung, Magnus
The work reported in this paper is an offshoot of a project started in early 1986 and designed to compare the vocabulary in TEFL (Teaching English as a Foreign Language) texts with the vocabulary found in contemporary non-technical English writing. This study investigated the words unique to each of two large corpora and the differences in frequency between the words they share: the GYM corpus (from "gymnasium," the Swedish term for upper secondary education) and the COBUILD corpus, a large collection of machine-readable English texts compiled at the University of Birmingham (England). The GYM corpus was created by converting the nearly 1.5 million words in 56 English TEFL texts used in Swedish upper secondary education into computer-readable form. The COBUILD corpus contains 18 million words used in the development of the COBUILD dictionary. Comparison of the 1000 most frequently used words in the two corpora indicated that they have 796 words in common. Comparison of the unique words indicates that: (1) words unique to the GYM corpus are concrete terms denoting physical objects and processes, physical characteristics, and emotions; and (2) words found only in the COBUILD corpus are predominantly abstract. Comparison of the words found in both corpora showed more contractions and more third person pronouns in the GYM corpus than in the COBUILD corpus, which points to a preponderance of narrative text in the GYM corpus. Findings suggest that it is in fact possible to obtain a fair amount of information about texts merely by looking at word lists. (Seven tables of data are included; 13 references are attached.) (RS)
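The word-list comparison described in the abstract can be illustrated with a minimal sketch. It is not the study's procedure: the abstract does not say how the texts were tokenized, so the regular expression below is an assumption, the function names are hypothetical, and the two sample strings are invented toy text rather than material from the GYM or COBUILD corpora.

```python
from collections import Counter
import re


def top_words(text, n):
    """Return the set of the n most frequent word forms in text.

    Tokenization (lowercase runs of letters and apostrophes) is an
    assumption made for this sketch, not the paper's method.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return {word for word, _ in Counter(tokens).most_common(n)}


def compare_top_lists(text_a, text_b, n=1000):
    """Split the two top-n word lists into shared and unique words."""
    top_a = top_words(text_a, n)
    top_b = top_words(text_b, n)
    return {
        "shared": top_a & top_b,   # words on both lists
        "only_a": top_a - top_b,   # words only on the first list
        "only_b": top_b - top_a,   # words only on the second list
    }


# Invented toy strings standing in for the two corpora.
gym_sample = "he said he didn't see the dog and the dog ran"
cobuild_sample = "the policy and the economy said analysts"

result = compare_top_lists(gym_sample, cobuild_sample)
print(sorted(result["shared"]))
print(sorted(result["only_a"]))
print(sorted(result["only_b"]))
```

On real corpora the same set operations would be applied to the two 1000-word lists; the size of the shared set corresponds to the figure of 796 words in common reported above.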
Publication Type: Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: Sweden