ERIC Number: ED096840
Record Type: Non-Journal
Publication Date: 1974-May
Pages: 13
Abstractor: N/A
Reference Count: N/A
A Standard Sample of Present-Day Chinese for Use with Digital Computers. Final Report.
Wrenn, James J.
The final report on a project to develop a standard corpus of present-day Mandarin Chinese is presented. This corpus consists of words of running text of Chinese prose printed in the Republic of China during the calendar year 1968. The corpus, although originally planned to have a total of 500 samples of 2000 words each, has only 294 samples. Each sample starts at the beginning of a sentence, but not necessarily at the beginning of a paragraph or larger division. The samples represent a variety of styles of modern prose, selected for their representative quality rather than their literary merit. The collection consists primarily of samples from books and some major periodicals available through the library at the National Taiwan University and the National Central Library. For each sample collected, a copy was made and then transcribed into a modified Pin-yin romanization. For each sample, counts were taken of the following: names, formulae, figures, foreign strings, foreign words, words (in total), and syllables. After the samples were collected and romanized, they were codified. A manual accompanies the corpus, which comprises one magnetic tape of about 1,200 feet, available in either 7-track or 9-track mode. (Author/LG)
Manual and Corpus available from the Department of Linguistics, Brown University, Providence, Rhode Island 02912 ($50.00 if tapes are furnished, $75.00 otherwise)
Publication Type: Reports - Research
Education Level: N/A
Audience: N/A
Language: N/A
Sponsor: Office of Education (DHEW), Washington, DC.
Authoring Institution: Brown Univ., Providence, RI.
Identifiers - Laws, Policies, & Programs: National Defense Education Act Title VI