ERIC Number: ED413755
Record Type: Non-Journal
Publication Date: 1995
Reference Count: N/A
On-Line Access to Linguistically Annotated Text Corpora of Dutch via Internet.
Kruyt, J. G.; Raaijmakers, S. A.; van der Kamp, P. H. J.; van Strien, R. J.
Corpora of present-day Dutch developed by the Institute for Dutch Lexicology include two linguistically annotated corpora that can be accessed via the Internet: a 5-million-word corpus covering a variety of topics and text types, and a 27-million-word newspaper corpus. The texts of both were acquired in machine-readable form, then lemmatized, tagged, and loaded onto an online retrieval system. Queries may address the entire corpus or a user-defined subcorpus. The present user interface appears complex, particularly for inexperienced users, because of its high degree of formalism, but efforts are being made to reduce it. A prototype natural language interpreter is under development. Copyright restrictions limit the transfer of information to the user's electronic mail. Access to the corpora is free for non-commercial research purposes with a signed personal user agreement. (MSE)
Publication Type: Reports - Descriptive; Speeches/Meeting Papers
Education Level: N/A
Authoring Institution: N/A
Identifiers - Location: Netherlands