ERIC Number: ED424729
Record Type: Non-Journal
Publication Date: 1998-Sep-21
Reference Count: N/A
Combining the Bourne-Shell, sed and awk in the UNIX Environment for Language Analysis.
Schmitt, Lothar M.; Christianson, Kiel T.
This document describes how to construct tools for language analysis in research and teaching using three tools of the UNIX operating system: the Bourne-shell, sed, and awk. Applications include: searches for words, phrases, grammatical patterns, and phonemic patterns in text; statistical analysis of text with regard to such searches; transformation of phonetic, phonemic, or typographic transcriptions; comparison of texts in various respects; lexical-etymological analysis; concordance; assistance in translating text; assistance in learning languages; assistance in teaching languages; and text processing and formatting. The latter includes generation of on-line dictionaries for the Internet from files that were generated with what-you-see-is-what-you-get editors representing only the linear structure of the dictionary. All of the above can be achieved with particularly simple and short code. In that regard, it is shown how sed and awk can be combined in the pipe mechanism of UNIX to create very powerful processing devices. Notes include a short introduction to programming the Bourne-shell and brief but complete descriptions of sed and awk customized for language analysis. Contains 51 references. (Author/MSE)
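As an illustration of the kind of sed and awk pipeline the abstract refers to, the following is a minimal sketch of a word-frequency count. It is not taken from the document itself: the function name word_freq is hypothetical, and the choice to treat every non-letter as a word separator is an assumption made for brevity (it only handles unaccented ASCII letters).

```shell
#!/bin/sh
# Hypothetical helper (name and behavior are assumptions, not the authors' code).
# Reads text on standard input and prints one "count word" line per distinct word.
word_freq() {
    # sed: fold upper case to lower case, then turn every non-letter into a blank
    sed -e 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' \
        -e 's/[^a-z]/ /g' |
    # awk: count each whitespace-separated word, print the totals at end of input
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) print count[w], w }' |
    # sort: highest count first, ties broken alphabetically by word
    sort -k1,1nr -k2,2
}
```

Assuming the function has been defined in the current shell, it could be used as `word_freq < corpus.txt`, where corpus.txt stands for any plain-text file. Each stage does one small job, and the pipe joins them into a single filter.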
Publication Type: Guides - Non-Classroom
Education Level: N/A
Authoring Institution: N/A