ERIC Number: ED033728
Record Type: RIE
Publication Date: 1966-Apr
Reference Count: N/A
Research on Automatic Classification, Indexing and Extracting. Annual Progress Report.
Baker, F.T.; And Others
To contribute to the success of several studies of automatic classification, indexing and extracting currently in progress, and to further the theoretical and practical understanding of textual item distributions, the development of a frequency program capable of supplying this information was undertaken. The program, planned for the IBM System/360 (a compatible set of central processors and auxiliary units), will provide numerous user options covering the format of the input text, the definition of a countable item (e.g., a word), the definition of a textual unit over which frequencies are to be subtotaled (e.g., sentence, paragraph, or document), the types of output data, and the machine configuration to be used. Progress was also made on the design of the Dictionary Build module of the frequency program. The main purpose of the program is to provide an output containing an ordered list of the items, their frequencies, and any special tags desired by the user. For the processing of large input texts, efficient utilization of storage devices through rapid dictionary search and storage techniques was considered essential. The Dictionary Build module has therefore received special attention. This report contains descriptions of the requirements generated for the System/360 Frequency Program, a status report on program design, and documentation of dictionary construction methods. (Author/RM)
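The user options named in the abstract (a definition of a countable item, a textual unit over which frequencies are subtotaled, and an ordered output list with optional tags) can be illustrated with a minimal modern sketch. This is not the System/360 program described in the report. The function names, the regular expressions used to define an item and a unit, the lowercasing of items, and the alphabetical ordering are all assumptions made for illustration.

```python
import re
from collections import Counter


def count_items(text, item_pattern=r"[A-Za-z']+", unit_separator=r"\n\s*\n"):
    """Count items overall and per textual unit.

    item_pattern   -- hypothetical definition of a countable item (here, a word)
    unit_separator -- hypothetical definition of a unit boundary (here, a blank
                      line, so each paragraph is one unit)
    """
    units = [u for u in re.split(unit_separator, text) if u.strip()]
    subtotals = []
    for unit in units:
        # Lowercasing is an assumption; the report does not say how case is handled.
        items = [match.lower() for match in re.findall(item_pattern, unit)]
        subtotals.append(Counter(items))
    totals = Counter()
    for unit_counts in subtotals:
        totals.update(unit_counts)
    return totals, subtotals


def ordered_listing(totals, tags=None):
    """Return (item, frequency, tag) rows in alphabetical order (assumed ordering)."""
    tags = tags or {}
    return [(item, totals[item], tags.get(item, "")) for item in sorted(totals)]


sample = "The cat sat.\n\nThe dog sat. The end."
totals, subtotals = count_items(sample)
for row in ordered_listing(totals, tags={"the": "function-word"}):
    print(row)
```

Passing a different `item_pattern` or `unit_separator` changes what is counted and where subtotals are taken, which mirrors in a small way the kind of option the abstract describes.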
Descriptors: Automation, Classification, Computer Programs, Indexing, Information Processing, Program Design
Clearinghouse for Federal Scientific and Technical Information, Springfield, Va. 22151 (AD 485 188, MF-$0.65, HC-$3.00)
Publication Type: N/A
Education Level: N/A
Sponsor: Office of Naval Research, Washington, DC. Information Systems Research.
Authoring Institution: International Business Machines Corp., Gaithersburg, MD. Federal Systems Div.
Note: A previous report is ED 017 286.