ERIC Number: ED027023
Record Type: RIE
Publication Date: 1966-Jul
Reference Count: N/A
Breaking the Cost Barrier in Automatic Classification.
Doyle, L. B.
A low-cost automatic classification method is reported that uses computer time in proportion to N log N, where N is the number of information items and the base of the logarithm is a parameter. Some barriers besides cost are treated briefly in the opening section, including types of intellectual resistance to the idea of doing classification by content-word similarity. The second section explains the basic processes of document grouping by similarity and discusses the advantages of the reported method over methods commonly experimented with. The operation of an iterative procedure using word profiles to progressively improve the grouping of content-word lists is described. Then some possible applications aside from document classification are enumerated. The final section begins by presenting theoretical underpinnings that explain the form taken by the components of the method. An account of the struggle to make the method work is sketched, followed by a cycle-by-cycle description of a feasibility demonstration. The conclusion states that mere cheapness is not enough and analyzes what researchers and developers might have to do before user acceptance of automatic classification can be assured. (Author)
Descriptors: Automation, Classification, Computers, Content Analysis, Costs, Indexing, Statistical Analysis
Clearinghouse for Federal Scientific and Technical Information, Springfield, Va. 22151 (AD-636 837, MF-$0.65, HC-$3.00).
Publication Type: N/A
Education Level: N/A
Sponsor: Rome Air Development Center, Griffiss AFB, NY.
Authoring Institution: System Development Corp., Santa Monica, CA.