Peer reviewed
ERIC Number: EJ1020841
Record Type: Journal
Publication Date: 2014-Mar
Pages: 22
Abstractor: As Provided
Reference Count: 61
ISBN: N/A
ISSN: 1368-1613
A Survey of Stemming Algorithms in Information Retrieval
Moral, Cristian; de Antonio, Angélica; Imbert, Ricardo; Ramírez, Jaime
Information Research: An International Electronic Journal, v19 n1 Mar 2014
Background: During the last fifty years, improved information retrieval techniques have become necessary because of the huge amount of information people have available, which continues to increase rapidly due to the use of new technologies and the Internet. Stemming is one of the processes that can improve information retrieval in terms of accuracy and performance.

Aim: This paper provides a detailed assessment of the current status of the stemming process framed in an information retrieval application field by tracing its historical evolution.

Method: Papers presenting the first approaches for stemming were reviewed to extract their main features, benefits and drawbacks. Additionally, papers dealing with stemmers for non-English languages or with some more recent proposals were also consulted and compiled. Finally, experimental papers defining the most well-known methods and metrics aimed at evaluating and classifying stemmers were also taken into account to expose their contributions and results.

Results: Even if not all researchers agree on the benefits and drawbacks of using stemming in an information retrieval process in general terms, many of them agree on its benefits in specific contexts, such as when the language is highly inflective, when documents are short or when there is limited space for storing data. Some researchers also state that the nature of the documents can influence the performance and the accuracy of the stemmer.

Conclusions: Despite many researchers having investigated this field over many years, there are still some open questions, such as how to evaluate a stemmer independently of the information retrieval process, or how much a stemmer improves an information retrieval application in terms of speed. As a summary, some guidelines are also provided to help readers to determine which is the best stemmer for their needs and the tasks they have to carry out.
Thomas D. Wilson. 9 Broomfield Road, Broomhill, Sheffield, S10 2SE, UK. Web site: http://informationr.net/ir
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A