ERIC Number: ED553746
Record Type: Non-Journal
Publication Date: 2013
Pages: 178
Abstractor: As Provided
Reference Count: N/A
ISBN: 978-1-3031-0336-0
Query Enhancement with Topic Detection and Disambiguation for Robust Retrieval
Zhang, Hui
ProQuest LLC, Ph.D. Dissertation, Indiana University
With the rapid increase in the amount of available information, people nowadays rely heavily on information retrieval (IR) systems such as web search engines to fulfill their information needs. However, because of a lack of domain knowledge and the limitations of natural language, such as synonyms and polysemes, many system users cannot formulate their needs into effective queries. Consequently, one of the main challenges in modern IR research is "robust retrieval," which demands improving retrieval performance and usability for "weak" user queries. Existing retrieval techniques such as query expansion have limited success with ineffective user queries. According to recent studies, searchers tend to formulate their queries around topics (e.g., phrases, named entities, and nouns), and IR researchers believe that recognizing these embedded topics will generate a significant retrieval improvement. However, current topic extraction techniques often fail at this task because they were developed in the field of natural language processing, where extensive context and a certain grammar are expected. This research implements a new query segmentation approach based on a language model, motivated by the goal of incorporating multiple contexts, such as term co-occurrence and structural Wikipedia knowledge, for ranking. In addition, this research systematically examines the effectiveness of using detected topics and the impact of query disambiguation on IR improvement. The findings uphold the hypotheses that the new query segmentation approach is more effective than traditional methods and that topic-based retrieval is more effective than approaches based on the bag-of-words model. On the other hand, results from this research show that query disambiguation with structural knowledge from Wikipedia does not significantly improve retrieval performance, which highlights the weakness of existing knowledge bases for query disambiguation.
Finally, through further analysis of the experimental results, this research proves that topic-based retrieval performance is associated with factors such as baseline retrieval and query segmentation accuracy. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page:]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A