ERIC Number: EJ613445
Record Type: Journal
Publication Date: 2000
Reference Count: N/A
A Comparison of Techniques To Find Mirrored Hosts on the WWW.
Bharat, Krishna; Broder, Andrei; Dean, Jeffrey; Henzinger, Monika R.
Journal of the American Society for Information Science, v51 n12 p1114-1122 Oct 2000
Compares several "top-down" algorithms for identifying mirrored hosts on the Web. The algorithms operate on URL strings and linkage data, the kind of information about Web pages that is readily available from Web proxies and crawlers. Results reveal that the best approach is a combination of five algorithms; on test data, this approach achieved a precision of 0.57 at a recall of 0.86 when considering 100,000 results. (Contains 19 references.) (Author/AEF)
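Note: The precision and recall figures in the abstract are measured over a ranked list of results cut off at a fixed size. The following is a minimal sketch of how such figures might be computed, not the authors' evaluation code. The function name `precision_recall_at_k`, the representation of a result as a pair of host names, and the toy data are all assumptions made for illustration; the abstract does not say how the ground truth was established.

```python
def precision_recall_at_k(ranked_pairs, true_mirrors, k):
    """Return (precision, recall) over the top-k ranked candidate host pairs.

    ranked_pairs: list of (host_a, host_b) tuples, best candidates first
                  (assumed to contain no duplicate pairs).
    true_mirrors: set of frozensets, each holding two hosts known to be mirrors.
    """
    top_k = ranked_pairs[:k]
    # Compare pairs as unordered sets so (a, b) and (b, a) count as the same pair.
    hits = sum(1 for pair in top_k if frozenset(pair) in true_mirrors)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(true_mirrors) if true_mirrors else 0.0
    return precision, recall


# Hypothetical toy data: four ranked candidates, three known mirror pairs.
ranked = [
    ("a.example", "b.example"),
    ("c.example", "d.example"),
    ("a.example", "e.example"),
    ("f.example", "g.example"),
]
truth = {
    frozenset(("b.example", "a.example")),
    frozenset(("f.example", "g.example")),
    frozenset(("h.example", "i.example")),
}

precision, recall = precision_recall_at_k(ranked, truth, k=4)
```

In this toy case two of the four returned pairs are true mirrors, and two of the three known mirror pairs are found. Raising `k` generally trades precision for recall, which is why the abstract quotes both figures at one cutoff.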
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Authoring Institution: N/A