The importance of prior probabilities for entry page search

W Kraaij, T Westerveld, D Hiemstra
Proceedings of the 25th annual international ACM SIGIR conference on …, 2002 - dl.acm.org
An important class of web searches aims to find the entry page (homepage) of an organisation. Entry page search differs markedly from ad hoc search, and a plain ad hoc system performs disappointingly on it. We explored three non-content features of web pages: page length, number of incoming links, and URL form. URL form in particular proved to be a good predictor. Using URL form priors, we found over 70% of all entry pages at rank 1, and up to 89% in the top 10. Non-content features can easily be embedded in a language model framework as a prior probability.
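
To make the last point concrete, here is a minimal sketch of how a non-content prior could be combined with a language model score, ranking documents by log P(D) + log P(Q|D). It is an illustration under stated assumptions, not the authors' implementation: the URL form categories, the prior values in `URL_PRIOR`, the `url_form` heuristic, and the linear-interpolation smoothing with weight `lam` are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Hypothetical prior probabilities per URL form (illustrative values only,
# not taken from the paper).
URL_PRIOR = {"root": 0.6, "subroot": 0.25, "path": 0.1, "file": 0.05}


def url_form(url):
    """Crude URL form classifier (an assumption made for this sketch)."""
    parts = url.split("://", 1)[-1].split("/", 1)
    rest = parts[1] if len(parts) > 1 else ""
    if rest in ("", "index.html"):
        return "root"          # domain name only
    if rest.endswith("/"):
        # one directory below the domain, or deeper
        return "subroot" if rest.count("/") == 1 else "path"
    return "file"              # anything ending in a file name


def query_log_likelihood(query, doc, collection, lam=0.5):
    """log P(Q|D) under a unigram model, linearly interpolated
    with the collection model (smoothing weight lam is assumed)."""
    doc_tf, coll_tf = Counter(doc), Counter(collection)
    total = 0.0
    for term in query:
        p_doc = doc_tf[term] / len(doc) if doc else 0.0
        p_coll = coll_tf[term] / len(collection)
        p = lam * p_doc + (1 - lam) * p_coll
        if p == 0.0:
            return float("-inf")  # term unseen in the whole collection
        total += math.log(p)
    return total


def score(query, url, doc, collection):
    """log P(D) + log P(Q|D): the prior enters as an additive log term."""
    prior = URL_PRIOR[url_form(url)]
    return math.log(prior) + query_log_likelihood(query, doc, collection)


# Two pages with identical text but different URL forms.
text = "welcome to the example organisation home".split()
docs = {
    "http://example.org/": text,
    "http://example.org/news/2002/item.html": text,
}
collection = [t for d in docs.values() for t in d]
query = ["example", "organisation"]
ranking = sorted(docs, key=lambda u: score(query, u, docs[u], collection),
                 reverse=True)
```

Because both pages have the same content, their query likelihoods are equal and the ranking is decided entirely by the prior, which places the root URL first in this toy setup.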