Authors
Frans Coetzee, Andries Kruger, C Lee Giles, Steve Lawrence, Gary Flake, Christian W Omlin
Description
We present a methodology for rapid implementation of specialized search engines. To catalog data, these search engines interpret and classify the content of web material to identify different representations of common domain-related elements. While designers can typically develop multiple partial solutions for interpreting the data, acceptable relevance determination requires the appropriate integration of all of these solutions. We present a method for automatically integrating such partial solutions in a Bayesian framework. The Bayesian framework produces a search engine where each user can control the false alarm rate in an intuitive yet rigorous fashion. We discuss the use of this technique in the construction of DEADLINER, a search engine that catalogs conference and seminar material found on the web.