Scalable, generic, and adaptive systems for focused crawling

M Kumar, R Bhatia, D Rattan - Wiley Interdisciplinary Reviews …, 2017 - Wiley Online Library

Performance of any search engine relies heavily on its Web crawler. Web crawlers are the
programs that get webpages from the Web by following hyperlinks. These webpages are …

Cited by: 113 · Related articles · All 2 versions

[PDF] hal.science

Regularized cost-model oblivious database tuning with reinforcement learning

D Basu, Q Lin, W Chen, HT Vo, Z Yuan… - Transactions on Large …, 2016 - Springer

In this paper, we propose a learning approach to adaptive performance tuning of database
applications. The objective is to validate the opportunity to devise a tuning strategy that does …

Cited by: 32 · Related articles · All 9 versions

[PDF] hal.science

Focused crawling through reinforcement learning

M Han, PH Wuillemin, P Senellart - … , ICWE 2018, Cáceres, Spain, June 5 …, 2018 - Springer

Focused crawling aims at collecting as many Web pages relevant to a target topic as
possible while avoiding irrelevant pages, reflecting limited resources available to a Web …

Cited by: 23 · Related articles · All 7 versions

[PDF] arxiv.org

Selective harvesting over networks

F Murai, D Rennó, B Ribeiro, GL Pappa… - Data Mining and …, 2018 - Springer

Active search on graphs focuses on collecting certain labeled nodes (targets) given global
knowledge of the network topology and its edge weights (encoding pairwise similarities) …

Cited by: 21 · Related articles · All 10 versions

[PDF] arxiv.org

Tree-based focused web crawling with reinforcement learning

A Kontogiannis, D Kelesis, V Pollatos… - arXiv preprint arXiv …, 2021 - arxiv.org

A focused crawler aims at discovering as many web pages relevant to a target topic as
possible, while avoiding irrelevant ones. Reinforcement Learning (RL) has been utilized to …

Cited by: 7 · Related articles · All 4 versions

[PDF] hal.science

Reinforcement learning approaches in dynamic environments

M Han - 2018 - inria.hal.science

Reinforcement learning is learning from interaction with an environment to achieve a goal. It
is an efficient framework to solve sequential decision-making problems, using Markov …

Cited by: 9 · Related articles · All 3 versions

[PDF] hal.science

A Frequent Named Entities Based Approach for Interpreting Reputation in Twitter

NB Seghouani, F Bugiotti… - Data Science and …, 2018 - inria.hal.science

Twitter is a social network that provides a powerful source of data. The analysis of those data
offers many challenges among those stands out the opportunity to find reputation of a …

Cited by: 10 · Related articles · All 13 versions

[PDF] arxiv.org

Smart crawling: a new approach toward focus crawling from Twitter

A Khazaie, NB Seghouani, F Bugiotti - arXiv preprint arXiv:2110.06022, 2021 - arxiv.org

Twitter is a social network that offers a rich and interesting source of information challenging
to retrieve and analyze. Twitter data can be accessed using a REST API. The available …

Cited by: 2 · Related articles · All 9 versions

[PDF] mdpi.com

ARCOMEM crawling architecture

V Plachouras, F Carpentier, M Faheem, J Masanès… - Future internet, 2014 - mdpi.com

The World Wide Web is the largest information repository available today. However, this
information is very volatile and Web archiving is essential to preserve it for the future …

Cited by: 9 · Related articles · All 18 versions

[PDF] hal.science

Interpreting reputation through frequent named entities in twitter

N Bennacer, F Bugiotti, M Hewasinghage, S Isaj… - … Engineering–WISE 2017 …, 2017 - Springer

Twitter is a social network that provides a powerful source of data. The analysis of those data
offers many challenges among those stands out the opportunity to find the reputation of a …

Cited by: 4 · Related articles · All 8 versions
