Web spam classification: a few features worth more

M Erdélyi, A Garzó, AA Benczúr - Proceedings of the 2011 Joint WICOW …, 2011 - dl.acm.org
In this paper we investigate how much various classes of Web spam features, some
requiring very high computational effort, add to the classification accuracy. We realize that …

Temporal Analysis for Web Spam Detection: An Overview

M Erdélyi, AA Benczúr - TWAW, 2011 - Citeseer
Miklós Erdélyi (1, 2), András A. Benczúr (1). (1) Institute for Computer …

Link based small sample learning for web spam detection

GG Geng, Q Li, X Zhang - … of the 18th international conference on World …, 2009 - dl.acm.org
Robust statistical learning based web spam detection system often requires large amounts
of labeled training data. However, labeled samples are more difficult, expensive and time …

Evaluating web content quality via multi-scale features

GG Geng, XB Jin, XC Zhang, DX Zhang - arXiv preprint arXiv:1304.6181, 2013 - arxiv.org
Web content quality measurement is crucial to various web content processing applications.
This paper will explore multi-scale features which may affect the quality of a host, and …

Content-based trust and bias classification via biclustering

D Siklósi, B Daróczy, AA Benczúr - Proceedings of the 2nd Joint WICOW …, 2012 - dl.acm.org
In this paper we improve trust, bias and factuality classification over Web data on the domain
level. Unlike the majority of literature in this area that aims at extracting opinion and handling …

Co-training based semi-supervised Web spam detection

W Wang, XD Lee, AL Hu… - 2013 10th International …, 2013 - ieeexplore.ieee.org
Traditional Web spam classifiers use only labeled data (feature/label pairs) to train. Labeled
spam instances, however, are very difficult, expensive, or time consuming to obtain, as they …

Web spam challenge proposal for filtering in archives

AA Benczúr, M Erdélyi, J Masanes… - Proceedings of the 5th …, 2009 - dl.acm.org
In this paper we propose new tasks for a possible future Web Spam Challenge motivated by
the needs of the archival community. The Web archival community consists of several …

The classification power of Web features

M Erdélyi, AA Benczúr, B Daróczy, A Garzó… - Internet …, 2014 - Taylor & Francis
In this article we give a comprehensive overview of features devised for web spam detection
and investigate how much various classes, some requiring very high computational effort …

Web Spam: a Survey with Vision for the Archivist

AABDS Jácint, SIBZ Fekete, M Kurucz, A Pereszlényi… - researchgate.net
While Web archive quality is endangered by Web spam, a side effect of the high commercial
value of top-ranked search-engine results, so far Web spam filtering technologies are rarely …

Spectrum of complex networks

D Montealegre, V Vu - arXiv preprint arXiv:1809.05469, 2018 - arxiv.org
The study of complex networks has been one of the most active fields in science in recent
decades. Spectral properties of networks (or graphs that represent them) are of fundamental …