Web page segmentation revisited: evaluation framework and dataset

J Kiesel, F Kneist, L Meyer, K Komlossy… - Proceedings of the 29th …, 2020 - dl.acm.org
Each web page can be segmented into semantically coherent units that fulfill specific
purposes. Though the task of automatic web page segmentation was introduced two …

Extracting the main content of web pages using the First Impression Area

G Jung, S Han, H Kim, K Kim, J Cha - IEEE Access, 2022 - ieeexplore.ieee.org
Extracting the main content from a web page is essential in various applications such as
web crawlers and browser reader modes. Existing extraction methods using text-based …

Multimodal web page segmentation using self-organized multi-objective clustering

SR Jayashree, G Dias, JJ Andrew, S Saha… - ACM Transactions on …, 2022 - dl.acm.org
Web page segmentation (WPS) aims to break a web page into different segments with
coherent intra- and inter-semantics. By evidencing the morpho-dispositional semantics of a …

WebSAM-Adapter: Adapting Segment Anything Model for Web Page Segmentation

B Ren, Z Qian, Y Sun, C Gao, C Zhang - European Conference on …, 2024 - Springer
With the advancement of internet technology, web page segmentation, which aims to divide
web pages into semantically coherent units, has become increasingly crucial for web-related …

Web page content block identification with extended block properties

K Griazev, S Ramanauskaitė - Applied Sciences, 2023 - mdpi.com
Web page segmentation is one of the most influential factors for the automated integration of
web page content with other systems. Existing solutions are focused on segmentation but do …

Concurrent speech synthesis to improve document first glance for the blind

F Maurel, G Dias, S Ferrari, JJ Andrew… - … on Document Analysis …, 2019 - ieeexplore.ieee.org
Skimming and scanning are two well-known reading processes, which are combined to
access the document content as quickly and efficiently as possible. While both are available …

[PDF] Harnessing Web Archives to Tackle Selected Societal Challenges

J Kiesel - 2022 - downloads.webis.de
Listen Live on iHeartRADIO News Radio 610 WTVN - News, Traffic, Weather - Columbus, OH
On Air News Podcasts Media Connect Contests Flashback: …

Multi-purpose dataset of webpages and its content blocks: design and structure validation

K Griazev, S Ramanauskaitė - Applied Sciences, 2021 - mdpi.com
The need for automated data extraction is continuously growing due to the constant addition
of information to the worldwide web. Researchers are developing new data extraction …

Model-driven web page segmentation for non visual access

JJ Andrew, S Ferrari, F Maurel, G Dias… - … Conference of the Pacific …, 2020 - Springer
Web page segmentation aims to break a large page into smaller blocks, in which contents
with coherent semantics are kept together. Within this context, a great deal of approaches …

[PDF] DOM-Based Clustering Approach for Web Page Segmentation: A Comparative Study.

A Sterca, O Nourescu, A Guran, C Serban - WEBIST, 2023 - scitepress.org
Web page segmentation plays a crucial role in analyzing and understanding the content of
web pages, enabling various web-related tasks. The approaches based on computer vision …