A novel web scraping approach using the additional information obtained from web pages

E Uzun - IEEE Access, 2020 - ieeexplore.ieee.org
Web scraping is a process of extracting valuable and interesting text information from web
pages. Most current studies targeting this task are about automated web data …

A comprehensive survey on web content extraction algorithms and techniques

SM Al-Ghuribi, S Alshomrani - 2013 International Conference …, 2013 - ieeexplore.ieee.org
Web Content Extraction is an important problem that has been studied through different
approaches and algorithms. It is interested in extracting meaningful and useful data from the …

Deriving custom post types from digital mockups

A Murolo, MC Norrie - Engineering the Web in the Big Data Era: 15th …, 2015 - Springer
Interface-driven approaches to web development often migrate digital mockups defining the
presentation, structure and client-side functionality of a website to platforms such as …

VEDD-a visual wrapper for extraction of data using DOM tree

AK Tripathy, N Joshi, S Thomas… - 2012 International …, 2012 - ieeexplore.ieee.org
The World Wide Web plays an important role while searching for information in the data
network. Users are constantly exposed to an ever-growing flood of information. A wrapper is …

Extraire des données textuelles pour l'analyse du discours: le Détricoteur

R Dalodiere, M Jordan - Corpus, 2025 - journals.openedition.org
Many tools exist today for extracting textual content from the
Internet. Many of these were designed at the initiative of researchers working in …

Data extraction from online social networks using application programming interface in a multi agent system approach

R Abdulrahman, D Neagu, DRW Holton… - … Collective Intelligence XI, 2013 - Springer
In recent years, Online Social Networks (OSNs) have attracted a significant
increased number of users. New methods for extracting data are required to deal with the …

Cross domain assessment of document to html conversion tools to quantify text and structural loss during document analysis

K Goslin, M Hofmann - 2013 European Intelligence and …, 2013 - ieeexplore.ieee.org
During forensic text analysis, the automation of the process is key when working with large
quantities of documents. As documents often come in a wide variety of different file types …

Revisiting web data extraction using in-browser structural analysis and visual cues in modern web designs

A Murolo, MC Norrie - … : 16th International Conference, ICWE 2016, Lugano …, 2016 - Springer
Recent trends in website design have an impact on methods used for web data extraction.
Many existing methods rely on structural analysis of web pages and, with the introduction of …

Bi-languages mining algorithm for extraction useful web contents (BiLEx)

SM Al-Ghuribi, S Alshomrani - Arabian Journal for Science and …, 2015 - Springer
Extracting useful Web content is a major step in data mining. The Web content extraction
process is very important for many technologies or uses as a preprocessing of many …

Improved depth first algorithm and its application in information retrieval

J Xiong, Z Li, L Fan, D Liu - 2010 IEEE Fifth International …, 2010 - ieeexplore.ieee.org
With the rapid growth of the network information, how to search the relative information in the
web becomes a new challenge. The algorithm of search engine is the key to search on web …