Natural language processing

KR Chowdhary - Fundamentals of artificial intelligence, 2020 - Springer
The abundant volume of natural language text in the connected world holds a
large content of knowledge, but it is becoming increasingly difficult to disseminate it by a …

Web data extraction, applications and techniques: A survey

E Ferrara, P De Meo, G Fiumara… - Knowledge-based …, 2014 - Elsevier
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many approaches to extracting …

Information extraction

S Sarawagi - Foundations and Trends® in Databases, 2008 - nowpublishers.com
The automatic extraction of information from unstructured sources has opened up new
avenues for querying, organizing, and analyzing data by drawing upon the clean semantics …

Private data discovery for privacy compliance in collaborative environments

L Korba, Y Wang, L Geng, R Song, G Yee… - … , and Engineering: 5th …, 2008 - Springer
With the growing use of computers and the Internet, it has become difficult for organizations
to locate and effectively manage sensitive personally identifiable information (PII). This …

Discovering interacting artifacts from ERP systems

X Lu, M Nagelkerke, D Van De Wiel… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
Enterprise Resource Planning (ERP) systems are widely used to manage business
documents along business processes and allow very detailed recording of event data of …

UIMA Ruta: Rapid development of rule-based information extraction applications

P Kluegl, M Toepfer, PD Beck, G Fette… - Natural Language …, 2016 - cambridge.org
Rule-based information extraction is an important approach for processing the increasingly
available amount of unstructured data. The manual creation of rule-based applications is a …

A survey on region extractors from web documents

HA Sleiman, R Corchuelo - IEEE Transactions on Knowledge …, 2012 - ieeexplore.ieee.org
Extracting information from web documents has become a research area in which new
proposals sprout out year after year. This has motivated several researchers to work on …

WIERT: web information extraction via render tree

Z Li, B Shao, L Shou, M Gong, G Li… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Web information extraction (WIE) is a fundamental problem in web document understanding,
with a significant impact on various applications. Visual information plays a crucial role in …

Ontology-based information extraction and integration from heterogeneous data sources

P Buitelaar, P Cimiano, A Frank, M Hartung… - International Journal of …, 2008 - Elsevier
In this paper we present the design, implementation and evaluation of SOBA, a system for
ontology-based information extraction from heterogeneous data resources, including plain …

Enabling information extraction by inference of regular expressions from sample entities

F Brauer, R Rieger, A Mocan… - Proceedings of the 20th …, 2011 - dl.acm.org
Regular expressions are the dominant technique to extract business-relevant entities (e.g.,
invoice numbers or product names) from text data (e.g., invoices), since these entity types …