Information extraction

S Sarawagi - Foundations and Trends® in Databases, 2008 - nowpublishers.com
The automatic extraction of information from unstructured sources has opened up new
avenues for querying, organizing, and analyzing data by drawing upon the clean semantics …

Unsupervised named-entity extraction from the web: An experimental study

O Etzioni, M Cafarella, D Downey, AM Popescu… - Artificial intelligence, 2005 - Elsevier
The KnowItAll system aims to automate the tedious process of extracting large collections of
facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain …

Downloading textual hidden web content through keyword queries

A Ntoulas, P Zerfos, J Cho - Proceedings of the 5th ACM/IEEE-CS joint …, 2005 - dl.acm.org
An ever-increasing amount of information on the Web today is available only through search
interfaces: the users have to type in a set of keywords in a search form in order to access the …

From information to knowledge: harvesting entities and relationships from web sources

G Weikum, M Theobald - Proceedings of the twenty-ninth ACM SIGMOD …, 2010 - dl.acm.org
There are major trends to advance the functionality of search engines to a more expressive
semantic level. This is enabled by the advent of knowledge-sharing communities such as …

Connections between the lines: augmenting social networks with text

J Chang, J Boyd-Graber, DM Blei - Proceedings of the 15th ACM …, 2009 - dl.acm.org
Network data is ubiquitous, encoding collections of relationships between entities such as
people, places, genes, or corporations. While many resources for networks of interesting …

[PDF] Issues and challenges in Marathi named entity recognition

N Patil, AS Patil, BV Pawar - International Journal on Natural …, 2016 - researchgate.net
Information Extraction (IE) is a sub discipline of Artificial Intelligence. IE identifies information
in unstructured information source that adheres to predefined semantics, i.e., people, location …

Supporting database applications as a service

M Hui, D Jiang, G Li, Y Zhou - 2009 IEEE 25th International …, 2009 - ieeexplore.ieee.org
Multi-tenant data management is a form of software as a service (SaaS), whereby a third
party service provider hosts databases as a service and provides its customers with …

QProber: A system for automatic classification of hidden-web databases

L Gravano, PG Ipeirotis, M Sahami - ACM Transactions on Information …, 2003 - dl.acm.org
The contents of many valuable Web-accessible databases are only available through
search interfaces and are hence invisible to traditional Web "crawlers." Recently …

[PDF] Learning text patterns for web information extraction and assessment

D Downey, O Etzioni, S Soderland… - AAAI-04 workshop on …, 2004 - cdn.aaai.org
Learning text patterns that suggest a desired type of information is a common strategy for
extracting information from unstructured text on the Web. In this paper, we introduce the idea …

[BOOK] Text analysis pipelines: towards ad-hoc large-scale text mining

H Wachsmuth - 2015 - books.google.com
This monograph proposes a comprehensive and fully automatic approach to designing text
analysis pipelines for arbitrary information needs that are optimal in terms of run-time …