Minimally-supervised extraction of entities from text advertisements

S Singh, D Hillard, C Leggetter - … of the North American Chapter of …, 2010 - aclanthology.org
Human Language Technologies: The 2010 Annual Conference of the North …, 2010 - aclanthology.org
Abstract
Extraction of entities from ad creatives is an important problem that can benefit many computational advertising tasks. Supervised and semi-supervised solutions rely on labeled data, which is expensive, time-consuming, and difficult to procure for ad creatives. A small set of manually derived constraints on feature expectations over unlabeled data can be used to partially and probabilistically label large amounts of data. Utilizing recent work in constraint-based semi-supervised learning, this paper injects lightweight supervision, specified as these “constraints”, into a semi-Markov conditional random field model of entity extraction in ad creatives. Relying solely on the constraints, the model is trained on a set of unlabeled ads using an online learning algorithm. We demonstrate significant accuracy improvements on a manually labeled test set compared to a baseline dictionary approach. We also achieve accuracy that approaches that of a fully supervised classifier.
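To make the idea of a constraint on feature expectations more concrete, the sketch below shows one way such a constraint could be scored against unlabeled data. It is an illustration only, not the paper's implementation: it works on single tokens rather than semi-Markov segments, the label set, the `is_capitalized` feature, the target distribution, and the helper names (`model_expectation`, `constraint_penalty`) are all hypothetical, and the KL-divergence penalty is an assumed choice of distance between the target and the model's expectation.

```python
import math

# Hypothetical label set; the abstract does not list the entity types.
LABELS = ["BRAND", "PRODUCT", "OTHER"]

# Hypothetical hand-written constraint: tokens that fire this feature
# should, on average, receive labels with roughly this distribution.
CONSTRAINTS = {
    "is_capitalized": {"BRAND": 0.6, "PRODUCT": 0.3, "OTHER": 0.1},
}


def model_expectation(feature, tokens, predict):
    """Average the model's label distribution over unlabeled tokens firing `feature`."""
    totals = {label: 0.0 for label in LABELS}
    count = 0
    for token in tokens:
        if feature in token["features"]:
            probs = predict(token)
            for label in LABELS:
                totals[label] += probs[label]
            count += 1
    if count == 0:
        return None  # the feature never fires, so the constraint is inactive
    return {label: totals[label] / count for label in LABELS}


def kl_divergence(target, expected, eps=1e-12):
    """KL(target || expected), skipping labels with zero target mass."""
    return sum(
        p * math.log(p / max(expected[label], eps))
        for label, p in target.items()
        if p > 0
    )


def constraint_penalty(tokens, predict):
    """Sum of divergences between each target and the model's expectation."""
    penalty = 0.0
    for feature, target in CONSTRAINTS.items():
        expected = model_expectation(feature, tokens, predict)
        if expected is not None:
            penalty += kl_divergence(target, expected)
    return penalty


# Toy unlabeled data and a uniform "model" for demonstration.
TOKENS = [
    {"text": "Acme", "features": {"is_capitalized"}},
    {"text": "Laptops", "features": {"is_capitalized"}},
    {"text": "cheap", "features": set()},
]


def uniform_predict(token):
    return {label: 1.0 / len(LABELS) for label in LABELS}
```

Under these assumptions, a trainer would adjust the model parameters to drive `constraint_penalty(TOKENS, predict)` toward zero; no token-level gold labels are needed, only the hand-specified target distributions.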