Authors
Lillian Lee
Publication date
2001
Conference
Proceedings of AISTATS
Pages
65-72
Description
Estimating word co-occurrence probabilities is a problem underlying many applications in statistical natural language processing. Distance-weighted (or similarity-weighted) averaging has been shown to be a promising approach to the analysis of novel co-occurrences. Many measures of distributional similarity have been proposed for use in the distance-weighted averaging framework; here, we empirically study their stability properties, finding that similarity-based estimation appears to make more efficient use of more reliable portions of the training data. We also investigate properties of the skew divergence, a weighted version of the Kullback-Leibler (KL) divergence; our results indicate that the skew divergence yields better results than the KL divergence even when the KL divergence is applied to more sophisticated probability estimates.
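The description names the skew divergence without stating its form. The sketch below assumes the commonly cited definition s_alpha(q, r) = D(r || alpha*q + (1 - alpha)*r), where D is the KL divergence. The function names, the example distributions, and the default alpha = 0.99 are illustrative assumptions rather than details taken from this record. It is meant to show one property only: mixing a little of r into q keeps the divergence finite in cases where the plain KL divergence is infinite.

```python
import math


def kl_divergence(p, q):
    """D(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a term with p_i = 0 contributes nothing
        if q_i == 0.0:
            return math.inf  # p has mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


def skew_divergence(q, r, alpha=0.99):
    """Assumed form: D(r || alpha*q + (1 - alpha)*r), with 0 <= alpha < 1."""
    # Smooth q with a small share of r so the mixture is nonzero wherever r is.
    mixed = [alpha * q_i + (1.0 - alpha) * r_i for q_i, r_i in zip(q, r)]
    return kl_divergence(r, mixed)


# Hypothetical distributions: q assigns zero probability to the third event.
q = [0.5, 0.5, 0.0]
r = [0.25, 0.25, 0.5]

print(kl_divergence(r, q))    # infinite, because q[2] == 0 while r[2] > 0
print(skew_divergence(q, r))  # finite, because the mixture is nonzero there
```

With alpha close to 1 the skew divergence approximates the KL divergence while remaining defined for unseen co-occurrences, which is the situation the description calls novel co-occurrences.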
Scholar articles
L Lee - International Workshop on Artificial Intelligence and …, 2001