Authors
Yingbo Gao, Christian Herold, Weiyue Wang, Hermann Ney
Publication date
2019/10/28
Journal
arXiv preprint arXiv:1910.12554
Description
Prominently used in support vector machines and logistic regression, kernel functions (kernels) can implicitly map data points into high-dimensional spaces and make it easier to learn complex decision boundaries. In this work, by replacing the inner product function in the softmax layer, we explore the use of kernels for contextual word classification. To compare the individual kernels, experiments are conducted on standard language modeling and machine translation tasks. We observe a wide range of performance across different kernel settings. Extending the results, we look at the gradient properties, investigate various mixture strategies, and examine the disambiguation abilities.
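The idea of swapping the inner product in the softmax layer for a kernel can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the function names, the choice of an RBF kernel, the `gamma` parameter, and the decision to exponentiate the kernel value before normalizing are not taken from the paper, which may define its kernels and normalization differently.

```python
import numpy as np

def linear_kernel(h, W):
    # Baseline: the usual softmax score, an inner product between the
    # context vector h and each word vector (one row of W per word).
    return W @ h

def rbf_kernel(h, W, gamma=1.0):
    # Hypothetical alternative: Gaussian (RBF) kernel,
    # exp(-gamma * ||w - h||^2), computed once per word vector.
    sq_dist = np.sum((W - h) ** 2, axis=1)
    return np.exp(-gamma * sq_dist)

def kernel_softmax(h, W, kernel):
    # Assumed form: normalize exp(kernel score) over the vocabulary.
    scores = kernel(h, W)
    scores = scores - scores.max()  # subtract the max for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

rng = np.random.default_rng(0)
h = rng.normal(size=4)        # toy context vector
W = rng.normal(size=(6, 4))   # toy vocabulary of 6 word vectors

p_lin = kernel_softmax(h, W, linear_kernel)
p_rbf = kernel_softmax(h, W, rbf_kernel)
```

Passing the kernel as a function argument keeps the normalization fixed, so two kernels can be compared on the same context and word vectors by changing only that argument.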
Total citations
[Citations-per-year chart, years 2018 to 2022; per-year counts not recoverable from the extracted text]