Exploring the limits of language modeling

R Jozefowicz, O Vinyals, M Schuster… - arXiv preprint arXiv:1602.02410, 2016 - arxiv.org
In this work we explore recent advances in Recurrent Neural Networks for large-scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and the complex, long-term structure of language. We perform an exhaustive study on techniques such as character Convolutional Neural Networks and Long Short-Term Memory on the One Billion Word Benchmark. Our best single model significantly improves the state-of-the-art perplexity from 51.3 down to 30.0 (whilst reducing the number of parameters by a factor of 20), while an ensemble of models sets a new record by improving perplexity from 41.0 down to 23.7. We also release these models for the NLP and ML community to study and improve upon.
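
For context, a minimal sketch of how the perplexity figures quoted in the abstract (30.0 single model, 23.7 ensemble) are computed: perplexity is the exponential of the average negative log-likelihood the model assigns to held-out tokens. The function name and the toy log-probability inputs below are illustrative, not taken from the paper or its released models.

import math

def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) given per-token natural-log probabilities."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Toy check: a model that assigns each token probability 1/30 on average
# scores a perplexity of 30, roughly the single-model result reported above.
print(perplexity([math.log(1.0 / 30.0)] * 1000))  # ~30.0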