Log-based anomaly detection without log parsing

VH Le, H Zhang - … 36th IEEE/ACM International Conference on …, 2021 - ieeexplore.ieee.org
Software systems often record important runtime information in system logs for
troubleshooting purposes. There have been many studies that use log data to construct …

Imputing out-of-vocabulary embeddings with love makes language models robust with little cost

L Chen, G Varoquaux, FM Suchanek - arXiv preprint arXiv:2203.07860, 2022 - arxiv.org
State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle
when faced with Out-of-Vocabulary (OOV) words. To address this issue, we follow the …

A systematic study of leveraging subword information for learning word representations

Y Zhu, I Vulić, A Korhonen - arXiv preprint arXiv:1904.07994, 2019 - arxiv.org
The use of subword-level information (e.g., characters, character n-grams, morphemes) has
become ubiquitous in modern word representation learning. Its importance is attested …

Graph-based Relation Mining for Context-free Out-of-vocabulary Word Embedding Learning

Z Liang, Y Lu, HG Chen, Y Rao - … of the 61st Annual Meeting of …, 2023 - aclanthology.org
The out-of-vocabulary (OOV) words are difficult to represent while critical to the performance
of embedding-based downstream models. Prior OOV word embedding learning methods …

Out-of-vocabulary word embedding learning based on reading comprehension mechanism

Z Zhuang, Z Liang, Y Rao, H Xie, FL Wang - Natural Language Processing …, 2023 - Elsevier
Currently, most natural language processing tasks use word embeddings as the
representation of words. However, when encountering out-of-vocabulary (OOV) words, the …

Bridging subword gaps in pretrain-finetune paradigm for natural language generation

X Liu, B Yang, D Liu, H Zhang, W Luo, M Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
A well-known limitation in pretrain-finetune paradigm lies in its inflexibility caused by the
one-size-fits-all vocabulary. This potentially weakens the effect when applying pretrained models …

Multi-level embeddings for processing Arabic social media contents

L Moudjari, F Benamara, K Akli-Astouati - Computer Speech & Language, 2021 - Elsevier
Embeddings are very popular representations that allow computing semantic and syntactic
similarities between linguistic units from text co-occurrence matrix. Units can vary from …

Encoding multi-granularity structural information for joint Chinese word segmentation and POS tagging

L Zhao, A Zhang, Y Liu, H Fei - Pattern Recognition Letters, 2020 - Elsevier
Recent studies show that the joint Chinese word segmentation and POS tagging can
enhance the mutual interaction and yield better performances for two tasks. However …

Subword-based compact reconstruction of word embeddings

S Sasaki, J Suzuki, K Inui - … of the 2019 Conference of the North …, 2019 - aclanthology.org
The idea of subword-based word embeddings has been proposed in the literature, mainly
for solving the out-of-vocabulary (OOV) word problem observed in standard word-based …

Robust backed-off estimation of out-of-vocabulary embeddings

N Fukuda, N Yoshinaga… - Findings of the …, 2020 - aclanthology.org
Out-of-vocabulary (OOV) words cause serious troubles in solving natural language
tasks with a neural network. Existing approaches to this problem resort to using subwords …