An overview on machine translation evaluation

L Han - arXiv preprint arXiv:2202.11027, 2022 - arxiv.org
Since the 1950s, machine translation (MT) has become one of the important tasks of AI and
development, and has experienced several different periods and stages of development …

Breaking the representation bottleneck of Chinese characters: Neural machine translation with stroke sequence modeling

Z Wang, X Liu, M Zhang - arXiv preprint arXiv:2211.12781, 2022 - arxiv.org
Existing research generally treats Chinese character as a minimum unit for representation.
However, such Chinese character representation will suffer two bottlenecks: 1) Learning …

Measuring uncertainty in translation quality evaluation (TQE)

S Gladkoff, I Sorokina, L Han, A Alekseeva - arXiv preprint arXiv …, 2021 - arxiv.org
From both human translators (HT) and machine translation (MT) researchers' point of view,
translation quality evaluation (TQE) is an essential task. Translation service providers (TSPs) …

Neural machine translation of clinical text: an empirical investigation into multilingual pre-trained language models and transfer-learning

L Han, S Gladkoff, G Erofeev, I Sorokina… - Frontiers in Digital …, 2024 - frontiersin.org
Clinical text and documents contain very rich information and knowledge in healthcare, and
their processing using state-of-the-art language technology becomes very important for …

Topic modelling of Swedish newspaper articles about coronavirus: a case study using latent Dirichlet allocation method

B Griciūtė, L Han, G Nenadic - 2023 IEEE 11th International …, 2023 - ieeexplore.ieee.org
Topic Modelling (TM) is a natural language processing (NLP) method for discovering topics
in a collection of documents. Being an unsupervised method, it is a valuable tool when trying …

Password cracking and countermeasures in computer security: A survey

L Han - arXiv preprint arXiv:1411.7803, 2014 - arxiv.org
With the rapid development of internet technologies, social networks, and other related
areas, user authentication becomes more and more important to protect the data of users …

HilMeMe: A Human-in-the-Loop Machine Translation Evaluation Metric Looking into Multi-Word Expressions

L Han - arXiv preprint arXiv:2211.05201, 2022 - arxiv.org
With the fast development of Machine Translation (MT) systems, especially the new boost
from Neural MT (NMT) models, the MT output quality has reached a new level of accuracy …

On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?

S Belkadi, L Han, Y Wu, V Antonini… - arXiv preprint arXiv …, 2022 - arxiv.org
Fine-tuning Large Language Models (LLMs) pre-trained from general or related domain
data to a specific domain and task using a limited amount of resources available in the new …

Identification and extraction of multiword expressions from Hindi & Urdu language in natural language processing

V Gupta, N Joshi - International Journal of Advanced …, 2022 - search.proquest.com
Text can be translated from one language to another using statistical machine translation,
but there are still gaps in the translations because of a lack of language resource material …

CC-Riddle: A Question Answering Dataset of Chinese Character Riddles

F Xu, Y Zhang, X Wan - arXiv preprint arXiv:2206.13778, 2022 - arxiv.org
Chinese character riddle is a challenging riddle game which takes a single character as the
solution. The riddle describes the pronunciation, shape and meaning of the solution …