Data curation (the process of discovering, integrating, and cleaning data) is one of the oldest, hardest, yet inevitable data management problems. Despite decades of efforts from …
J Yang, D Wang, W Zhou, W Qian, X Wang… - Proceedings of the 30th …, 2021 - dl.acm.org
Entity alignment aims to match synonymous entities across different knowledge graphs, which is a fundamental task for knowledge integration. Recently, researchers have devoted …
JP Awick, G Schumann… - … Conference on Big Data …, 2023 - ieeexplore.ieee.org
Data integration is utilized to integrate heterogeneous data from multiple sources, representing a crucial step to improve information value in data analysis and mining …
Deep learning techniques have been used with promising results for data integration problems. Some methods use pre-trained embeddings that were trained on a large corpus …
Deep learning based techniques have been recently used with promising results for data integration problems. Some methods directly use pre-trained embeddings that were trained …
Lexical Similarity (LS) between two languages uncovers many interesting linguistic insights such as phylogenetic relationship, mutual intelligibility, common etymology, and loan words …
Data retention is a pervasive and far-reaching topic, affecting everything from academia to industry. Current solutions rely on manual work by domain users, but they are not adequate …
Data cleaning is one of the most important but time-consuming tasks for data scientists. The data cleaning task consists of two major steps: (1) error detection and (2) error correction …
X Chen, L Wang, A Dutta - US Patent 11,423,072, 2022 - Google Patents
Respective text feature sets and non-text feature sets are generated corresponding to individual pairs of a plurality of record pairs. At least one text feature is based on whether a …