Inclusion dependencies (INDs) are a well-known type of data dependency, specifying that the values of one column are contained in those of another column. INDs can be used for …
C Chenli, W Tang, F Gomulka, T Jung - Journal of Parallel and Distributed …, 2022 - Elsevier
Data sharing is increasingly popular, especially in scientific research and business fields where large volumes of data need to be used, but it involves data security and privacy …
I Roy, R Agarwal, S Chakrabarti… - Advances in Neural …, 2023 - proceedings.neurips.cc
In many search applications related to passage retrieval, text entailment, and subgraph search, the query and each 'document' is a set of elements, with a document being relevant if …
Metrics for set similarity are a core aspect of several data mining tasks. To remove duplicate results in a Web search, for example, a common approach looks at the Jaccard index …
Scalar field comparison is a fundamental task in scientific visualization. In topological data analysis, we compare topological descriptors of scalar fields—such as persistence diagrams …
SCAN (Structural Clustering Algorithm for Networks) is a well-studied, widely used graph clustering algorithm. For large graphs, however, sequential SCAN variants are prohibitively …
Concept drift is a major challenge faced by machine learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect …
We present a new approach for independently computing compact sketches that can be used to approximate the inner product between pairs of high-dimensional vectors. Based on …
W Wang, Y Jin, B Cao - … Conference on Privacy, Security & Trust …, 2022 - ieeexplore.ieee.org
The growing power of cloud computing prompts data owners to outsource their databases to the cloud. In order to meet the demand of multi-dimensional data processing in the big data era …