Deckard: Scalable and accurate tree-based detection of code clones

L Jiang, G Misherghi, Z Su… - … Conference on Software …, 2007 - ieeexplore.ieee.org
Detecting code clones has many software engineering applications. Existing approaches
either do not scale to large code bases or are not robust against minor code modifications. In …

Comparing stars: On approximating graph edit distance

Z Zeng, AKH Tung, J Wang, J Feng… - Proceedings of the VLDB …, 2009 - dl.acm.org
Graph data have become ubiquitous and manipulating them based on similarity is essential
for many applications. Graph edit distance is one of the most widely accepted measures to …

XML data clustering: An overview

A Algergawy, M Mesiti, R Nayak, G Saake - ACM Computing Surveys …, 2011 - dl.acm.org
In the last few years we have observed a proliferation of approaches for clustering XML
documents and schemas based on their structure and content. The presence of such a huge …

Scalable detection of semantic clones

M Gabel, L Jiang, Z Su - … of the 30th international conference on …, 2008 - dl.acm.org
Several techniques have been developed for identifying similar code fragments in programs.
These similar fragments, referred to as code clones, can be used to identify redundant code …

RTED: a robust algorithm for the tree edit distance

M Pawlik, N Augsten - arXiv preprint arXiv:1201.0230, 2011 - arxiv.org
We consider the classical tree edit distance between ordered labeled trees, which is defined
as the minimum-cost sequence of node edit operations that transform one tree into another …

Efficient computation of the tree edit distance

M Pawlik, N Augsten - ACM Transactions on Database Systems (TODS), 2015 - dl.acm.org
We consider the classical tree edit distance between ordered labelled trees, which is
defined as the minimum-cost sequence of node edit operations that transform one tree into …

Tree2Vector: learning a vectorial representation for tree-structured data

H Zhang, S Wang, X Xu, TWS Chow… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
The tree structure is one of the most powerful structures for data organization. An efficient
learning framework for transforming tree-structured data into vectorial representations is …

An efficient graph indexing method

X Wang, X Ding, AKH Tung, S Ying… - 2012 IEEE 28th …, 2012 - ieeexplore.ieee.org
Graphs are popular models for representing complex structure data and similarity search for
graphs has become a fundamental research problem. Many techniques have been …

The pq-gram distance between ordered labeled trees

N Augsten, M Böhlen, J Gamper - ACM Transactions on Database …, 2008 - dl.acm.org
When integrating data from autonomous sources, exact matches of data items that represent
the same real-world object often fail due to a lack of common keys. Yet in many cases …

Substructure similarity measurement in Chinese recipes

L Wang, Q Li, N Li, G Dong, Y Yang - Proceedings of the 17th …, 2008 - dl.acm.org
Improving the precision of information retrieval has been a challenging issue on Chinese
Web. As exemplified by Chinese recipes on the Web, it is not easy/natural for people to use …