Tree edit distance: Robust and memory-efficient

M Pawlik, N Augsten - Information Systems, 2016 - Elsevier
Hierarchical data are often modelled as trees. An interesting query identifies pairs of similar
trees. The standard approach to tree similarity is the tree edit distance, which has …

RTED: a robust algorithm for the tree edit distance

M Pawlik, N Augsten - arXiv preprint arXiv:1201.0230, 2011 - arxiv.org
We consider the classical tree edit distance between ordered labeled trees, which is defined
as the minimum-cost sequence of node edit operations that transform one tree into another …

Efficient computation of the tree edit distance

M Pawlik, N Augsten - ACM Transactions on Database Systems (TODS), 2015 - dl.acm.org
We consider the classical tree edit distance between ordered labelled trees, which is
defined as the minimum-cost sequence of node edit operations that transform one tree into …

Approximate matching of hierarchical data using pq-grams

N Augsten, MH Böhlen, J Gamper - VLDB, 2005 - cosy.sbg.ac.at
When integrating data from autonomous sources, exact matches of data items that represent
the same real world object often fail due to a lack of common keys. Yet in many cases …

The pq-gram distance between ordered labeled trees

N Augsten, M Böhlen, J Gamper - ACM Transactions on Database …, 2008 - dl.acm.org
When integrating data from autonomous sources, exact matches of data items that represent
the same real-world object often fail due to a lack of common keys. Yet in many cases …

S2MP: similarity measure for sequential patterns

H Saneifar, S Bringay, A Laurent… - Proceedings of the 7th …, 2008 - Citeseer
In data mining, computing the similarity of objects is an essential task, for example to identify
regularities or to build homogeneous clusters of objects. In the case of sequential data seen …

Approximate joins for data-centric XML

N Augsten, M Böhlen, C Dyreson… - 2008 IEEE 24th …, 2008 - ieeexplore.ieee.org
In data integration applications, a join matches elements that are common to two data
sources. Often, however, elements are represented slightly different in each source, so an …

Supporting refactoring activities using histories of program modification

S Hayashi, M Saeki, M Kurihara - IEICE transactions on …, 2006 - search.ieice.org
Refactoring is one of the promising techniques for improving program design by means of
program transformation with preserving behavior, and is widely applied in practice …

RWS-Diff: flexible and efficient change detection in hierarchical data

JP Finis, M Raiber, N Augsten, R Brunel… - Proceedings of the …, 2013 - dl.acm.org
The problem of generating a cost-minimal edit script between two trees has many important
applications. However, finding such a cost-minimal script is computationally hard, thus the …

Bridging the gap between tracking and detecting changes in XML

P Ciancarini, AD Iorio, C Marchetti… - Software: Practice …, 2016 - Wiley Online Library
There are two main approaches to manage changes in XML documents, change‐tracking
and diff. Change‐tracking tools, which record edit actions while they are performed on the …