Permutation tests for studying classifier performance

M Ojala, GC Garriga - Journal of machine learning research, 2010 - jmlr.org
We explore the framework of permutation-based p-values for assessing the performance of
classifiers. In this paper we study two simple permutation tests. The first test assesses whether …

Maximum entropy models and subjective interestingness: an application to tiles in binary databases

T De Bie - Data Mining and Knowledge Discovery, 2011 - Springer
Recent research has highlighted the practical benefits of subjective interestingness
measures, which quantify the novelty or unexpectedness of a pattern when contrasted with …

An information theoretic framework for data mining

T De Bie - Proceedings of the 17th ACM SIGKDD international …, 2011 - dl.acm.org
We formalize the data mining process as a process of information exchange, defined by the
following key components. The data miner's state of mind is modeled as a probability …

Randomization techniques for graphs

S Hanhijärvi, GC Garriga, K Puolamäki - Proceedings of the 2009 SIAM …, 2009 - SIAM
Mining graph data is an active research area. Several data mining methods and algorithms
have been proposed to identify structures from graphs; still, the evaluation of those results is …

BSig: evaluating the statistical significance of biclustering solutions

R Henriques, SC Madeira - Data Mining and Knowledge Discovery, 2018 - Springer
Statistical evaluation of biclustering solutions is essential to guarantee the absence of
spurious relations and to validate the high number of scientific statements inferred from …

The blind men and the elephant: on meeting the problem of multiple truths in data from clustering and pattern mining perspectives

A Zimek, J Vreeken - Machine Learning, 2015 - Springer
In this position paper, we discuss how different branches of research on clustering and
pattern mining, while rather different at first glance, in fact have a lot in common and can …

Tell me something I don't know: randomization strategies for iterative data mining

S Hanhijärvi, M Ojala, N Vuokko, K Puolamäki… - Proceedings of the 15th …, 2009 - dl.acm.org
There is a wide variety of data mining methods available, and it is generally useful in
exploratory data analysis to use many different methods for the same dataset. This, however …

ROhAN: Row-order agnostic null models for statistically-sound knowledge discovery

M Abuissa, A Lee, M Riondato - Data Mining and Knowledge Discovery, 2023 - Springer
We introduce a novel class of null models for the statistical validation of results obtained
from binary transactional and sequence datasets. Our null models are Row-Order Agnostic …

A framework for mining interesting pattern sets

T De Bie, KN Kontonasios, E Spyropoulou - ACM SIGKDD Explorations …, 2011 - dl.acm.org
This paper suggests a framework for mining subjectively interesting pattern sets that is
based on two components: (1) the encoding of prior information in a model for the data …

Research of agile software development based on formal methods

A Zuo, J Yang, X Chen - 2010 International Conference on …, 2010 - ieeexplore.ieee.org
Agile software development is a kind of lightweight development method that can accommodate
changes in requirements. This paper applies formal methods to agile software …