Fast and reliable anomaly detection in categorical data

L Akoglu, H Tong, J Vreeken, C Faloutsos

Proceedings of the 21st ACM international conference on Information and …, 2012 • dl.acm.org

Spotting anomalies in large multi-dimensional databases is a crucial task with many applications in finance, health care, security, and other domains. We introduce COMPREX, a new approach for identifying anomalies using pattern-based compression. Informally, our method finds a collection of dictionaries that succinctly describe the norm of a database, and then flags as anomalies the points that are dissimilar to the norm, that is, those with high compression cost.

Our approach exhibits four key features: 1) it is parameter-free: it builds dictionaries directly from data and requires no user-specified parameters such as distance functions or density and similarity thresholds; 2) it is general: we show it works for a broad range of complex databases, including graph, image, and relational databases that may contain both categorical and numerical features; 3) it is scalable: its running time grows linearly with both database size and number of dimensions; and 4) it is effective: experiments on a broad range of datasets show large improvements in both compression and anomaly-detection precision, outperforming its state-of-the-art competitors.
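
The idea of flagging points with high compression cost can be illustrated with a much-simplified sketch. This is not the COMPREX algorithm: the abstract describes a collection of dictionaries learned from the data, whereas the sketch below assumes one code table per column and treats columns as independent. The function names (`build_code_lengths`, `compression_cost`, `rank_by_cost`) and the penalty for unseen values are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def build_code_lengths(rows):
    """For each column, map every observed value to a code length in bits.

    Assumption: Shannon-optimal lengths, -log2(relative frequency),
    with one independent table per column.
    """
    n = len(rows)
    num_cols = len(rows[0])
    tables = []
    for j in range(num_cols):
        counts = Counter(row[j] for row in rows)
        tables.append({value: -math.log2(c / n) for value, c in counts.items()})
    return tables


def compression_cost(row, tables, unseen_bits):
    """Total bits needed to encode one row; unseen values pay a flat penalty."""
    return sum(table.get(value, unseen_bits) for value, table in zip(row, tables))


def rank_by_cost(rows):
    """Return (cost, row) pairs sorted from most to least anomalous."""
    tables = build_code_lengths(rows)
    # Hypothetical penalty: one bit more than the rarest possible seen value.
    unseen_bits = math.log2(len(rows)) + 1
    costs = [compression_cost(row, tables, unseen_bits) for row in rows]
    return sorted(zip(costs, rows), key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Toy categorical data: nine identical rows and one unusual row.
    data = [("http", "GET", "200")] * 9 + [("ftp", "PUT", "500")]
    for cost, row in rank_by_cost(data)[:3]:
        print(f"{cost:.3f} bits  {row}")
```

Frequent values receive short codes and rare values long ones, so the unusual row in the toy data costs the most bits and is ranked first. Like the method described in the abstract, the sketch takes no distance function or threshold as input, and each row is scanned a constant number of times.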
