ARTICLE
TITLE

Comparing sets of patterns with the Jaccard index

Sam Fletcher    
Md Zahidul Islam    

Abstract

The ability to extract knowledge from data has been the driving force of Data Mining since its inception, and of statistical modeling long before even that. Actionable knowledge often takes the form of patterns, where a set of antecedents can be used to infer a consequent. In this paper we offer a solution to the problem of comparing different sets of patterns. Our solution allows comparisons between sets of patterns that were derived from different techniques (such as different classification algorithms), or made from different samples of data (such as temporal data or data perturbed for privacy reasons). We propose using the Jaccard index to measure the similarity between sets of patterns by converting each pattern into a single element within the set. Our measure focuses on providing conceptual simplicity, computational simplicity, interpretability, and wide applicability. The results of this measure are compared to prediction accuracy in the context of a real-world data mining scenario.
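As a rough illustration of the idea described above, the sketch below treats each pattern as one hashable element and computes the Jaccard index, |A ∩ B| / |A ∪ B|, between two sets of patterns. The abstract does not say how the paper converts a pattern into an element, so the encoding used here (a frozenset of antecedent strings paired with a consequent string) is an assumption, as are the names `make_pattern` and `jaccard_index`, the example patterns, and the convention that two empty sets have similarity 1.0.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical encoding: a pattern is (set of antecedents, consequent).
# The paper's actual conversion is not given in the abstract.
Pattern = Tuple[FrozenSet[str], str]


def make_pattern(antecedents: Iterable[str], consequent: str) -> Pattern:
    """Turn one pattern into a single hashable set element.

    A frozenset makes the order of the antecedents irrelevant.
    """
    return (frozenset(antecedents), consequent)


def jaccard_index(a: Set[Pattern], b: Set[Pattern]) -> float:
    """Return |a & b| / |a | b| for two sets of patterns."""
    if not a and not b:
        return 1.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)


# Two illustrative pattern sets, e.g. from two different classifiers.
patterns_a = {
    make_pattern(["age>30", "income=high"], "buys=yes"),
    make_pattern(["age<=30"], "buys=no"),
    make_pattern(["student=yes"], "buys=yes"),
}
patterns_b = {
    make_pattern(["income=high", "age>30"], "buys=yes"),
    make_pattern(["age<=30"], "buys=no"),
    make_pattern(["income=low"], "buys=no"),
}

similarity = jaccard_index(patterns_a, patterns_b)
print(similarity)
```

In this sketch two patterns count as shared only when their antecedents and consequent match exactly, so near-identical patterns (for example, slightly different numeric thresholds) contribute nothing to the intersection. Whether the paper relaxes this is not stated in the abstract.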

Similar articles

       
 
Jinjun Li, Zhihao He, Chunde Piao, Weiqi Chi and Yi Lu    
The development height and settlement prediction of water-conducting fracture zones caused by coal seam mining play an important role in the stability of overburden aquifers and the safety of roadways. Based on the engineering geological data of the J60 ...
Journal: Water

 
Junyi Yang, Yutong Yao and Donghe Yang    
Due to the complexity of the underwater environment, tracking underwater targets via traditional particle filters is a challenging task. To resolve the problem that the tracking accuracy of a traditional particle filter is low due to the sample impoveris...

 
Katarzyna Pajak, Magdalena Idzikowska and Kamil Kowalczyk    
The sea surface is variable in time and space; therefore, many researchers are currently interested in searching for dependencies and connections with the elements influencing this diversity, e.g., with the seabed topography. An important problem is comb...

 
Wenbo Chen, Dingli Zhang, Qian Fang, Xuanhao Chen and Tong Xu    
The small strain theory underestimates the self-bearing capacity of rock masses, especially for a soft rock tunnel under high geostress. To perform an efficient and accurate calculation and provide a reference for the stiffness design of a tunnel, the fi...
Journal: Applied Sciences

 
Elisabetta Franchi, Meri Barbafieri, Gianniantonio Petruzzelli, Sergio Ferro and Marco Vocciante    
Arsenic (As) is one of the most common inorganic pollutants; unfortunately, it is also one of the most toxic and is therefore a cause of great concern for the health risks that could result from it. Removing arsenic from the soil using phytoremediation a...
Journal: Applied Sciences