ARTICLE
TITLE

AN EMPIRICAL ANALYSIS OF SIMILARITY MEASURES FOR UNSTRUCTURED DATA

Mausumi Goswami    
B.S Purkayastha    

Abstract

With the rapid growth of digital text documents on the internet and in digital repositories, the pool of digital documents is growing day by day. This growth calls for efficient and effective techniques to handle such an enormous amount of data. To mine documents, it is essential to understand them properly. Text similarity measurement plays a major role in finding coherence among documents. The goal of similarity computation is to identify cohesion among text documents and to prepare the text for applications such as document organization, plagiarism detection, and query matching. It is one of the most fundamental tasks in information retrieval, information extraction, document organization, plagiarism detection, and text mining, and the effectiveness of document clustering depends heavily on it. In this paper, four similarity measures are implemented and their descriptive statistics are compared. The results are found to be satisfactory. Graphs are drawn to visualize the results.
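As a rough illustration of the workflow the abstract describes (compute several similarity measures over document pairs, then compare their descriptive statistics), a minimal Python sketch follows. The abstract does not name the four measures, so the choice here of cosine, Jaccard, Dice, and overlap coefficient is an assumption, as are the whitespace tokenizer and all function names; this is not the authors' implementation.

```python
import math
import statistics
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not stated.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity over raw term-frequency vectors.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    # Shared distinct terms divided by all distinct terms.
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice(a, b):
    # Twice the shared distinct terms divided by the sum of both set sizes.
    sa, sb = set(a), set(b)
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


def overlap(a, b):
    # Shared distinct terms divided by the size of the smaller set.
    sa, sb = set(a), set(b)
    smaller = min(len(sa), len(sb))
    return len(sa & sb) / smaller if smaller else 0.0


# Illustrative choice of four measures (an assumption, not taken from the paper).
MEASURES = {"cosine": cosine, "jaccard": jaccard, "dice": dice, "overlap": overlap}


def describe(docs):
    # Score every unordered document pair with each measure, then summarize.
    # Requires at least two documents.
    tokens = [tokenize(d) for d in docs]
    summary = {}
    for name, measure in MEASURES.items():
        scores = [measure(x, y) for x, y in combinations(tokens, 2)]
        summary[name] = {
            "mean": statistics.mean(scores),
            "stdev": statistics.pstdev(scores),  # population standard deviation
            "min": min(scores),
            "max": max(scores),
        }
    return summary


if __name__ == "__main__":
    sample = ["the cat sat", "the cat ran", "dogs bark loudly"]
    for name, stats in describe(sample).items():
        print(name, stats)
```

Comparing the per-measure mean, spread, and range on the same corpus gives a first indication of how differently the measures score the same document pairs, which is the kind of comparison the abstract reports.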
