JOURNAL
Information


ARTICLE

TITLE

Multi-Keyword Classification: A Case Study in Finnish Social Sciences Data Archive

Erjon Skenderi, Jukka Huhtamäki and Kostas Stefanidis

Abstract

In this paper, we consider the task of assigning relevant labels to studies in the social science domain. Manual labelling is an expensive process and prone to human error. Various multi-label text classification machine learning approaches have been proposed to resolve this problem. We introduce a dataset obtained from the Finnish Social Science Archive, comprising the metadata of 2968 research studies. The metadata of each study includes attributes such as the 'abstract' and the 'set of labels'. We used Bag of Words (BoW), TF-IDF term weighting and pretrained word embeddings obtained from FastText and BERT models to generate the text representations for each study's abstract field. Our selection of multi-label classification methods includes a Naive approach, Multi-label k Nearest Neighbours (ML-kNN), Multi-Label Random Forest (ML-RF), X-BERT and Parabel. The methods were combined with the text representation techniques and their performance was evaluated on our dataset. We measured the classification accuracy of the combinations using Precision, Recall and F1 metrics. In addition, we used the Normalized Discounted Cumulative Gain to measure the label ranking performance of the selected methods combined with the text representation techniques. The results showed that the ML-RF model achieved a higher classification accuracy with the TF-IDF features and, based on the ranking score, the Parabel model outperformed the other methods.
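The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the abstract does not say how Precision, Recall and F1 were averaged or which relevance scheme the Normalized Discounted Cumulative Gain used, so micro-averaging and binary relevance are assumptions here. The function names and the example labels are hypothetical.

```python
import math


def micro_precision_recall_f1(true_label_sets, predicted_label_sets):
    """Micro-averaged precision, recall and F1 over a list of label sets.

    Assumption: micro-averaging (pooling counts over all studies); the
    abstract does not state which averaging scheme was used.
    """
    tp = fp = fn = 0
    for truth, predicted in zip(true_label_sets, predicted_label_sets):
        truth, predicted = set(truth), set(predicted)
        tp += len(truth & predicted)   # labels correctly assigned
        fp += len(predicted - truth)   # labels assigned but not relevant
        fn += len(truth - predicted)   # relevant labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def ndcg_at_k(true_labels, ranked_labels, k):
    """nDCG@k for one study, assuming binary relevance (1 if label is true)."""
    truth = set(true_labels)
    # Discounted gain of the relevant labels found in the top-k ranking.
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, label in enumerate(ranked_labels[:k])
        if label in truth
    )
    # Ideal DCG: all relevant labels placed at the top of the ranking.
    ideal_hits = min(k, len(truth))
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


# Hypothetical labels for two studies (not taken from the dataset).
true_sets = [{"health", "youth"}, {"elections"}]
predicted_sets = [{"health", "work"}, {"elections"}]
precision, recall, f1 = micro_precision_recall_f1(true_sets, predicted_sets)

# Hypothetical ranked label list produced by a classifier for the first study.
ranking_score = ndcg_at_k({"health", "youth"}, ["health", "work", "youth"], k=3)

print(precision, recall, f1, ranking_score)
```

Set-based metrics such as F1 only check whether a label was assigned, whereas nDCG also rewards placing relevant labels near the top of the ranking. This is why one method can lead on classification accuracy while another leads on ranking.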

Keywords

multi-label classification - supervised learning - text representation - text feature extraction

ISSUE

Volume: 12 Issue: 12 (2021)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Information
Applied Sciences
Algorithms

DOI

https://doi.org/10.3390/info12120491

Similar articles

Information Retrieval and Machine Learning Methods for Academic Expert Finding

Luis M. de Campos, Juan M. Fernández-Luna, Juan F. Huete, Francisco J. Ribadas-Pena and Néstor Bolaños

In the context of academic expert finding, this paper investigates and compares the performance of information retrieval (IR) and machine learning (ML) methods, including deep learning, to approach the problem of identifying academic figures who are expe...

Journal: Algorithms

Explainable Machine Learning Method for Aesthetic Prediction of Doors and Home Designs

Jean-Sébastien Dessureault, Félix Clément, Seydou Ba, François Meunier and Daniel Massicotte

The field of interior home design has witnessed a growing utilization of machine learning. However, the subjective nature of aesthetics poses a significant challenge due to its variability among individuals and cultures. This paper proposes an applied ma...

Journal: Information

A Survey of AI Techniques in IoT Applications with Use Case Investigations in the Smart Environmental Monitoring and Analytics in Real-Time IoT Platform

Yohanes Yohanie Fridelin Panduman, Nobuo Funabiki, Evianita Dewi Fajrianti, Shihao Fang and Sritrusta Sukaridhoto

In this paper, we have developed the SEMAR (Smart Environmental Monitoring and Analytics in Real-Time) IoT application server platform for fast deployments of IoT application systems. It provides various integration capabilities for the collection, displ...

Journal: Information

Leveraging Semantic Text Analysis to Improve the Performance of Transformer-Based Relation Extraction

Marie-Therese Charlotte Evans, Majid Latifi, Mominul Ahsan and Julfikar Haider

Keyword extraction from Knowledge Bases underpins the definition of relevancy in Digital Library search systems. However, it is the pertinent task of Joint Relation Extraction, which populates the Knowledge Bases from which results are retrieved. Recent ...

Journal: Information

A Radio Frequency Fingerprinting-Based Aircraft Identification Method Using ADS-B Transmissions

Gursu Gurer, Yaser Dalveren, Ali Kara and Mohammad Derawi

The automatic dependent surveillance broadcast (ADS-B) system is one of the key components of the next generation air transportation system (NextGen). ADS-B messages are transmitted in unencrypted plain text. This, however, causes significant security vu...

Journal: Aerospace
