Abstract
Topic classification is the task of mapping text onto a set of meaningful labels known beforehand. This scenario is common in both academia and industry whenever a large corpus of documents must be categorized according to a set of custom labels. The standard supervised approach, however, requires thousands of documents to be manually labelled, and additional effort every time the label taxonomy changes. To obviate these downsides, we investigate the application of a zero-shot approach to topic classification. In this setting, a subset of the topics, or even all of them, is not seen at training time, challenging the model to classify the corresponding examples using additional information. We first show how zero-shot classification can perform the topic-classification task without any supervision. Second, we build a novel hazard-detection dataset by manually selecting tweets gathered by LINKS Foundation for this task, and demonstrate the effectiveness of our cost-free method on a real-world problem. The idea is to leverage a pre-trained text embedder (MPNet) to map both texts and topics into the same semantic vector space, where they can be compared. We demonstrate that these semantic spaces are better aligned when their dimensionality is reduced, keeping only the most useful information. We investigate three dimensionality reduction techniques, namely linear projection, autoencoding, and PCA. Using the macro F1-score as the evaluation metric, we find that PCA is the best-performing technique, improving over the baseline on every dataset.
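The pipeline described above (embed documents and topic labels into a shared space, reduce the dimensionality, then assign each document the nearest topic) can be sketched as follows. This is a minimal illustration, not the paper's implementation: a random-vector function stands in for the MPNet embedder (in practice one would use a model such as sentence-transformers' "all-mpnet-base-v2"), the example documents and topics are invented, and the choice of 2 retained components is arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

def embed(texts, dim=768):
    # Stand-in for a pre-trained sentence embedder (e.g. MPNet).
    # Real embeddings would place semantically similar texts close together;
    # random vectors are used here only so the sketch is self-contained.
    return rng.normal(size=(len(texts), dim))

docs = ["Flooding reported near the river bank",
        "New smartphone model released today"]
topics = ["flood", "technology", "earthquake"]  # hypothetical label set

doc_vecs = embed(docs)
topic_vecs = embed(topics)

# Fit PCA on both sets of embeddings and keep only the leading components,
# projecting documents and topics into the same reduced space.
pca = PCA(n_components=2).fit(np.vstack([doc_vecs, topic_vecs]))
doc_red = pca.transform(doc_vecs)
topic_red = pca.transform(topic_vecs)

def cosine(a, b):
    # Pairwise cosine similarity between rows of a and rows of b.
    return a @ b.T / (np.linalg.norm(a, axis=1, keepdims=True)
                      * np.linalg.norm(b, axis=1))

# Zero-shot prediction: each document gets the topic whose reduced
# embedding is most similar to its own.
pred = cosine(doc_red, topic_red).argmax(axis=1)
labels = [topics[i] for i in pred]
```

No labelled training data is involved at any point: changing the taxonomy only means re-embedding the new topic strings, which is what makes the approach cost-free when labels change.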