Information · Vol. 14, No. 11 (2023)

ARTICLE

Deep Learning Approaches for Big Data-Driven Metadata Extraction in Online Job Postings

Panagiotis Skondras, Nikos Zotos, Dimitris Lagios, Panagiotis Zervas, Konstantinos C. Giotopoulos and Giannis Tzimas

Abstract

This article presents a study on the multi-class classification of job postings using machine learning algorithms. With the growth of online job platforms, there has been an influx of labor market data. Machine learning, particularly NLP, is increasingly used to analyze and classify job postings. However, the effectiveness of these algorithms largely hinges on the quality and volume of the training data. In our study, we propose a multi-class classification methodology for job postings, drawing on AI models such as text-davinci-003 and the quantized versions of Falcon 7B (Falcon), WizardLM 7B (WizardLM), and Vicuna 7B (Vicuna) to generate synthetic datasets. These synthetic data are employed in two use-case scenarios: (a) exclusively as training datasets composed of synthetic job postings (for situations where no real data are available) and (b) as an augmentation method to bolster underrepresented job-title categories. To evaluate the proposed method, we relied on two well-established approaches: a feedforward neural network (FFNN) and the BERT model. Both use cases and training methods were assessed against a genuine job-posting dataset to gauge classification accuracy. Our experiments substantiated the benefits of using synthetic data for job-posting classification. In the first scenario, the models' performance matched, and occasionally exceeded, that obtained with real data. In the second scenario, the augmented classes outperformed their non-augmented counterparts in most instances. This research confirms that AI-generated datasets can enhance the efficacy of NLP algorithms, especially for the multi-class classification of job postings. While data augmentation can boost model generalization, its impact varies: it is especially beneficial for simpler models such as the FFNN, whereas BERT, owing to its context-aware architecture, also benefits but sees more limited improvement. Selecting the right type and amount of augmentation is therefore essential.
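The abstract describes the two evaluation scenarios only at a high level. As a minimal sketch of what such a pipeline might look like, the snippet below trains a simple feedforward-style classifier (scikit-learn's MLPClassifier over TF-IDF features, standing in for the FFNN; the BERT variant is omitted) on real-only, synthetic-only, and augmented training sets, and scores each against a real test set. All file names, column names, and the choice of five under-represented categories are hypothetical assumptions; this is not the authors' code.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical CSVs, each with a "text" column (posting body) and a "label"
# column (job-title category). synthetic_postings.csv stands in for the
# LLM-generated data described in the abstract.
real_train = pd.read_csv("real_train.csv")
synthetic = pd.read_csv("synthetic_postings.csv")
test = pd.read_csv("real_test.csv")

def train_and_score(train_df: pd.DataFrame) -> float:
    """Fit a TF-IDF + small MLP classifier and return accuracy on the real test set."""
    model = make_pipeline(
        TfidfVectorizer(max_features=20_000, ngram_range=(1, 2)),
        MLPClassifier(hidden_layer_sizes=(256,), max_iter=50, random_state=0),
    )
    model.fit(train_df["text"], train_df["label"])
    return accuracy_score(test["label"], model.predict(test["text"]))

# Scenario (b): augment only the under-represented categories with synthetic postings.
minority = real_train["label"].value_counts().nsmallest(5).index
augmented = pd.concat(
    [real_train, synthetic[synthetic["label"].isin(minority)]],
    ignore_index=True,
)

print("real only:      ", train_and_score(real_train))   # baseline
print("synthetic only: ", train_and_score(synthetic))    # scenario (a)
print("augmented:      ", train_and_score(augmented))    # scenario (b)
```

The generation step itself (prompting text-davinci-003, Falcon, WizardLM, or Vicuna for synthetic postings) is assumed to have already produced synthetic_postings.csv; the same comparison could be repeated with a fine-tuned BERT classifier in place of the MLP.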

Similar articles

       
 
Luis M. de Campos, Juan M. Fernández-Luna, Juan F. Huete, Francisco J. Ribadas-Pena and Néstor Bolaños    
In the context of academic expert finding, this paper investigates and compares the performance of information retrieval (IR) and machine learning (ML) methods, including deep learning, to approach the problem of identifying academic figures who are expe...
Journal: Algorithms

 
Seokjoon Kwon, Jae-Hyeon Park, Hee-Deok Jang, Hyunwoo Nam and Dong Eui Chang    
Deep learning algorithms are widely used for pattern recognition in electronic noses, which are sensor arrays for gas mixtures. One of the challenges of using electronic noses is sensor drift, which can degrade the accuracy of the system over time, even ...
Journal: Applied Sciences

 
Alberto Alvarellos, Andrés Figuero, Santiago Rodríguez-Yáñez, José Sande, Enrique Peña, Paulo Rosa-Santos and Juan Rabuñal    
Port managers can use predictions from the wave overtopping predictors created in this work to take preventative measures and optimize operations, ultimately improving safety and helping to minimize the economic impact that overtopping events have on the p...
Journal: Applied Sciences

 
Shihao Ma, Jiao Wu, Zhijun Zhang and Yala Tong    
Addressing the limitations of current mudslide disaster detection techniques in remote sensing imagery, including low automation, slow recognition speed, and limited universality, this study employs deep learning methods for enhanced mudslide disaster d...
Journal: Applied Sciences

 
Ryota Higashimoto, Soh Yoshida and Mitsuji Muneyasu    
This paper addresses the performance degradation of deep neural networks caused by learning with noisy labels. Recent research on this topic has exploited the memorization effect: networks fit data with clean labels during the early stages of learning an...
Journal: Applied Sciences