Redirigiendo al acceso original de articulo en 24 segundos...
Inicio  /  Algorithms  /  Vol: 16 Par: 7 (2023)  /  Artículo
ARTÍCULO
TITULO

Learning from Imbalanced Datasets: The Bike-Sharing Inventory Problem Using Sparse Information

Giovanni Ceccarelli    
Guido Cantelmo    
Marialisa Nigro and Constantinos Antoniou    

Resumen

In bike-sharing systems, the inventory level is defined as the daily number of bicycles required to optimally meet the demand. Estimating these values is a major challenge for bike-sharing operators, as biased inventory levels lead to a reduced quality of service at best and a loss of customers and system failure at worst. This paper focuses on using machine learning (ML) classifiers, most notably random forest and gradient tree boosting, for estimating the inventory level from available features including historical data. However, while similar approaches adopted in the context of bike sharing assume the data to be well-balanced, this assumption is not met in the case of the inventory problem. Indeed, as the demand for bike sharing is sparse, datasets become biased toward low demand values, and systematic errors emerge. Thus, we propose to include a new iterative resampling procedure in the classification problem to deal with imbalanced datasets. The proposed model, tested on the real-world data of the Citi Bike operator in New York, allows to (i) provide upper-bound and lower-bound values for the bike-sharing inventory problem, accurately predicting both predominant and rare demand values; (ii) capture the main features that characterize the different demand classes; and (iii) work in a day-to-day framework. Finally, successful bike-sharing systems grow rapidly, opening new stations every year. In addition to changes in the mobility demand, an additional problem is that we cannot use historical information to predict inventory levels for new stations. Therefore, we test the capability of our model to predict inventory levels when historical data is not available, with a specific focus on stations that were not available for training.

 Artículos similares

       
 
Ugur Ercan, Onder Kabas and Georgiana Moiceanu    
Alfalfa holds an extremely significant place in animal nutrition when it comes to providing essential nutrients. The leaves of alfalfa specifically boast the highest nutritional value, containing a remarkable 70% of crude protein and an impressive 90% of... ver más
Revista: Applied Sciences

 
Suryakant Tyagi and Sándor Szénási    
Machine learning and speech emotion recognition are rapidly evolving fields, significantly impacting human-centered computing. Machine learning enables computers to learn from data and make predictions, while speech emotion recognition allows computers t... ver más
Revista: Algorithms

 
Jingwen Yang and Ruohua Zhou    
Whisper speaker recognition (WSR) has received extensive attention from researchers in recent years, and it plays an important role in medical, judicial, and other fields. Among them, the establishment of a whisper dataset is very important for the study... ver más
Revista: Information

 
Bata Hena, Ziang Wei, Luc Perron, Clemente Ibarra Castanedo and Xavier Maldague    
Industrial radiography is a pivotal non-destructive testing (NDT) method that ensures quality and safety in a wide range of industrial sectors. Conventional human-based approaches, however, are prone to challenges in defect detection accuracy and efficie... ver más
Revista: Information

 
Jin Wang, Peng Zhao, Zhe Zhang, Ting Yue, Hailiang Liu and Lixin Wang    
The upset state is an unexpected flight state, which is characterized by an unintentional deviation from normal operating parameters. It is difficult for the pilot to recover the aircraft from the upset state accurately and quickly. In this paper, an ups... ver más
Revista: Aerospace