REVISTA
AI

   
Inicio  /  AI  /  Vol: 2 Par: 4 (2021)  /  Artículo
ARTÍCULO
TITULO

Chinese Comma Disambiguation in Math Word Problems Using SMOTE and Random Forests

Jingxiu Huang    
Qingtang Liu    
Yunxiang Zheng and Linjing Wu    

Resumen

Natural language understanding technologies play an essential role in automatically solving math word problems. In the process of machine understanding Chinese math word problems, comma disambiguation, which is associated with a class imbalance binary learning problem, is addressed as a valuable instrument to transform the problem statement of math word problems into structured representation. Aiming to resolve this problem, we employed the synthetic minority oversampling technique (SMOTE) and random forests to comma classification after their hyperparameters were jointly optimized. We propose a strict measure to evaluate the performance of deployed comma classification models on comma disambiguation in math word problems. To verify the effectiveness of random forest classifiers with SMOTE on comma disambiguation, we conducted two-stage experiments on two datasets with a collection of evaluation measures. Experimental results showed that random forest classifiers were significantly superior to baseline methods in Chinese comma disambiguation. The SMOTE algorithm with optimized hyperparameter settings based on the categorical distribution of different datasets is preferable, instead of with its default values. For practitioners, we suggest that hyperparameters of a classification models be optimized again after parameter settings of SMOTE have been changed.