Semi-Supervised Classification with A*: A Case Study on Electronic Invoicing

Bernardo Panichi and Alessandro Lazzeri

Abstract

This paper addresses the time-consuming task of assigning accurate account labels to invoice entries in corporate bookkeeping. Despite the advent of electronic invoicing, many software solutions still rely on rule-based approaches that do not capture the complexity of the task. Machine learning is well suited to such repetitive tasks, but low-quality training data is often an obstacle: labels frequently refer to groups of invoice rows rather than to individual rows, so many records are discarded during preprocessing. To improve an invoice entry classifier in this semi-supervised setting, this study proposes an approach that combines the classifier with the A* graph search algorithm. In experiments with several classifiers, accuracy consistently increased by between 1% and 4%. The improvement is attributed primarily to a marked reduction in the data discard rate, which fell from 39% to 14%. The paper contributes a method that combines a classifier with A* graph search to overcome the limited, group-level label information available in electronic invoice classification.

Keywords

multi-class classification - automatic invoice labeling - semi-supervised learning - graph search

ISSUE

Volume: 7 Part: 3 (2023)

SUBJECTS

INFRASTRUCTURE
