JOURNAL
Information

ARTICLE

TITLE

Identification of Malignancies from Free-Text Histopathology Reports Using a Multi-Model Supervised Machine Learning Approach

Victor Olago, Mazvita Muchengeti, Elvira Singh and Wenlong C. Chen

Abstract

We explored several machine learning (ML) models to evaluate how each performs at classifying histopathology reports. We trained, optimized, and performed classification with Stochastic Gradient Descent (SGD), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN), Adaptive Boosting (AB), Decision Trees (DT), Gaussian Naïve Bayes (GNB), Logistic Regression (LR), and a Dummy classifier. We started with 60,083 histopathology reports, which were reduced to 60,069 after pre-processing. The F1-scores for SVM, SGD, KNN, RF, DT, LR, AB, and GNB were 97%, 96%, 96%, 96%, 92%, 96%, 84%, and 88%, respectively, while the misclassification rates were 3.31%, 5.25%, 4.39%, 1.75%, 3.5%, 4.26%, 23.9%, and 19.94%, respectively. The approximate run times, in the same order, were 2 h, 20 min, 40 min, 8 h, 40 min, 10 min, 50 min, and 4 min. RF had the longest run time but the lowest misclassification rate on the labeled data. Our study demonstrated the feasibility of applying ML techniques to the processing of free-text pathology reports for cancer incidence reporting by cancer registries in a Sub-Saharan Africa setting. This is an important consideration for resource-constrained environments that seek to leverage ML techniques to reduce workloads and improve the timeliness of cancer statistics reporting.
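The two metrics reported in the abstract, F1-score and misclassification rate, can be illustrated with a minimal Python sketch. It assumes a binary malignant (1) versus non-malignant (0) labeling and a plain positive-class F1; the abstract does not state which averaging scheme the authors used, and the function names and toy labels below are hypothetical, not taken from the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a binary task with the given positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def misclassification_rate(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all reports assigned the wrong label: (FP + FN) / total."""
    total = tp + fp + fn + tn
    return (fp + fn) / total if total else 0.0


# Hypothetical toy labels (1 = malignant, 0 = non-malignant); not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
f1 = f1_score(counts[0], counts[1], counts[2])
rate = misclassification_rate(*counts)

print(f"F1 = {f1:.2f}, misclassification rate = {rate:.2%}")
```

The two metrics need not rank models identically, because the misclassification rate counts true negatives while F1 does not. This may be one reason the abstract reports both.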

Keywords

machine learning - multi-model supervised machine learning - text mining - text classification - natural language processing - cancer coding - flagging malignant reports

ISSUE

Volume: 11 Issue: 9 (2020)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/info11090455
