Advanced Ensemble Classifier Techniques for Predicting Tumor Viability in Osteosarcoma Histological Slide Images

Authors

  • Tahsien Al-Quraishi, Victorian Institute of Technology, School of IT, Melbourne, Victoria, Australia
  • Chee Keong NG, Victorian Institute of Technology, School of IT, Melbourne, Victoria, Australia
  • Osama A. Mahdi, Melbourne Institute of Technology, School of IT, Melbourne, Victoria, Australia
  • Amoakoh Gyasi, Melbourne Institute of Technology, School of IT, Melbourne, Victoria, Australia
  • Naseer Al-Quraishi, Alayen Iraqi University, College of Computer Science, Computer Science Department, Nasiriyah, Iraq

DOI:

https://doi.org/10.58496/ADSA/2024/006

Keywords:

Osteosarcoma Cancer, Histological Slide Images, Machine Learning, Feature Selection, Ensemble Classifier, Voting Classifier, Multi-Layer Perceptron, Random Forest, Logistic Regression, AdaBoost Classifier, GB Classifier

Abstract

Background: Osteosarcoma is a primary malignant tumor of bone that arises from primitive mesenchymal cells forming osteoid or immature bone. Accurate diagnosis and classification are key to treatment planning and to improving patient outcomes. Machine learning techniques can augment, and potentially surpass, conventional methods for analyzing medical data.

Methods: This study combined feature selection techniques with classification methods to develop predictive models for osteosarcoma cases. The feature selection techniques were L1 Regularization (Lasso), Recursive Feature Elimination (RFE), SelectKBest, and Tree-based Feature Importance. The classification methods were Voting Classifier, Decision Tree, Naive Bayes, Multi-Layer Perceptron, Random Forest, Logistic Regression, AdaBoost, and Gradient Boosting. Models were assessed with a combination of metrics: accuracy, precision, recall, F1 score, AUC, and V score.
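As an illustration of the evaluation metrics named in the Methods, the following minimal sketch computes accuracy, precision, recall, F1 score, and AUC for a binary task using only the Python standard library. The labels and scores are made-up illustrative values, not data from the study, and the function names are hypothetical. The V score is left out because the abstract does not define it.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall and F1; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only (assumed, not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Precision and recall pull in different directions as the decision threshold moves, which is one reason to report several metrics together rather than accuracy alone.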

Results: Tree-based Feature Importance for feature selection, combined with the Voting Classifier with Decision Tree Classifier, outperformed all other combinations: it classified positive instances correctly and kept false positives low. Other combinations, such as L1 Regularization with the Voting Classifier and RFE with the Voting Classifier, also performed well but were slightly less effective.
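To make the idea of a voting ensemble concrete, here is a minimal sketch of hard (majority) voting over the predictions of several base classifiers. The abstract does not state whether hard or soft voting was used or how the base models were configured, so this is an assumption for illustration. The per-model predictions are invented, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions_per_model: Sequence[Sequence[int]]) -> list:
    """Combine per-model label lists into one label per sample by majority vote.

    Ties are broken in favour of the smallest label, an arbitrary choice
    made for this sketch.
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("every model must predict the same number of samples")

    combined = []
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        top = max(counts.values())
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined


# Invented predictions from three hypothetical base models on five samples.
model_predictions = {
    "decision_tree": [1, 0, 1, 1, 0],
    "logistic_regression": [1, 0, 0, 1, 0],
    "naive_bayes": [0, 0, 1, 1, 1],
}

ensemble = majority_vote(list(model_predictions.values()))
print(ensemble)
```

In practice a library implementation would normally be used instead; scikit-learn, for example, provides `sklearn.ensemble.VotingClassifier` with a `voting` parameter that selects hard or soft voting.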

Conclusion: This work provides strong evidence that ensemble classifiers combined with robust feature selection can improve the diagnostic accuracy and robustness of osteosarcoma classification. Future research will prioritize class imbalance and computational efficiency.

Published

2024-05-29

How to Cite

Al-Quraishi, T., Keong NG, C., Mahdi, O. A., Gyasi, A., & Al-Quraishi, N. (2024). Advanced Ensemble Classifier Techniques for Predicting Tumor Viability in Osteosarcoma Histological Slide Images. Applied Data Science and Analysis, 2024, 52–68. https://doi.org/10.58496/ADSA/2024/006

Section

Articles