A breakthrough in machine learning (ML), an approach to achieving AI,2 came in 2012, when Krizhevsky, Sutskever, and Hinton from the University of Toronto presented their research at the Neural Information Processing Systems Conference.3 Their work applied a deep convolutional neural network (CNN), a type of ML algorithm used for image classification, to the ImageNet Large Scale Visual Recognition Challenge (ILSVRC); their algorithm significantly outperformed the previous state-of-the-art algorithm from 2011 (top-5 error rate of 17% for the CNN vs. 26% for the previous state of the art). The magnitude of this improvement was impressive relative to typical advancements in the field of ML. Since then, computer vision has continued to improve, with the current (2021) state-of-the-art algorithm achieving a top-5 error rate of just under 10% on the ImageNet dataset.4
ML for image classification has been applied in healthcare, with CNNs outperforming trained radiologists5 and pathologists;6 the current (2020) state-of-the-art ML algorithm for chest x-ray classification is better, on average, than 2.8 out of 3 radiologists in predicting cardiomegaly, edema, consolidation, atelectasis, and pleural effusion.5 ML has also been used to develop predictive models in healthcare.7,8 Advantages of an ML approach include the ability to learn complex, non-linear relationships between predictors and the outcome, and to shrink model coefficients to reduce overfitting in a process known as regularization,9 all with minimal human involvement. Despite the potential benefits of ML, traditional statistical approaches may perform as well as ML in some contexts,10,11 and these scenarios cannot be predicted in advance.
In our recent study, we evaluated whether an ML approach to predicting progression on active surveillance for prostate cancer would outperform a traditional statistical approach.12 To date, predictive models in this setting have been based on traditional statistical approaches.13-15 We compared 4 different ML algorithms, specifically artificial neural network, support vector machine, random forest, and logistic regression (yes, an ML approach can be used for logistic regression16), with a traditional statistical approach, specifically logistic regression with backward elimination for variable selection. We found that the highest performing model, based on the F1 score, was the support vector machine.
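To make the selection criterion concrete, the following is a minimal sketch of how an F1 score (the harmonic mean of precision and recall) can be computed and used to rank candidate models. The labels, predictions, and model names are hypothetical and are not data or results from the study; the function `f1_score` is written here for illustration rather than taken from the study's code.

```python
def f1_score(y_true, y_pred):
    """F1 score for the positive class (1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical outcomes (1 = progression, 0 = no progression); not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical predictions from two candidate models on the same patients.
predictions = {
    "model_a": [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
    "model_b": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
}

scores = {name: f1_score(y_true, y_pred) for name, y_pred in predictions.items()}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

Because F1 balances precision against recall, a model that makes no false-positive calls but misses half of the progressions (`model_b` here) can still rank below a model with a more even trade-off.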
Although we found that an ML approach improved model performance compared to a traditional statistical approach, the performance of the support vector machine was still insufficient for clinical use, with sensitivity and specificity of 72% and 68%, respectively.12
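As a reminder of what these two metrics measure, here is a short sketch that derives sensitivity and specificity from confusion-matrix counts. The counts are hypothetical, chosen only so that the output matches the reported 72% and 68%; they are not the study's actual confusion matrix.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true progressions that are detected
    specificity = tn / (tn + fp)  # share of non-progressions correctly ruled out
    return sensitivity, specificity


# Hypothetical counts per 100 progressors and 100 non-progressors; not study data.
sens, spec = sensitivity_specificity(tp=72, fn=28, tn=68, fp=32)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Under these illustrative counts, 28 of every 100 men who progress would be missed and 32 of every 100 who do not progress would be flagged, which helps explain why this level of performance falls short for clinical decision-making.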
ML is a method that can improve model performance, but achieving robust predictive performance requires adequate training data. Our training sample of 632 patients with 13 features was inadequate. Developing robust ML algorithms for clinical use will likely require multi-institutional collaboration, with sharing of abundant and informative clinical, radiologic, pathologic, and genomic data. To safeguard the privacy of healthcare data across institutions, federated learning, a form of ML that trains algorithms collaboratively without exchanging the data itself, has been proposed.17
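One common way to combine models trained at separate sites is federated averaging, in which each institution trains locally and shares only its model parameters, which are then averaged in proportion to each site's sample size. The sketch below illustrates that aggregation step only. It is an assumption about how such a collaboration might work, not a description of the approach in the cited reference, and the parameter values and site sizes are hypothetical.

```python
def federated_average(site_weights, site_sizes):
    """Sample-size-weighted average of model parameters across sites.

    site_weights: one parameter list per site, all of the same length.
    site_sizes:   number of local training patients at each site.
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(weights[i] * n for weights, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical parameters from three institutions after local training.
# Only these numbers leave each site; patient-level data stays local.
site_weights = [[0.2, -1.0], [0.4, -0.8], [0.6, -0.6]]
site_sizes = [100, 200, 100]

global_weights = federated_average(site_weights, site_sizes)
print(global_weights)
```

In a full system this averaging would be repeated over many rounds, with the combined parameters sent back to each site for further local training.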
ML models are being used with increasing frequency in healthcare research, and the number of FDA-approved ML algorithms is growing.18 Given the proliferation of ML in healthcare, our study also aims to introduce readers to some of the concepts related to ML, such as hyperparameter tuning, performance metrics, and the different models. An understanding of ML among clinicians will facilitate collaboration with computer scientists, which is essential if the benefits of AI are to be realized in patient care.
Written by: Madhur Nayan, MDCM, PhD, Department of Urology, Massachusetts General Hospital, Boston, MA, United States
References:
1. McCarthy, J., Minsky, M. L., Rochester, N. et al.: A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955. AI magazine, 27: 12, 2006
2. Copeland, M., 2016
3. Krizhevsky, A., Sutskever, I., Hinton, G. E.: Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25: 1097, 2012
4. Pham, H., Dai, Z., Xie, Q. et al.: Meta pseudo labels. Presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021
5. Yuan, Z., Yan, Y., Sonka, M. et al.: Robust deep auc maximization: A new surrogate loss and empirical studies on medical image classification. arXiv preprint arXiv:2012.03173, 2020
6. Nagpal, K., Foote, D., Liu, Y. et al.: Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer. NPJ digital medicine, 2: 1, 2019
7. Mao, Q., Jay, M., Hoffman, J. L. et al.: Multicentre validation of a sepsis prediction algorithm using only vital sign data in the emergency department, general ward and ICU. BMJ open, 8: e017833, 2018
8. Park, J. H., Cho, H. E., Kim, J. H. et al.: Machine learning prediction of incidence of Alzheimer’s disease using large-scale administrative health data. NPJ digital medicine, 3: 1, 2020
9. James, G., Witten, D., Hastie, T. et al.: An introduction to statistical learning: Springer, 2013
10. Kate, R. J., Perez, R. M., Mazumdar, D. et al.: Prediction and detection models for acute kidney injury in hospitalized older adults. BMC Med Inform Decis Mak, 16: 39, 2016
11. Thottakkara, P., Ozrazgat-Baslanti, T., Hupf, B. B. et al.: Application of Machine Learning Techniques to High-Dimensional Clinical Data to Forecast Postoperative Complications. PLoS One, 11: e0155705, 2016
12. Nayan, M., Salari, K., Bozzo, A. et al.: A machine learning approach to predict progression on active surveillance for prostate cancer. Urologic Oncology: Seminars and Original Investigations, 2021
13. Mamawala, M. M., Rao, K., Landis, P. et al.: Risk prediction tool for grade re‐classification in men with favourable‐risk prostate cancer on active surveillance. BJU international, 120: 25, 2017
14. Cooperberg, M. R., Brooks, J. D., Faino, A. V. et al.: Refined Analysis of Prostate-specific Antigen Kinetics to Predict Prostate Cancer Active Surveillance Outcomes. Eur Urol, 74: 211, 2018
15. Cooperberg, M. R., Zheng, Y., Faino, A. V. et al.: Tailoring intensity of active surveillance for low-risk prostate cancer based on individualized prediction of risk stability. JAMA oncology, 6: e203187, 2020
16. Uddin, S., Khan, A., Hossain, M. E. et al.: Comparing different supervised machine learning algorithms for disease prediction. BMC Medical Informatics and Decision Making, 19: 1, 2019
17. Rieke, N., Hancox, J., Li, W. et al.: The future of digital health with federated learning. NPJ digital medicine, 3: 1, 2020
18. Benjamens, S., Dhunnoo, P., Meskó, B.: The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ digital medicine, 3: 1, 2020