Review Article

Progress in Medical Physics 2019; 30(2): 39-48

Published online June 30, 2019

https://doi.org/10.14316/pmp.2019.30.2.39

Copyright © Korean Society of Medical Physics.

Deep-Learning-Based Molecular Imaging Biomarkers: Toward Data-Driven Theranostics

Hongyoon Choi

Department of Nuclear Medicine, Seoul National University Hospital, Seoul, Korea

Correspondence to: Hongyoon Choi (chy1000@gmail.com)
Tel: 82-2-2072-3347
Fax: 82-2-745-7690

Received: April 19, 2019; Revised: May 11, 2019; Accepted: May 11, 2019

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Deep learning has been applied to various medical data. In particular, current deep learning models exhibit remarkable performance at specific tasks, sometimes offering higher accuracy than that of experts for discriminating specific diseases from medical images. The current status of deep learning applications to molecular imaging can be divided into a few subtypes in terms of their purposes: differential diagnostic classification, enhancement of image acquisition, and image-based quantification. As functional and pathophysiologic information is key to molecular imaging, this review will emphasize the need for accurate biomarker acquisition by deep learning in molecular imaging. Furthermore, this review addresses practical issues that include clinical validation, data distribution, labeling issues, and harmonization to achieve clinically feasible deep learning models. Eventually, deep learning will enhance the role of theranostics, which aims at precision targeting of pathophysiology by maximizing molecular imaging functional information.

Keywords: Deep learning, Molecular imaging, Theranostics, Medical imaging, Imaging biomarker

Deep learning is rapidly being applied in the medical field. Recently, several deep learning-based medical devices and software products have been developed and have begun to be used in clinical practice.1) The major contribution of deep learning to medical data has been the objective evaluation of high-dimensional data and a remarkable reduction in laborious tasks such as segmentation and object detection in high-resolution images. Medical imaging is the major area of application because the deep learning boom started in computer vision, initiated by the ImageNet Challenge.2,3) The methods and neural network architectures developed for the ImageNet Challenge have been applied to medical images, including radiologic and pathologic exams, as well as to natural photographic images. These computer vision-based approaches have shown remarkable performance in differential diagnosis. For natural photographic images such as skin images and fundoscopy, deep learning techniques were adopted relatively easily because convolutional neural network (CNN) models developed for the ImageNet Challenge could be transferred directly to such images.4,5) Moreover, CNNs, which perform well in image classification and processing, have been applied to radiologic exams such as chest X-ray and mammography.6-8) Subsequently, CNN models have been used for image-based diagnosis as well as image processing.9) Applications of deep learning have extended from 2-dimensional radiologic exams to 3-dimensional images such as computed tomography (CT), positron emission tomography (PET), and magnetic resonance imaging (MRI). The purposes of clinical use have also expanded to include image-based differential diagnosis, segmentation, and image enhancement.
Because the features of molecular imaging, including PET and single-photon emission computed tomography (SPECT), differ substantially from those of natural images, there have been various concerns regarding the application of deep learning. Nonetheless, various deep learning techniques have shown feasible applications that enhance molecular imaging and address problems such as limited image resolution and sensitivity.10) In this review, current deep learning models for nuclear medicine and molecular imaging are summarized according to their clinical purposes. Practical issues of current deep learning are also introduced, to support the development of robust models and to guide them toward appropriate clinical use.

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Current deep learning models for molecular imaging have focused on three types of application: image-based diagnosis, enhancement of image reconstruction and image quality, and image-based quantification (Table 1).11-44)

Intuitively, one of the most important applications of deep learning in medicine is differential diagnosis. Because deep learning models generally require a large training dataset, several molecular imaging models have used PET or SPECT images that are routinely acquired in the clinical setting. One major application is differentiating disorders from normal status. Recently, a few deep CNN models for differential diagnosis using fluorodeoxyglucose (FDG) PET images have been suggested. For example, a deep learning model was developed to differentiate metastatic from benign mediastinal lymph nodes in lung cancer on FDG PET images.11) The deep CNN achieved a diagnostic accuracy of 86% for identifying metastatic lymph nodes, which was higher than that of conventional machine learning algorithms.11) Another CNN model, developed to classify the T stage of lung cancer, showed comparable results in identifying the pathologic T stage.12) The area under the receiver operating characteristic curve was 0.68 for differentiating advanced T-stage tumors in an independent test set. Deep CNN models have also been developed for the differential diagnosis of brain disorders using brain SPECT or PET images. Because dopamine transporter imaging is interpreted by expert reading as a binary classification problem, it was a good candidate for deep CNN application. A 3-dimensional CNN model showed high accuracy in differentiating 123I-FP-CIT SPECT images of patients with Parkinson's disease from those of controls.19) Because accurate image-based diagnosis and the prediction of future cognitive decline in patients with Alzheimer's disease (AD) and mild cognitive impairment (MCI) are clinically important issues, several deep learning models using MRI and PET have been suggested.
Among the first studies applying deep learning to medical images was representation learning on PET and MRI images for diagnosing AD.17,18) Although these pioneering studies did not use CNNs, now regarded as the de facto standard, their models extracted discriminative features automatically and classified brain images of AD more accurately than conventional algorithms. More recently developed models use deep CNNs to differentiate AD from controls and have shown high accuracy.13,14)
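
The two figures of merit quoted above, diagnostic accuracy and the area under the receiver operating characteristic curve (AUC), can be illustrated with a minimal sketch. The scores and labels below are hypothetical and are not taken from any of the cited studies; the function names `roc_auc` and `accuracy` are introduced here only for illustration.

```python
def roc_auc(scores, labels):
    """AUC computed as the Mann-Whitney statistic.

    It equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of cases classified correctly at a fixed decision threshold."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


# Hypothetical model outputs (probability of disease) and ground-truth labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2, 0.7, 0.1]
labels = [1, 1, 1, 0, 0, 0, 1, 0]

print(f"AUC      = {roc_auc(scores, labels):.3f}")
print(f"accuracy = {accuracy(scores, labels):.3f}")
```

Accuracy depends on the chosen threshold and on the class balance of the test set, whereas the AUC summarizes ranking quality over all thresholds, which is one reason the two numbers reported for different studies are not directly comparable.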

Another important application is the enhancement of image reconstruction and image quality. For example, CNN models were incorporated into an iterative reconstruction framework and performed better than conventional denoising algorithms.27) As a more general approach, deep learning was used to learn the inverse mapping from sensor-encoded signals, such as those of MRI and PET, to images, which resulted in a fully automated and flexible reconstruction framework.28) Furthermore, attenuation correction, a crucial step in PET image reconstruction, has been aided by deep learning-based attenuation maps. While the CT component of PET/CT scanners provides attenuation information, PET/MR requires synthetic CT attenuation maps. Because attenuation maps are difficult to estimate without CT, various issues regarding PET quantification have arisen.45,46) Recently suggested deep learning-based CT image synthesis from MR or PET images is a promising solution to the quantification issues caused by attenuation correction.30-34) Additionally, deep learning has been used to enhance the quality of low-dose PET images.35-37) Combining image reconstruction algorithms for low-dose radiotracers with PET- or MR-based attenuation correction could dramatically reduce radiation exposure in the future. Such ultra-low-dose PET may be used for new clinical purposes, including disease screening, for which a net benefit has been difficult to achieve because of radiation hazards.

As molecular imaging provides quantitative values related to pathophysiology, studies have focused on applying deep learning to obtain accurate quantification. The most common application of deep learning to medical images is segmentation.9) Segmentation methods are usually based on anatomical images such as CT and MRI. Because recent clinical molecular imaging modalities provide fusion images such as PET/CT, PET/MR, and SPECT/CT, deep learning-based segmentation can be used to calculate quantitative values such as radiotracer accumulation in a specific tissue delineated on anatomical imaging.39,47) Quantification can also be improved by generative models such as generative adversarial networks (GANs). For example, pseudo-MR images were generated from AV-45 PET using a GAN to quantify cortical radiotracer uptake without structural MR acquisition.43)
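
As a rough sketch of how a segmentation mask turns into a quantitative value, the example below averages PET activity inside a binary mask and converts it to a body-weight standardized uptake value (SUV). All numbers are hypothetical, the voxel lists stand in for co-registered 3-dimensional volumes, and the helper names are illustrative rather than taken from the cited work. The sketch assumes decay-corrected activity and a tissue density of 1 g/mL.

```python
def suv_body_weight(activity_bq_per_ml, injected_dose_bq, body_weight_g):
    """Body-weight SUV: tissue activity / (injected dose / body weight).

    Assumes activity and dose are decay-corrected to the same time point
    and that tissue density is 1 g/mL.
    """
    return activity_bq_per_ml * body_weight_g / injected_dose_bq


def roi_uptake(pet_voxels, mask_voxels):
    """Mean and maximum PET activity over voxels where the mask is nonzero."""
    inside = [v for v, m in zip(pet_voxels, mask_voxels) if m]
    if not inside:
        raise ValueError("segmentation mask is empty")
    return sum(inside) / len(inside), max(inside)


# Hypothetical flattened PET activity values (Bq/mL) and a binary organ mask
# of the kind a CT- or MR-based segmentation network might produce.
pet = [5000.0, 21000.0, 26000.0, 4000.0, 24000.0, 6000.0]
mask = [0, 1, 1, 0, 1, 0]

injected_dose_bq = 370e6  # 370 MBq, hypothetical
body_weight_g = 70_000    # 70 kg, hypothetical

mean_activity, max_activity = roi_uptake(pet, mask)
suv_mean = suv_body_weight(mean_activity, injected_dose_bq, body_weight_g)
suv_max = suv_body_weight(max_activity, injected_dose_bq, body_weight_g)
print(f"SUVmean = {suv_mean:.2f}, SUVmax = {suv_max:.2f}")
```

In this arrangement the quality of the quantitative value depends directly on the mask, which is why segmentation accuracy and image registration matter for image-based biomarkers.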

1. Necessity of deep learning-based biomarkers

Even though various deep learning techniques have been applied to molecular imaging for differential diagnosis, image enhancement, and accurate quantification, many issues must be solved before clinical use. One gap between deep learning for natural image recognition and for medical images, particularly molecular imaging, lies in the purpose of imaging. While an image recognition task has simple labels, clinicians often require various types of information from medical images, including the prediction of prognosis and treatment response as well as differential diagnosis.10) In a narrow sense, differential diagnosis is similar to labeling natural images; however, many diagnostic classifications are not simple. Because many disorders form a spectrum ranging from healthy to full-blown disease, the ground-truth labels widely used in deep learning training are ambiguous in medical images. Furthermore, the gold standard for diagnostic classification varies with disease type and clinical situation.48) Thus, on closer consideration, the eventual purpose of applying deep learning to medicine is not simple diagnosis but a critical role in clinical decisions.49) As molecular imaging intrinsically provides molecular and pathophysiologic properties in a noninvasive manner, deep learning algorithms should place more emphasis on acquiring objective quantitative values that can predict future outcome and treatment response. Instead of pursuing state-of-the-art classification accuracy, we should find appropriate clinical applications for the output of deep learning.
For example, a deep learning model was developed to discriminate patients with Alzheimer's disease from normal aged subjects; however, the importance of this model lay in its transfer to MCI subjects, to identify those who would rapidly progress to full-blown dementia.13) The output of the CNN model represents the probability of Alzheimer's disease in a cohort consisting of patients with Alzheimer's disease and normal subjects. As this output was estimated from patterns of FDG uptake and amyloid deposition in the brain, these patterns could serve as a predictive biomarker for the outcome of MCI subjects (Fig. 1).

2. Data distribution and validation

Even though many deep learning models show remarkable performance on classification problems, such as discriminating fundoscopy images or brain PET images, most have not been validated in real-world clinical settings. The question is how performance should be evaluated when a proposed model is to be used in the clinic. To address this validation issue, deep learning models should be tested on a test set that is independent of the training and internal validation data. The most commonly used method is application to datasets obtained from different centers.50) However, even models that are validated on an external dataset and perform well in diagnostic classification or outcome prediction can hardly guarantee the same performance in a heterogeneous clinical environment. This is because the cohorts used to develop deep learning models differ from those of clinical trials, in which subjects are recruited according to specific criteria defined for a clinical setting.51) The underlying problem is that patients in the clinical setting are highly heterogeneous and clinical decisions must be made in various situations. For example, most deep learning models were developed with a training cohort consisting of patients with a particular disorder and healthy controls. Training cohorts, and even validation cohorts, usually include similar numbers of patients and controls. In the clinical situation, however, differential diagnosis and clinical decisions are based on the patient's symptoms and signs rather than on simple classification. Other disorders, including a few rare ones, can resemble the disease targeted by a deep learning model. The ratio of diseased to healthy subjects can also differ considerably from that of the training cohort.
The problem of data distribution becomes greater when a deep learning model is used for disease screening in the general population (Fig. 2). This is why deep learning models should be subjected to clinical trials despite their high accuracy, and why appropriate use criteria are needed so that the models are used clinically only in defined situations.
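
The effect of a changed disease ratio can be made concrete with Bayes' theorem. The sketch below is only an illustration with assumed numbers (a model with 90% sensitivity and 90% specificity, values not taken from any cited study): the same model that looks reliable in a balanced development cohort yields mostly false positives when the disease is rare.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical operating point of a classifier, held fixed across settings.
SENSITIVITY = 0.9
SPECIFICITY = 0.9

# Balanced development cohort, symptomatic clinic population, screening.
for prevalence in (0.5, 0.1, 0.01):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:>5.2f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Sensitivity and specificity are held constant here for simplicity; in practice they may also shift when the case mix changes, which is a further argument for validation in the intended clinical setting.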

3. Uncertainty and unseen data

The issues regarding data distribution and ‘unseen data’ in training cohorts can be extended to uncertainty. Under the current approach of supervised learning from big data and labels, deep learning-based diagnosis and outcome prediction carry diagnostic uncertainty because of unseen and rare cases. Furthermore, clinical decisions are made not by choosing the most probable diagnosis but by excluding critical, life-threatening diagnoses. Lowering the uncertainty about a fatal disease is one of the most important functions of diagnostic testing and one of the most important elements of clinical decision making to be achieved through biomarkers.52) Thus, deep learning models should report the uncertainty of their decisions so that it can be determined whether a subject needs additional diagnostic tests. Bayesian approximation with deep learning for uncertainty measurement is a good example for supervised learning models.53) Another way to bypass the issue of uncertainty and unseen data, particularly rare disorders, is to employ unsupervised learning for anomaly detection. As deep learning is representation learning, the latent features of imaging data follow a distribution determined by the training dataset. Once the distribution of latent features in the training data is defined, unseen data can be identified by their position relative to that distribution in the latent space.54,55) Because conditional generative models such as conditional GANs or variational autoencoders synthesize virtual data for specific conditions, they can be used to define the population distribution under those conditions. For example, by training a generative model of normal aging changes in brain metabolism, a pseudo-population distribution of brain metabolism at each age can be generated.56) This generated population distribution can be used to find abnormal patterns in a given brain image while taking age into consideration.
This type of anomaly detection can bypass the issues that heterogeneous disorders pose for deep learning models.
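
The idea of flagging unseen data by their position in latent space can be sketched as follows. This is a deliberately simplified stand-in, not the method of the cited studies: the latent vectors are hypothetical 2-dimensional values rather than the output of a trained encoder, the training distribution is summarized by per-dimension means and standard deviations (a diagonal-covariance assumption), and the threshold rule is an arbitrary choice for illustration.

```python
import math
import statistics


def fit_latent_distribution(latents):
    """Per-dimension mean and standard deviation of training latent vectors."""
    dims = list(zip(*latents))
    means = [statistics.mean(d) for d in dims]
    stds = [statistics.stdev(d) for d in dims]
    return means, stds


def anomaly_score(z, means, stds):
    """Standardized Euclidean distance of z from the training distribution."""
    return math.sqrt(sum(((v - m) / s) ** 2 for v, m, s in zip(z, means, stds)))


# Hypothetical latent features of "normal" training images.
train_latents = [
    (0.1, 1.0), (-0.2, 1.2), (0.0, 0.9),
    (0.3, 1.1), (-0.1, 0.8), (0.2, 1.0),
]
means, stds = fit_latent_distribution(train_latents)

# Illustrative rule: flag anything farther out than the farthest training case.
threshold = max(anomaly_score(z, means, stds) for z in train_latents)

for name, z in [("typical case", (0.05, 1.05)), ("unseen pattern", (1.5, 2.5))]:
    score = anomaly_score(z, means, stds)
    flag = "anomalous" if score > threshold else "in distribution"
    print(f"{name}: score = {score:.2f} ({flag})")
```

A flagged case would not receive a diagnosis from this step; it would only be marked as lying outside what the model has seen, so that it can be referred for expert review or additional tests.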

4. Labeling of data: leveraging unlabeled data

Unsupervised learning is an important approach to solving practical issues in the labeling of imaging data. Labeling image data, particularly medical images, is expensive as well as time-consuming. It requires experts to interpret the images or to determine the clinical diagnosis. To obtain a ‘gold standard’ diagnosis, many cases require clinical follow-up, which involves a complex professional review of medical records. Ethical issues regarding the acquisition of large datasets and their labels are also inevitable. The limited availability of labeled data and the difficulty of labeling at a large scale are major obstacles to deep learning applications. In addition, large labeled datasets are even more difficult to obtain in nuclear medicine and molecular imaging because various imaging techniques are used for different clinical purposes.

One way to overcome this labeling issue lies in a property of medical imaging data: heterogeneous image data obtained in clinical routine are relatively easy to collect. Representative features can be obtained from these routine data by unsupervised learning. These features can be visualized by dimension reduction methods to intuitively identify patterns in large imaging datasets. Furthermore, features obtained by unsupervised learning can be transferred to relatively small datasets that contain both labels and images. This transfer learning can produce a robust deep learning model even when the well-labeled dataset is relatively small (Fig. 3).57,58) The flexible application of unsupervised learning and transfer learning can be extended to semi-supervised learning. As mentioned above, a database of clinically routine data can be obtained relatively easily, and a small portion of the large unlabeled dataset can be labeled with the clinical outcome or diagnosis. Despite the small number of labeled samples, various deep learning approaches employ unlabeled data to find discriminative representations for them.59,60) For example, one study aimed to predict FDG uptake measured by PET from gene expression data in lung cancer, although only a small number of subjects had both PET and gene expression data. By employing a larger gene expression dataset without PET data, a prediction model of FDG uptake could be developed.61) As many clinical datasets fall into the situation of ‘large unlabeled data and small labeled data’, deep learning models that improve performance through unsupervised learning and unlabeled data will be widely used in future molecular imaging and medical data research.
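
The "large unlabeled data, small labeled data" workflow can be sketched in miniature. In the example below, a single principal axis learned by power iteration stands in for a deep unsupervised feature extractor such as an autoencoder, and a nearest-centroid rule stands in for the supervised model; both substitutions, the synthetic data, and all names are assumptions made only to keep the sketch short and self-contained.

```python
import math
import random

rng = random.Random(0)
DIRECTION = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]  # hidden informative axis


def sample(label):
    """Synthetic 3-D 'image feature' whose class signal lies along DIRECTION."""
    t = (2.0 if label == 1 else -2.0) + rng.gauss(0, 0.5)
    return [t * d + rng.gauss(0, 0.1) for d in DIRECTION]


def first_principal_axis(rows, iterations=50):
    """Unsupervised step: leading covariance eigenvector by power iteration."""
    n, dim = len(rows), len(rows[0])
    mu = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mu, v


def encode(row, mu, axis):
    """Project a sample onto the learned axis (the transferred feature)."""
    return sum((x - m) * a for x, m, a in zip(row, mu, axis))


# 1) Learn the representation from many unlabeled samples.
unlabeled = [sample(rng.randint(0, 1)) for _ in range(200)]
mu, axis = first_principal_axis(unlabeled)

# 2) Fit a simple classifier on only three labeled samples per class.
labeled = [(sample(y), y) for y in (0, 1) for _ in range(3)]
centroids = {
    y: sum(encode(x, mu, axis) for x, lab in labeled if lab == y) / 3
    for y in (0, 1)
}


def predict(row):
    z = encode(row, mu, axis)
    return min(centroids, key=lambda y: abs(z - centroids[y]))


# 3) Evaluate on new labeled samples.
test_set = [(sample(y), y) for y in (0, 1) for _ in range(50)]
test_accuracy = sum(predict(x) == y for x, y in test_set) / len(test_set)
print(f"test accuracy with 6 labeled samples: {test_accuracy:.2f}")
```

The design point is that the expensive part, finding a useful representation, uses no labels at all, so the few labeled cases are spent only on the final decision rule.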

Another feasible way to overcome the labeling issue is to employ unstructured data that accompany the imaging data. For example, clinical imaging data come with text reports that contain human interpretations in natural language. Even though these reports are mostly unstructured, they hold a great deal of label information, including differential diagnoses, abnormal findings, and disease locations. Mining the semantic interactions between medical images and texts will be a feasible approach to developing deep learning models based on real-world clinical data.62) As self-supervised learning of image representations using semantic context is already used for natural images, representations of medical images could likewise be trained with representations of text reports.63) Learning representations of imaging data and finding their clinical significance can be a data-driven approach to developing biomarkers without a priori knowledge. Self-supervised learning will be one of the future directions of this data-driven approach and can be achieved by using text reports or intrinsic information, such as age and gender, matched with the image data.

5. Data harmonization

One overlooked practical issue is data harmonization. Molecular imaging routinely used in the clinical setting comes in various types. Numerous tracers are used to obtain imaging data according to the clinical purpose. Furthermore, image acquisition protocols vary among centers, which may reduce the accuracy of deep learning models intended for generalized multicenter application. Different image textures related to detector types and image reconstruction algorithms can affect the performance of deep learning. Moreover, because tracer distribution has temporal dynamics, image acquisition at different time points may influence deep learning-based biomarkers. Recently, deep learning has been used to analyze the kinetics of dynamic imaging data;64) however, most imaging data routinely obtained in the clinic are static images, which require harmonization across centers. Different tracers aimed at the same molecular target also cause a harmonization problem. For example, several radiotracers are available for imaging brain amyloid deposits, e.g., 11C-PIB, 18F-Florbetapir, 18F-Florbetaben, and 18F-Flutemetamol. PET images obtained with these tracers show similar findings but different quantification results.65,66) While differences in classical amyloid quantification can be overcome by linear correction, developing deep learning models on heterogeneous image data from these different tracers remains challenging.
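
The linear correction mentioned for classical quantification can be sketched as an ordinary least-squares fit between paired measurements. The paired uptake ratios below are hypothetical, as are the names `fit_linear_map` and `harmonize`; the sketch assumes that the same subjects were scanned with both a reference tracer and a second tracer.

```python
def fit_linear_map(x, y):
    """Least-squares slope and intercept such that y is approximately slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical cortical uptake ratios measured in the same subjects.
tracer_b = [1.0, 1.5, 2.0, 2.5]       # tracer to be harmonized
reference = [1.1, 1.4, 1.9, 2.2]      # reference tracer

slope, intercept = fit_linear_map(tracer_b, reference)


def harmonize(value_b):
    """Express a tracer-B measurement on the reference tracer's scale."""
    return slope * value_b + intercept


print(f"reference-equivalent = {slope:.2f} * tracer_B + {intercept:.2f}")
print(f"tracer B value 2.0 maps to {harmonize(2.0):.2f}")
```

A two-parameter map of this kind works for a single summary value per scan; it does not carry over to voxel-level image inputs, where tracer-specific texture and nonspecific binding differ, which is the difficulty for deep learning models noted above.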

In this review, current deep learning models developed for molecular imaging have been briefly introduced in terms of their purposes. As molecular imaging carries information on molecular changes related to pathophysiology, accurate and objective quantification is a critical step toward clinical use. This quantitative information is linked to clinical decisions and outcome prediction as well as differential diagnosis. Thus, instead of simple diagnostic classification, we should focus on discovering biomarkers by using deep learning to extract the functional information of molecular imaging. This information can contribute to theranostic approaches, which combine diagnostics and therapeutics that use the same molecular targets. Deep learning models will summarize the status of patients as quantitative values. The models should be clinically validated in real clinical situations with unbiased data rather than with limited datasets. Clinically validated molecular imaging-based biomarkers can be used to monitor disease status in terms of functional information. By predicting the outcome of individual patients from imaging data, therapeutic plans, including dose and schedule as well as treatment methods, can be personalized. To facilitate clinically feasible deep learning models, leveraging unlabeled data and unsupervised learning is promising. This approach can considerably untangle the issues induced by the supervised learning employed by most deep learning models for imaging data, including heterogeneous data distribution, unseen data, and uncertainty of decisions. Furthermore, unsupervised learning followed by transfer learning can yield various types of deep learning models from relatively small samples.
Because of the distinctiveness of the medical field and the various purposes of molecular imaging, deep learning models must be developed to meet particular clinical goals; the result will be objective biomarkers that play an important role in clinical decision making.

Table 1. Types of current deep learning applications for nuclear medicine and molecular imaging

| Types of applications | Examples | References |
|---|---|---|
| Image-based diagnosis | Cancer staging (T- and N-staging) | 11,12 |
| | Diagnosis of Alzheimer’s disease using PET and/or MRI | 13-18 |
| | Diagnosis of Parkinson’s disease using dopamine transporter imaging | 19-21 |
| | Prediction of coronary heart disease | 22-24 |
| Enhancement of image reconstruction and image quality | Image reconstruction | 25-29 |
| | Attenuation correction | 30-34 |
| | Recovery of low-dose PET images | 35-37 |
| Image-based quantification | Segmentation | 38-42 |
| | Image generation for quantification | 43,44 |

  1. Ravi D, Wong C, Deligianni F, Berthelot M, Andreu-Perez J, and Lo B, et al. Deep learning for health informatics. IEEE J Biomed Health Inform 2017;21:4-21.
  2. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, and Ma S, et al. ImageNet large scale visual recognition challenge. Int J Comput Vis 2015;115:211-252.
  3. Krizhevsky A, Sutskever I, and Hinton GE. ImageNet classification with deep convolutional neural networks, Paper presented at: 25th International Conference on Neural Information Processing Systems, 2012 Dec 3-6, Lake Tahoe, USA. p. 1097-1105.
  4. Weber GM, Mandl KD, and Kohane IS. Finding the missing link for big biomedical data. JAMA 2014;311:2479-2480.
  5. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, and Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017;542:115-118.
  6. Rajpurkar P, Irvin J, Ball RL, Zhu K, Yang B, and Mehta H, et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med 2018;15.
  7. Dhungel N, Carneiro G, and Bradley AP. Automated mass detection in mammograms using cascaded deep learning and random forests, Paper presented at: 2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2015 Nov 23-25, Adelaide, Australia.
  8. Lakhani P, and Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 2017;284:574-582.
  9. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, and Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60-88.
  10. Choi H. Deep learning in nuclear medicine and molecular imaging: current perspectives and future directions. Nucl Med Mol Imaging 2018;52:109-118.
  11. Wang H, Zhou Z, Li Y, Chen Z, Lu P, and Wang W, et al. Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from 18F-FDG PET/CT images. EJNMMI Res 2017;7:11.
  12. Kirienko M, Sollini M, Silvestri G, Mognetti S, Voulaz E, and Antunovic L, et al. Convolutional neural networks promising in lung cancer T-parameter assessment on baseline FDG-PET/CT. Contrast Media Mol Imaging 2018;2018:1382309.
  13. Choi H, and Jin KH; Alzheimer's Disease Neuroimaging Initiative. Predicting cognitive decline with deep learning of brain metabolism and amyloid imaging. Behav Brain Res 2018;344:103-109.
  14. Ding Y, Sohn JH, Kawczynski MG, Trivedi H, Harnish R, and Jenkins NW, et al. A deep learning model to predict a diagnosis of Alzheimer disease by using 18F-FDG PET of the brain. Radiology 2019;290:456-464.
  15. Liu M, Cheng D, and Yan W, Alzheimer's Disease Neuroimaging Initiative. Classification of Alzheimer's disease by combination of convolutional and recurrent neural networks using FDG-PET images. Front Neuroinform 2018;12:35.
  16. Liu S, Liu S, Cai W, Che H, Pujol S, and Kikinis R, et al, ADNI. Multimodal neuroimaging feature learning for multiclass diagnosis of Alzheimer's disease. IEEE Trans Biomed Eng 2015;62:1132-1140.
  17. Suk HI, Lee SW, and Shen D, Alzheimer's Disease Neuroimaging Initiative. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. Neuroimage 2014;101:569-582.
  18. Li F, Tran L, Thung KH, Ji S, Shen D, and Li J. A robust deep model for improved classification of AD/MCI patients. IEEE J Biomed Health Inform 2015;19:1610-1616.
  19. Choi H, Ha S, Im HJ, Paek SH, and Lee DS. Refining diagnosis of Parkinson's disease with deep learning-based interpretation of dopamine transporter imaging. Neuroimage Clin 2017;16:586-594.
  20. Martinez-Murcia FJ, Górriz JM, Ramírez J, and Ortiz A. Convolutional neural networks for neuroimaging in Parkinson's disease: is preprocessing needed? Int J Neural Syst 2018;28:1850035.
  21. Kim DH, Wit H, and Thurston M. Artificial intelligence in the diagnosis of Parkinson's disease from ioflupane-123 single-photon emission computed tomography dopamine transporter scans using transfer learning. Nucl Med Commun 2018;39:887-893.
  22. Betancur J, Hu LH, Commandeur F, Sharir T, Einstein AJ, and Fish MB, et al. Deep learning analysis of upright-supine high-efficiency SPECT myocardial perfusion imaging for prediction of obstructive coronary artery disease: a multicenter study. J Nucl Med 2019;60:664-670.
  23. Xu C, Xu L, Gao Z, Zhao S, Zhang H, and Zhang Y, et al. Direct detection of pixel-level myocardial infarction areas via a deep-learning algorithm, Paper presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention 2017, 2017 Sep 11-13, Quebec, Canada. p. 240-249.
  24. Betancur J, Commandeur F, Motlagh M, Sharir T, Einstein AJ, and Bokhari S, et al. Deep learning for prediction of obstructive disease from fast myocardial perfusion SPECT: a multicenter study. JACC Cardiovasc Imaging 2018;11:1654-1663.
  25. Kim K, Wu D, Gong K, Dutta J, Kim JH, and Son YD, et al. Penalized PET reconstruction using deep learning prior and local linear fitting. IEEE Trans Med Imaging 2018;37:1478-1487.
  26. Gong K, Catana C, Qi J, and Li Q. PET image reconstruction using deep image prior. IEEE Trans Med Imaging 2019;38:1655-1665.
  27. Gong K, Guan J, Kim K, Zhang X, Yang J, and Seo Y, et al. Iterative PET image reconstruction using convolutional neural network representation. IEEE Trans Med Imaging 2019;38:675-685.
  28. Zhu B, Liu JZ, Cauley SF, Rosen BR, and Rosen MS. Image reconstruction by domain-transform manifold learning. Nature 2018;555:487-492.
  29. Pfaehler E, De Jong JR, Dierckx RAJO, van Velden FHP, and Boellaard R. SMART (SiMulAtion and ReconsTruction) PET: an efficient PET simulation-reconstruction tool. EJNMMI Phys 2018;5:16.
  30. Hwang D, Kang SK, Kim KY, Seo S, Paeng JC, and Lee DS, et al. Generation of PET attenuation map for whole-body time-of-flight 18F-FDG PET/MRI using a deep neural network trained with simultaneously reconstructed activity and attenuation maps. J Nucl Med 2019;60:1183-1189.
  31. Han X. MR-based synthetic CT generation using a deep convolutional neural network method. Med Phys 2017;44:1408-1419.
  32. Liu F, Jang H, Kijowski R, Bradshaw T, and McMillan AB. Deep learning MR imaging-based attenuation correction for PET/MR imaging. Radiology 2018;286:676-684.
  33. Leynes AP, Yang J, Wiesinger F, Kaushik SS, Shanbhag DD, and Seo Y, et al. Zero-echo-time and Dixon deep pseudo-CT (ZeDD CT): direct generation of pseudo-CT images for pelvic PET/MRI attenuation correction using deep convolutional neural networks with multiparametric MRI. J Nucl Med 2018;59:852-858.
  34. Hwang D, Kim KY, Kang SK, Seo S, Paeng JC, and Lee DS, et al. Improving the accuracy of simultaneously reconstructed activity and attenuation maps using deep learning. J Nucl Med 2018;59:1624-1629.
  35. Xiang L, Qiao Y, Nie D, An L, Wang Q, and Shen D. Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI. Neurocomputing 2017;267:406-416.
  36. Chen KT, Gong E, de Carvalho Macruz FB, Xu J, Boumis A, and Khalighi M, et al. Ultra-low-dose 18F-Florbetaben amyloid PET imaging using deep learning with multi-contrast MRI inputs. Radiology 2019;290:649-656.
  37. Wang Y, Yu B, Wang L, Zu C, Lalush DS, and Lin W, et al. 3D conditional generative adversarial networks for high-quality PET image estimation at low dose. Neuroimage 2018;174:550-562.
  38. Wang T, Lei Y, Tang H, He Z, Castillo R, and Wang C, et al. A learning-based automatic segmentation and quantification method on left ventricle in gated myocardial perfusion SPECT imaging: a feasibility study. J Nucl Cardiol 2019 doi: 10.1007/s12350-019-01594-2.
  39. Lindgren Belal S, Sadik M, Kaboteh R, Enqvist O, Ulén J, and Poulsen MH, et al. Deep learning for segmentation of 49 selected bones in CT scans: first step in automated PET/CT-based 3D quantification of skeletal metastases. Eur J Radiol 2019;113:89-95.
  40. Chen L, Shen C, Zhou Z, Maquilan G, Albuquerque K, and Folkert MR, et al. Automatic PET cervical tumor segmentation by combining deep learning and anatomic prior. Phys Med Biol 2019;64:085019.
  41. Zhong Z, Kim Y, Plichta K, Allen BG, Zhou L, and Buatti J, et al. Simultaneous cosegmentation of tumors in PET-CT images using deep fully convolutional networks. Med Phys 2019;46:619-633.
  42. Huang B, Chen Z, Wu PM, Ye Y, Feng ST, and Wong CO, et al. Fully automated delineation of gross tumor volume for head and neck cancer on PET-CT using deep learning: a dual-center study. Contrast Media Mol Imaging 2018;2018:8923028.
  43. Choi H, and Lee DS, Alzheimer's Disease Neuroimaging Initiative. Generation of structural MR images from Amyloid PET: application to MR-less quantification. J Nucl Med 2018;59:1111-1117.
  44. Kang SK, Seo S, Shin SA, Byun MS, Lee DY, and Kim YK, et al. Adaptive template generation for amyloid PET using a deep learning approach. Hum Brain Mapp 2018;39:3769-3778.
  45. Samarin A, Burger C, Wollenweber SD, Crook DW, Burger IA, and Schmid DT, et al. PET/MR imaging of bone lesions--implications for PET quantification from imperfect attenuation correction. Eur J Nucl Med Mol Imaging 2012;39:1154-1160.
  46. Choi H, Cheon GJ, Kim HJ, Choi SH, Lee JS, and Kim YI, et al. Segmentation-based MR attenuation correction including bones also affects quantitation in brain studies: an initial result of 18F-FP-CIT PET/MR for patients with parkinsonism. J Nucl Med 2014;55:1617-1622.
  47. Park J, Bae S, Seo S, Park S, Bang JI, and Han JH, et al. Measurement of glomerular filtration rate using quantitative SPECT/CT and deep-learning-based kidney segmentation. Sci Rep 2019;9:4223.
  48. Beam AL, and Kohane IS. Translating artificial intelligence into clinical care. JAMA 2016;316:2368-2369.
  49. He J, Baxter SL, Xu J, Xu J, Zhou X, and Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med 2019;25:30-36.
  50. Saria S, Butte A, and Sheikh A. Better medicine through machine learning: what's real, and what's artificial? PLoS Med 2018;15.
  51. Park SH, and Han K. Methodologic guide for evaluating clinical performance and effect of artificial intelligence technology for medical diagnosis and prediction. Radiology 2018;286:800-809.
  52. Redelmeier DA, and Shafir E. Medical decision making in situations that offer multiple alternatives. JAMA 1995;273:302-305.
  53. Gal Y, and Ghahramani Z. Dropout as a Bayesian approximation: representing model uncertainty in deep learning, Paper presented at: 33rd International Conference on Machine Learning, 2016 Jun 19-24, New York, USA.
  54. Wei Q, Ren Y, Hou R, Shi B, Lo JY, and Carin L. Anomaly detection for medical images based on a one-class classification, Paper presented at: SPIE Medical Imaging 2018: Computer-Aided Diagnosis, 2018 Feb 10-15, Houston, USA.
  55. Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, and Langs G. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery.
  56. Choi H, Kang H, and Lee DS, Alzheimer's Disease Neuroimaging Initiative. Predicting aging of brain metabolic topography using variational autoencoder. Front Aging Neurosci 2018;10:212.
  57. Le QV, Ranzato MA, Monga R, Devin M, Chen K, and Corrado GS, et al. Building high-level features using large scale unsupervised learning. arXiv 2011 1112.6209 [Preprint] [cited 2019 Mar 1].
  58. Bengio Y. Deep learning of representations for unsupervised and transfer learning, Paper presented at: ICML Workshop on Unsupervised and Transfer Learning 2012, 2012 Jun 26-Jul 1, Edinburgh, UK. p. 17-37.
  59. Rasmus A, Valpola H, Honkala M, Berglund M, and Raiko T. Semi-supervised learning with Ladder networks, Paper presented at: 28th International Conference on Neural Information Processing Systems, 2015 Dec 7-12, Montréal, Canada.
  60. Odena A. Semi-supervised learning with generative adversarial networks. arXiv 2016 1606.01583 [Preprint] [cited 2019 Mar 1].
  61. Choi H, and Na KJ. Integrative analysis of imaging and transcriptomic data of the immune landscape associated with tumor metabolism in lung adenocarcinoma: clinical and prognostic implications. Theranostics 2018;8:1956-1965.
  62. Shin HC, Lu L, Kim L, Seff A, Yao J, and Summers RM. Interleaved text/image deep mining on a large-scale radiology database for automated image interpretation, Paper presented at: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015 Jun 7-12, Boston, USA.
  63. Gomez L, Patel Y, Rusiñol M, Karatzas D, and Jawahar CV. Self-supervised learning of visual features through embedding images into text topic spaces, Paper presented at: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 Jul 21-26, Honolulu, USA.
  64. Pan L, Cheng C, Haberkorn U, and Dimitrakopoulou-Strauss A. Machine learning-based kinetic modeling: a robust and reproducible solution for quantitative analysis of dynamic PET data. Phys Med Biol 2017;62:3566-3581.
  65. Landau SM, Breault C, Joshi AD, Pontecorvo M, Mathis CA, and Jagust WJ, et al, Alzheimer's Disease Neuroimaging Initiative. Amyloid-β imaging with Pittsburgh compound B and florbetapir: comparing radiotracers and quantification methods. J Nucl Med 2013;54:70-77.
  66. Klunk WE, Koeppe RA, Price JC, Benzinger TL, Devous MD Sr, and Jagust WJ, et al. The Centiloid project: standardizing quantitative amyloid plaque estimation by PET. Alzheimers Dement 2015;11:1-15.

Abstract

Deep learning has been applied to various medical data. In particular, current deep learning models exhibit remarkable performance at specific tasks, sometimes offering higher accuracy than that of experts for discriminating specific diseases from medical images. The current status of deep learning applications to molecular imaging can be divided into a few subtypes in terms of their purposes: differential diagnostic classification, enhancement of image acquisition, and image-based quantification. As functional and pathophysiologic information is key to molecular imaging, this review will emphasize the need for accurate biomarker acquisition by deep learning in molecular imaging. Furthermore, this review addresses practical issues that include clinical validation, data distribution, labeling issues, and harmonization to achieve clinically feasible deep learning models. Eventually, deep learning will enhance the role of theranostics, which aims at precision targeting of pathophysiology by maximizing molecular imaging functional information.

Keywords: Deep learning, Molecular imaging, Theranostics, Medical imaging, Imaging biomarker

Introduction

Deep learning is rapidly being adopted in the medical field. Recently, several deep learning-based medical devices and software products have been developed and introduced into clinical practice.1) The major contribution of deep learning to medical data has been to evaluate high-dimensional medical data objectively and to markedly reduce laborious work such as segmentation and object detection in high-resolution images. Medical imaging is the major area of medical application because the boom in deep learning began in the computer vision field, initiated by the ImageNet Challenge.2,3) The methods and neural network architectures developed for the ImageNet Challenge have been applied to medical images, including radiologic and pathologic exams, as well as to natural photographic images. These computer vision-based approaches have shown remarkable performance in differential diagnosis. For images that resemble natural photographs, such as skin images and fundoscopy, deep learning techniques were adopted relatively easily because convolutional neural network (CNN) models developed for the ImageNet Challenge could be transferred directly to such images.4,5) Moreover, CNNs, which perform well in image classification and processing, have been applied to radiologic exams such as chest X-ray and mammography.6-8) Subsequently, CNN models have been used for image-based diagnosis as well as image processing.9) Applications of deep learning have extended from 2-dimensional radiologic exams to 3-dimensional images such as computed tomography (CT), positron emission tomography (PET), and magnetic resonance imaging (MRI) data. The clinical purposes have also expanded to include image-based differential diagnosis, segmentation, and image enhancement.
Because the features of molecular imaging, including PET and single-photon emission computed tomography (SPECT), differ substantially from those of natural images, there have been various concerns regarding the application of deep learning. Nonetheless, various deep learning techniques have shown feasible applications that enhance molecular imaging and address problems such as image resolution and sensitivity.10) In this review, current deep learning models for nuclear medicine and molecular imaging are summarized according to their clinical purposes. Practical issues of current deep learning are also introduced, with the aim of developing robust models and guiding them toward appropriate clinical use.

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Current Deep Learning Models for Molecular Imaging

Current deep learning models for molecular imaging have focused on three types of application: image-based diagnosis, enhancement of image reconstruction and image quality, and image-based quantification (Table 1).11-44)

Intuitively, one of the most important applications of deep learning in medicine is differential diagnosis. Because deep learning models generally require a large dataset for training, several molecular imaging models have used PET or SPECT images that are routinely acquired in the clinical setting. One of the major applications has been differentiating disorders from the normal state. Recently, a few deep CNN models for differential diagnosis using fluorodeoxyglucose (FDG) PET images have been proposed. For example, a deep learning model was developed to differentiate metastatic from benign mediastinal lymph nodes in lung cancer using FDG PET images.11) With a deep CNN, the diagnostic accuracy for identifying metastatic lymph nodes was 86%, which was higher than that of conventional machine learning algorithms.11) Another CNN model, designed to classify the T-stage of lung cancer, showed comparable results in identifying the pathologic T-stage.12) The area under the receiver-operating-characteristic curve was 0.68 for differentiating advanced T-stage tumors in an independent test set. Deep CNN models have also been developed for the differential diagnosis of brain disorders using brain SPECT or PET images. Dopamine transporter imaging has been interpreted by expert reading as a binary classification problem, which made it a good candidate for deep CNN application. A 3-dimensional CNN model showed high accuracy in differentiating 123I-FP-CIT SPECT images of Parkinson's disease from those of controls.19) As accurate image-based diagnosis and prediction of future cognitive decline in patients with Alzheimer's disease (AD) and mild cognitive impairment (MCI) are clinically important issues, several deep learning models using MRI and PET have been proposed.
One of the first studies applying deep learning to medical images was representation learning on PET and MRI images for diagnosing AD.17,18) Although these pioneering studies did not use CNNs, which are regarded as the de facto standard in recent applications, their models extracted discriminative features automatically and showed higher performance in classifying brain images of AD than conventional algorithms. More recently developed models use deep CNNs to differentiate AD from controls and have shown high accuracy.13,14)

Another important application is the enhancement of image reconstruction and image quality. For example, CNN models were incorporated into an iterative reconstruction framework and showed better performance than conventional denoising algorithms.27) As a generalized approach, deep learning was used to learn the inverse mapping from sensor-encoded signals, such as those of MRI and PET, to images, which resulted in a fully automated and flexible reconstruction framework.28) Furthermore, attenuation correction, a crucial step of PET image reconstruction, has been aided by deep learning-based attenuation maps. Whereas the CT incorporated in hybrid PET/CT scanners provides attenuation information, PET/MR requires synthetic CT attenuation maps. Because the attenuation map is difficult to estimate without CT, various issues have arisen regarding PET quantification.45,46) Recently proposed deep learning-based CT image synthesis from MR or PET images is a promising solution to the quantification issues caused by attenuation correction.30-34) Additionally, deep learning has been used to enhance the quality of low-dose PET images.35-37) Combining algorithms for image reconstruction from low-dose radiotracer acquisitions with PET- or MR-based attenuation correction could dramatically reduce radiation exposure in the future. Such ultra-low-dose PET may be used for new clinical purposes, including disease screening, for which a net benefit has been difficult to achieve because of radiation hazards.

As molecular imaging provides quantitative values related to pathophysiology, studies have focused on applying deep learning to obtain accurate quantification. The most common application of deep learning to medical images is segmentation.9) Segmentation methods are usually based on anatomical images such as CT and MRI. As recent clinical molecular imaging modalities provide fusion images such as PET/CT, PET/MR, and SPECT/CT, deep learning-based segmentation can be used to calculate quantitative values such as the accumulation of radiotracer in a specific tissue delineated on anatomical imaging.39,47) Quantification can also be improved by generative models such as generative adversarial networks (GAN). For example, pseudo-MR images were generated from AV-45 PET using a GAN to quantify cortical radiotracer uptake without structural MR acquisition.43)
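
The segmentation-based quantification described above can be sketched as follows. This is a minimal illustration in plain Python, not the method of any cited study: it assumes that the PET volume and a binary segmentation mask are already co-registered and flattened to lists of equal length, and the function name `mean_uptake_in_mask` and the toy values are hypothetical.

```python
def mean_uptake_in_mask(pet_voxels, mask):
    """Mean tracer uptake over the voxels where the segmentation mask is 1.

    pet_voxels: flat list of uptake values from the PET or SPECT volume.
    mask: flat list of 0/1 labels from an anatomical segmentation,
          assumed to be co-registered with the emission volume.
    """
    if len(pet_voxels) != len(mask):
        raise ValueError("volume and mask must have the same number of voxels")
    selected = [v for v, m in zip(pet_voxels, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no voxels")
    return sum(selected) / len(selected)


# Toy example (hypothetical values): six voxels, three inside the organ.
pet = [0.8, 2.0, 2.4, 0.9, 2.2, 1.0]
organ_mask = [0, 1, 1, 0, 1, 0]
organ_uptake = mean_uptake_in_mask(pet, organ_mask)
```

In practice the mask would come from a segmentation network applied to the CT or MR component, and the averaged quantity would be a calibrated measure such as a standardized uptake value.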

Clinically Feasible Deep Learning-Based Biomarkers and Practical Issues

1. Necessity of deep learning-based biomarker

Even though various deep learning techniques have been applied to molecular imaging for differential diagnosis, image enhancement, and accurate quantification, many issues must be solved before they can be used clinically. One gap between deep learning approaches for natural image recognition and those for medical images, particularly molecular imaging, lies in the purpose of imaging. Whereas image recognition tasks have simple labels, clinicians often require various types of information from medical images, including prediction of prognosis and treatment response as well as differential diagnosis.10) In a narrow sense, differential diagnosis resembles the labeling of natural images; however, many diagnostic classifications are not simple classifications. Because many disorders form a spectrum ranging from healthy to full-blown disease, the ground-truth labels widely used in deep learning training are ambiguous in medical images. Furthermore, the gold standard of diagnostic classification varies with disease type and clinical situation.48) Thus, the eventual purpose of applying deep learning to the medical field is not simple diagnosis but a critical role in clinical decision-making.49) As molecular imaging intrinsically provides molecular and pathophysiologic properties in a noninvasive manner, deep learning algorithms should place more emphasis on acquiring objective quantitative values that can predict future outcome and treatment response. Instead of pursuing state-of-the-art classification accuracy, we should find appropriate clinical applications for the output of deep learning.
For example, a deep learning model was developed to discriminate Alzheimer's disease from normal aged subjects; however, the importance of this model lay in its transfer to MCI subjects, to identify those who would rapidly progress to full-blown dementia.13) The output of the CNN model represents the probability of Alzheimer's disease in a cohort consisting of patients with Alzheimer's disease and normal subjects. As the output of the CNN was estimated from patterns of FDG uptake and amyloid deposition in the brain, these patterns could serve as a predictive biomarker for the outcome of MCI subjects (Fig. 1).

2. Data distribution and validation

Even though many deep learning models show remarkable performance on classification problems, such as discriminating fundoscopy images or brain PET images, most models have not been validated in real-world clinical settings. The issue is how performance should be evaluated when a proposed deep learning model is to be used in the clinic. To address this validation issue, deep learning models should be tested on a test set that is independent of the training and internal validation data. The most commonly used method is application to datasets obtained from different centers.50) Even when deep learning models are validated on an external dataset and show good performance in diagnostic classification or prediction of clinical outcome, the same performance can hardly be guaranteed in the heterogeneous clinical environment. That is because the cohorts used to develop deep learning models differ from those of clinical trials, in which subjects are recruited according to specific criteria defined for a clinical setting.51) The problem lies in the fact that patients in the clinical setting are highly heterogeneous and clinical decisions must be made in various situations. For example, deep learning models have mostly been developed with a training cohort consisting of patients with a particular disorder and healthy controls. Training cohorts, and often validation cohorts as well, usually include similar numbers of patients and controls. In the clinical situation, however, differential diagnosis or a clinical decision is made on the basis of the patient's symptoms and signs rather than by simple classification. Other disorders, including several rare ones, can resemble the disease that a deep learning model targets. The ratio of diseased to healthy subjects can also differ considerably from that of the training cohort.
The problem of data distribution becomes more serious when a deep learning model is used for disease screening in the general population (Fig. 2). This is why deep learning models should be subjected to clinical trials despite their high accuracy, and why appropriate use criteria are needed so that the models are used clinically in well-defined situations.
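
The effect of the disease-to-healthy ratio can be illustrated with Bayes' rule. The sketch below assumes a hypothetical classifier whose sensitivity and specificity are both fixed at 0.90 (illustrative numbers, not results from any cited study) and compares its positive predictive value in a balanced development cohort with that in a screening population of 1% prevalence.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive output), computed with Bayes' rule."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Same hypothetical model, two different populations.
ppv_balanced = positive_predictive_value(0.90, 0.90, prevalence=0.50)
ppv_screening = positive_predictive_value(0.90, 0.90, prevalence=0.01)
```

Under these assumptions the positive predictive value falls from 0.90 in the balanced cohort to roughly 0.08 in the screening population, although the model itself is unchanged.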

3. Uncertainty and unseen data

The issues of data distribution and ‘unseen data’ in training cohorts extend to uncertainty. Under the current approach of supervised learning from big data and their labels, deep learning-based diagnosis and outcome prediction must account for the diagnostic uncertainty arising from unseen and rare cases. Furthermore, clinical decisions are often made not by selecting the most probable diagnosis but by excluding critical, life-threatening diagnoses. Lowering the uncertainty about a fatal disease is one of the most important aims of diagnostic testing and one of the most important elements of clinical decision-making to be achieved through biomarkers.52) Thus, deep learning models should report the uncertainty of their decisions so that it can be determined whether a subject needs additional diagnostic tests. Bayesian approximation in deep learning, such as the dropout-based approach, is a good example of uncertainty measurement for supervised learning models.53) Another way to bypass the issue of uncertainty and unseen data, particularly rare disorders, is to employ unsupervised learning for anomaly detection. As deep learning is a form of representation learning, the latent features of imaging data follow a distribution determined by the training dataset. Once the distribution of latent features in the training data has been defined, unseen data can be identified as outliers from that distribution in the latent space.54,55) As conditional generative models such as conditional GANs or variational autoencoders synthesize virtual data for specific conditions, they can be used to define a population distribution for those conditions. For example, by training a generative model of normal aging changes in brain metabolism, a pseudo-population distribution of brain metabolism at each age can be generated.56) This generated population distribution can then be used to find abnormal patterns in a given brain image while taking age into consideration.
This type of anomaly detection can bypass the difficulty of building deep learning models for heterogeneous disorders.
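
The dropout-based uncertainty estimate mentioned above can be sketched in a few lines. This is a toy illustration, assuming a tiny one-hidden-layer network with hypothetical fixed weights in place of a trained model: dropout is kept active at prediction time, the forward pass is repeated, and the spread of the outputs is read as the model uncertainty. All names and values are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def stochastic_forward(x, w_hidden, w_out, drop_prob, rng):
    """One forward pass with dropout left ON in the hidden layer."""
    hidden = []
    for weights in w_hidden:
        act = max(0.0, sum(w * xi for w, xi in zip(weights, x)))  # ReLU
        # Inverted dropout: drop the unit with probability drop_prob,
        # otherwise rescale so the expected activation is unchanged.
        if rng.random() < drop_prob:
            act = 0.0
        else:
            act /= (1.0 - drop_prob)
        hidden.append(act)
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


def mc_dropout_predict(x, w_hidden, w_out, drop_prob=0.5, n_samples=200, seed=0):
    """Mean and standard deviation of the output over repeated dropout passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, w_hidden, w_out, drop_prob, rng)
               for _ in range(n_samples)]
    mean = sum(samples) / n_samples
    variance = sum((s - mean) ** 2 for s in samples) / n_samples
    return mean, math.sqrt(variance)


# Hypothetical weights standing in for a trained classifier.
W_HIDDEN = [[0.5, -0.2], [0.1, 0.8], [-0.6, 0.4]]
W_OUT = [1.2, -0.7, 0.9]

prob, uncertainty = mc_dropout_predict([1.0, 2.0], W_HIDDEN, W_OUT)
```

A large standard deviation relative to a chosen threshold could be used to flag a case for additional diagnostic tests; the threshold itself would have to be set and validated clinically.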

4. Labeling of data: leveraging unlabeled data

Unsupervised learning is an important approach to solving practical issues with the labels of imaging data. Labeling image data, particularly medical images, is expensive as well as time-consuming. It requires experts to interpret the images or to determine the clinical diagnosis. To obtain a ‘gold standard’ diagnosis, many cases require clinical follow-up, which entails a complex professional review of medical records. Ethical issues regarding the acquisition of large datasets and their labels are also inevitable. The limited availability of such labeled data, and the even greater difficulty of labeling at a large scale, are major obstacles to deep learning applications. In addition, large labeled datasets are particularly difficult to obtain in nuclear medicine and molecular imaging because various imaging techniques are used according to clinical purpose.

One way to overcome this labeling issue lies in a property of medical imaging data: it is relatively easy to collect heterogeneous image data acquired in clinical routine. By applying unsupervised learning methods to these routine clinical data, representative features can be obtained. These features can be visualized with dimension reduction methods to identify patterns in large imaging datasets intuitively. Furthermore, features obtained by unsupervised learning can be transferred to relatively small datasets that contain both labels and images. This transfer learning can produce a robust deep learning model even when the well-labeled dataset is relatively small (Fig. 3).57,58) The flexible combination of unsupervised learning and transfer learning can be extended to semi-supervised learning. As mentioned above, a database of routine clinical data can be obtained relatively easily, and a small portion of the large unlabeled dataset can be labeled with clinical outcome or diagnosis. Even with few labeled samples, various deep learning approaches exploit unlabeled data to find discriminative representations.59,60) For example, one study aimed to predict FDG uptake measured by PET from gene expression data in lung cancer, although only a small number of subjects had both PET and gene expression data. By employing a larger gene expression dataset without PET data, a prediction model of FDG uptake could be developed.61) As many clinical datasets consist of ‘large unlabeled data and small labeled data’, deep learning models that improve performance through unsupervised learning and unlabeled data will be widely used in future molecular imaging and medical data research.

Another feasible way to overcome the labeling issue is to employ the multiple types of unstructured data that accompany imaging data. For example, clinical imaging data come with text reports that record human interpretation in natural language. Even though these reports are mostly unstructured, they contain a great deal of label information, including differential diagnoses, abnormal findings, and disease locations. Mining the semantic relationships between medical images and texts is a feasible approach to developing deep learning models based on real-world clinical data.62) As self-supervised learning of image representations guided by the semantic context of text is already used for natural images, medical image models could likewise be trained with representations of text reports.63) Learning representations of imaging data and then finding their clinical significance can be a data-driven approach to developing biomarkers without a priori knowledge. Self-supervised learning will be one of the future directions of this data-driven approach and can be achieved by using text reports or intrinsic information, such as age and gender, matched with the image data.

5. Data harmonization

One of the overlooked practical issues is data harmonization. Molecular imaging used routinely in the clinical setting is of various types. Numerous tracers can be used to obtain imaging data according to clinical purpose. Furthermore, image acquisition protocols vary among centers, which may reduce the accuracy of deep learning models intended for generalized application across multiple centers. Different image textures related to different detector types and image reconstruction algorithms can also affect the performance of deep learning. In addition, because the distribution of a tracer has temporal dynamics, image acquisition at different time points may influence deep learning-based biomarkers. Recently, deep learning has been used to analyze the kinetics of dynamic imaging data;64) however, most imaging data obtained routinely in the clinic are static images, which require harmonization across centers. Different tracers that target the same molecule also cause a harmonization problem. For example, several radiotracers are available for imaging brain amyloid deposits, e.g., 11C-PIB, 18F-Florbetapir, 18F-Florbetaben, and 18F-Flutemetamol. PET images acquired with these tracers show similar findings but different quantification results.65,66) Whereas the differences in classical amyloid quantification can be overcome by linear correction, building deep learning models from heterogeneous image data acquired with these different tracers remains challenging.
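
The linear correction used for classical amyloid quantification can be sketched as a two-point rescaling in the style of the Centiloid scale,66) in which the mean value of young controls maps to 0 and the mean value of typical AD patients maps to 100. The anchor values below are hypothetical placeholders, not published calibration constants, and the function name `to_centiloid` is illustrative.

```python
def to_centiloid(suvr, young_control_mean, typical_ad_mean):
    """Linearly rescale a tracer-specific SUVR to a common 0-100 scale.

    young_control_mean maps to 0 and typical_ad_mean maps to 100; both
    anchors must be measured with the same tracer and pipeline as suvr.
    """
    span = typical_ad_mean - young_control_mean
    if span <= 0:
        raise ValueError("typical_ad_mean must exceed young_control_mean")
    return 100.0 * (suvr - young_control_mean) / span


# Hypothetical anchors for two tracers with different dynamic ranges.
score_tracer_a = to_centiloid(1.50, young_control_mean=1.00, typical_ad_mean=2.00)
score_tracer_b = to_centiloid(1.35, young_control_mean=1.10, typical_ad_mean=1.60)
```

With these illustrative anchors both measurements land at about 50 on the common scale even though the raw SUVR values differ. No comparably simple transform exists for the voxel-level inputs of a deep learning model, which is the difficulty the paragraph above points to.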

Future Direction to Data-Driven Theranostics

In this review, current deep learning models developed for molecular imaging have been briefly introduced according to their purposes. As molecular imaging carries information on molecular changes related to pathophysiology, accurate and objective quantification is a critical step toward clinical use. This quantitative information is linked to clinical decisions and prediction of outcome as well as to differential diagnosis. Thus, instead of simple diagnostic classification, we should focus on discovering biomarkers by using deep learning to extract the functional information of molecular imaging. This information can contribute to theranostic approaches, which combine diagnostics and therapeutics directed at the same molecular targets. Deep learning models will summarize the status of patients as quantitative values. The models should be clinically validated in real clinical situations with unbiased data rather than with limited datasets. Clinically validated molecular imaging-based biomarkers can be used to monitor disease status in terms of functional information. By predicting outcome at the individual level from imaging data, therapeutic plans, including dose and schedule as well as treatment method, can be personalized. To facilitate clinically feasible deep learning models, leveraging unlabeled data and unsupervised learning is promising. This approach could considerably untangle the issues caused by the supervised learning approaches employed by most deep learning models for imaging data, namely heterogeneous data distribution, unseen data, and uncertainty of decisions. Furthermore, unsupervised learning followed by transfer learning allows various types of deep learning models to be developed with relatively small samples.
Because of the distinctiveness of the medical field and the various purposes of molecular imaging, deep learning models will need to be developed to meet particular clinical goals, and the result will be an objective biomarker that plays an important role in clinical decision-making.

Conflicts of Interest

The author has nothing to disclose.

Availability of Data and Materials

All relevant data are within the paper and its Supporting Information files.

Tables

Table 1. Types of current deep learning applications for nuclear medicine and molecular imaging

Types of applications | Examples | References
Image-based diagnosis | Cancer staging (T- and N-staging) | 11,12
Image-based diagnosis | Diagnosis of Alzheimer’s disease using PET and/or MRI | 13-18
Image-based diagnosis | Diagnosis of Parkinson’s disease using dopamine transporter imaging | 19-21
Image-based diagnosis | Prediction of coronary heart disease | 22-24
Enhancement of image reconstruction and image quality | Image reconstruction | 25-29
Enhancement of image reconstruction and image quality | Attenuation correction | 30-34
Enhancement of image reconstruction and image quality | Recovery of low-dose PET images | 35-37
Image-based quantification | Segmentation | 38-42
Image-based quantification | Image generation for quantification | 43,44

Figure 1.

The output of a deep learning model as a predictive biomarker. A deep convolutional neural network (CNN) model was developed to differentiate brain positron emission tomography images of patients with Alzheimer’s disease from those of healthy subjects. The model was then applied to a separate cohort of patients with mild cognitive impairment to predict future cognitive outcome. The model output represents a probability of Alzheimer’s disease, which can be used as a predictive biomarker of cognitive outcome in preclinical disorders.

Figure 2.

A gap between training and real-world data. Most deep learning models are developed using data from patients with specific disorders and from controls. The problem in applying deep learning in the clinic is the difference between real-world data and the training cohort. Real-world clinical data include heterogeneous patients who differ from those in the training cohorts. Furthermore, the proportions of diseased and normal subjects are considerably different. This data distribution issue becomes a bigger factor when deep learning targets the general population.
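
The effect of a changed disease-to-normal ratio can be illustrated with a short calculation. The following is a minimal sketch, not taken from the article: it assumes a hypothetical classifier whose sensitivity and specificity (both 0.9 here, chosen only for illustration) stay fixed across populations, and applies Bayes' rule to obtain the positive predictive value (PPV) at different prevalences. In practice, heterogeneous real-world patients may also change sensitivity and specificity, which this sketch does not model.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive model output is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point; the values are illustrative assumptions.
SENSITIVITY = 0.9
SPECIFICITY = 0.9

# Balanced training cohort, a referral clinic, and a general population (assumed).
for prevalence in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

With these assumed numbers, the PPV is 0.90 in a balanced cohort, 0.50 at 10% prevalence, and about 0.08 at 1% prevalence, even though the model itself is unchanged.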

Figure 3.

Leveraging unlabeled data from clinical routine to facilitate deep learning development. Because labeling medical data is expensive and time-consuming, it is a bottleneck for developing deep learning models. Since heterogeneous image data acquired in clinical routine are relatively easy to collect, unsupervised learning can leverage these unlabeled ‘dirty’ data. Features extracted by unsupervised learning can be transferred to relatively small cohorts that contain both labels and images, to predict clinical outcome or to make a differential diagnosis according to the clinical purpose.
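
The two-stage workflow in this caption can be sketched in a few lines. This is a toy illustration under loud assumptions, not the method of any cited study: the "images" are synthetic five-feature vectors, a first principal component found by power iteration stands in for deep unsupervised feature learning, and a class-mean threshold stands in for the transferred classifier. All function names (`make_scan`, `fit_first_component`, `run_demo`, and so on) are hypothetical.

```python
import random


def make_scan(has_disease, rng, n_features=5):
    """Synthetic stand-in for an image: disease shifts the first two features."""
    shift = 3.0 if has_disease else 0.0
    return [rng.gauss(shift if j < 2 else 0.0, 1.0) for j in range(n_features)]


def fit_first_component(unlabeled, n_iter=50):
    """Unsupervised step: mean and first principal direction via power iteration."""
    n, d = len(unlabeled), len(unlabeled[0])
    mean = [sum(x[j] for x in unlabeled) / n for j in range(d)]
    centered = [[x[j] - mean[j] for j in range(d)] for x in unlabeled]
    v = [1.0 / d ** 0.5] * d
    for _ in range(n_iter):
        w = [0.0] * d  # w = covariance matrix times v
        for x in centered:
            proj = sum(a * b for a, b in zip(x, v))
            for j in range(d):
                w[j] += proj * x[j] / n
        norm = sum(a * a for a in w) ** 0.5
        v = [a / norm for a in w]
    return mean, v


def extract_feature(x, mean, v):
    """Project one sample onto the direction learned from unlabeled data."""
    return sum((a - m) * b for a, m, b in zip(x, mean, v))


def fit_threshold(features, labels):
    """Supervised step on the small labeled cohort: midpoint of class means."""
    pos = [f for f, y in zip(features, labels) if y == 1]
    neg = [f for f, y in zip(features, labels) if y == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return (mu_pos + mu_neg) / 2.0, (1 if mu_pos > mu_neg else -1)


def predict(feature, threshold, sign):
    return 1 if sign * (feature - threshold) > 0 else 0


def run_demo(seed=0):
    rng = random.Random(seed)
    # Large unlabeled pool: labels exist in the generator but are never used.
    unlabeled = [make_scan(rng.random() < 0.5, rng) for _ in range(500)]
    mean, direction = fit_first_component(unlabeled)
    # Small labeled cohort (20 subjects) reuses the transferred feature.
    train_labels = [i % 2 for i in range(20)]
    train_feats = [extract_feature(make_scan(y == 1, rng), mean, direction)
                   for y in train_labels]
    threshold, sign = fit_threshold(train_feats, train_labels)
    # Held-out evaluation on fresh synthetic subjects.
    test_labels = [i % 2 for i in range(200)]
    correct = sum(
        predict(extract_feature(make_scan(y == 1, rng), mean, direction),
                threshold, sign) == y
        for y in test_labels
    )
    return correct / len(test_labels)


print(f"held-out accuracy: {run_demo():.2f}")
```

The point of the design is that only the final threshold depends on labels; the feature itself is learned entirely from the unlabeled pool, which is why a small labeled cohort can suffice in this toy setting.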

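Table 1. Types of current deep learning applications for nuclear medicine and molecular imaging

| Types of applications | Examples | References |
|---|---|---|
| Image-based diagnosis | Cancer staging (T- and N-staging) | 11,12 |
| Image-based diagnosis | Diagnosis of Alzheimer’s disease using PET and/or MRI | 13-18 |
| Image-based diagnosis | Diagnosis of Parkinson’s disease using dopamine transporter imaging | 19-21 |
| Image-based diagnosis | Prediction of coronary heart disease | 22-24 |
| Enhancement of image reconstruction and image quality | Image reconstruction | 25-29 |
| Enhancement of image reconstruction and image quality | Attenuation correction | 30-34 |
| Enhancement of image reconstruction and image quality | Recovery of low-dose PET images | 35-37 |
| Image-based quantification | Segmentation | 38-42 |
| Image-based quantification | Image generation for quantification | 43,44 |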

PET, positron emission tomography; MRI, magnetic resonance imaging.


References

  1. Ravi D, Wong C, Deligianni F, Berthelot M, Andreu-Perez J, and Lo B, et al. Deep learning for health informatics. IEEE J Biomed Health Inform 2017;21:4-21.
  2. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, and Ma S, et al. ImageNet large scale visual recognition challenge. Int J Comput Vis 2015;115:211-252.
  3. Krizhevsky A, Sutskever I, and Hinton GE. ImageNet classification with deep convolutional neural networks, Paper presented at: 25th International Conference on Neural Information Processing Systems, 2012 Dec 3-6, Lake Tahoe, USA. p. 1097-1105.
  4. Weber GM, Mandl KD, and Kohane IS. Finding the missing link for big biomedical data. JAMA 2014;311:2479-2480.
  5. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, and Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017;542:115-118.
  6. Rajpurkar P, Irvin J, Ball RL, Zhu K, Yang B, and Mehta H, et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med 2018;15.
  7. Dhungel N, Carneiro G, and Bradley AP. Automated mass detection in mammograms using cascaded deep learning and random forests, Paper presented at: 2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2015 Nov 23-25, Adelaide, Australia.
  8. Lakhani P, and Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 2017;284:574-582.
  9. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, and Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60-88.
  10. Choi H. Deep learning in nuclear medicine and molecular imaging: current perspectives and future directions. Nucl Med Mol Imaging 2018;52:109-118.
  11. Wang H, Zhou Z, Li Y, Chen Z, Lu P, and Wang W, et al. Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from 18F-FDG PET/CT images. EJNMMI Res 2017;7:11.
  12. Kirienko M, Sollini M, Silvestri G, Mognetti S, Voulaz E, and Antunovic L, et al. Convolutional neural networks promising in lung cancer T-parameter assessment on baseline FDG-PET/CT. Contrast Media Mol Imaging 2018;2018:1382309.
  13. Choi H, and Jin KH; Alzheimer's Disease Neuroimaging Initiative. Predicting cognitive decline with deep learning of brain metabolism and amyloid imaging. Behav Brain Res 2018;344:103-109.
  14. Ding Y, Sohn JH, Kawczynski MG, Trivedi H, Harnish R, and Jenkins NW, et al. A deep learning model to predict a diagnosis of Alzheimer disease by using 18F-FDG PET of the brain. Radiology 2019;290:456-464.
  15. Liu M, Cheng D, and Yan W, Alzheimer's Disease Neuroimaging Initiative. Classification of Alzheimer's disease by combination of convolutional and recurrent neural networks using FDG-PET images. Front Neuroinform 2018;12:35.
  16. Liu S, Liu S, Cai W, Che H, Pujol S, and Kikinis R, et al, ADNI. Multimodal neuroimaging feature learning for multiclass diagnosis of Alzheimer's disease. IEEE Trans Biomed Eng 2015;62:1132-1140.
  17. Suk HI, Lee SW, and Shen D, Alzheimer's Disease Neuroimaging Initiative. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. Neuroimage 2014;101:569-582.
  18. Li F, Tran L, Thung KH, Ji S, Shen D, and Li J. A robust deep model for improved classification of AD/MCI patients. IEEE J Biomed Health Inform 2015;19:1610-1616.
  19. Choi H, Ha S, Im HJ, Paek SH, and Lee DS. Refining diagnosis of Parkinson's disease with deep learning-based interpretation of dopamine transporter imaging. Neuroimage Clin 2017;16:586-594.
  20. Martinez-Murcia FJ, Górriz JM, Ramírez J, and Ortiz A. Convolutional neural networks for neuroimaging in Parkinson's disease: is preprocessing needed? Int J Neural Syst 2018;28:1850035.
  21. Kim DH, Wit H, and Thurston M. Artificial intelligence in the diagnosis of Parkinson's disease from ioflupane-123 single-photon emission computed tomography dopamine transporter scans using transfer learning. Nucl Med Commun 2018;39:887-893.
  22. Betancur J, Hu LH, Commandeur F, Sharir T, Einstein AJ, and Fish MB, et al. Deep learning analysis of upright-supine high-efficiency SPECT myocardial perfusion imaging for prediction of obstructive coronary artery disease: a multicenter study. J Nucl Med 2019;60:664-670.
  23. Xu C, Xu L, Gao Z, Zhao S, Zhang H, and Zhang Y, et al. Direct detection of pixel-level myocardial infarction areas via a deep-learning algorithm, Paper presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention 2017, 2017 Sep 11-13, Quebec, Canada. p. 240-249.
  24. Betancur J, Commandeur F, Motlagh M, Sharir T, Einstein AJ, and Bokhari S, et al. Deep learning for prediction of obstructive disease from fast myocardial perfusion SPECT: a multicenter study. JACC Cardiovasc Imaging 2018;11:1654-1663.
  25. Kim K, Wu D, Gong K, Dutta J, Kim JH, and Son YD, et al. Penalized PET reconstruction using deep learning prior and local linear fitting. IEEE Trans Med Imaging 2018;37:1478-1487.
  26. Gong K, Catana C, Qi J, and Li Q. PET image reconstruction using deep image prior. IEEE Trans Med Imaging 2019;38:1655-1665.
  27. Gong K, Guan J, Kim K, Zhang X, Yang J, and Seo Y, et al. Iterative PET image reconstruction using convolutional neural network representation. IEEE Trans Med Imaging 2019;38:675-685.
  28. Zhu B, Liu JZ, Cauley SF, Rosen BR, and Rosen MS. Image reconstruction by domain-transform manifold learning. Nature 2018;555:487-492.
  29. Pfaehler E, De Jong JR, Dierckx RAJO, van Velden FHP, and Boellaard R. SMART (SiMulAtion and ReconsTruction) PET: an efficient PET simulation-reconstruction tool. EJNMMI Phys 2018;5:16.
  30. Hwang D, Kang SK, Kim KY, Seo S, Paeng JC, and Lee DS, et al. Generation of PET attenuation map for whole-body time-of-flight 18F-FDG PET/MRI using a deep neural network trained with simultaneously reconstructed activity and attenuation maps. J Nucl Med 2019;60:1183-1189.
  31. Han X. MR-based synthetic CT generation using a deep convolutional neural network method. Med Phys 2017;44:1408-1419.
  32. Liu F, Jang H, Kijowski R, Bradshaw T, and McMillan AB. Deep learning MR imaging-based attenuation correction for PET/MR imaging. Radiology 2018;286:676-684.
  33. Leynes AP, Yang J, Wiesinger F, Kaushik SS, Shanbhag DD, and Seo Y, et al. Zero-echo-time and Dixon deep pseudo-CT (ZeDD CT): direct generation of pseudo-CT images for pelvic PET/MRI attenuation correction using deep convolutional neural networks with multiparametric MRI. J Nucl Med 2018;59:852-858.
  34. Hwang D, Kim KY, Kang SK, Seo S, Paeng JC, and Lee DS, et al. Improving the accuracy of simultaneously reconstructed activity and attenuation maps using deep learning. J Nucl Med 2018;59:1624-1629.
  35. Xiang L, Qiao Y, Nie D, An L, Wang Q, and Shen D. Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI. Neurocomputing 2017;267:406-416.
  36. Chen KT, Gong E, de Carvalho Macruz FB, Xu J, Boumis A, and Khalighi M, et al. Ultra-low-dose 18F-Florbetaben amyloid PET imaging using deep learning with multi-contrast MRI inputs. Radiology 2019;290:649-656.
  37. Wang Y, Yu B, Wang L, Zu C, Lalush DS, and Lin W, et al. 3D conditional generative adversarial networks for high-quality PET image estimation at low dose. Neuroimage 2018;174:550-562.
  38. Wang T, Lei Y, Tang H, He Z, Castillo R, and Wang C, et al. A learning-based automatic segmentation and quantification method on left ventricle in gated myocardial perfusion SPECT imaging: a feasibility study. J Nucl Cardiol 2019 doi: 10.1007/s12350-019-01594-2.
  39. Lindgren Belal S, Sadik M, Kaboteh R, Enqvist O, Ulén J, and Poulsen MH, et al. Deep learning for segmentation of 49 selected bones in CT scans: first step in automated PET/CT-based 3D quantification of skeletal metastases. Eur J Radiol 2019;113:89-95.
  40. Chen L, Shen C, Zhou Z, Maquilan G, Albuquerque K, and Folkert MR, et al. Automatic PET cervical tumor segmentation by combining deep learning and anatomic prior. Phys Med Biol 2019;64:085019.
  41. Zhong Z, Kim Y, Plichta K, Allen BG, Zhou L, and Buatti J, et al. Simultaneous cosegmentation of tumors in PET-CT images using deep fully convolutional networks. Med Phys 2019;46:619-633.
  42. Huang B, Chen Z, Wu PM, Ye Y, Feng ST, and Wong CO, et al. Fully automated delineation of gross tumor volume for head and neck cancer on PET-CT using deep learning: a dual-center study. Contrast Media Mol Imaging 2018;2018:8923028.
  43. Choi H, and Lee DS, Alzheimer's Disease Neuroimaging Initiative. Generation of structural MR images from Amyloid PET: application to MR-less quantification. J Nucl Med 2018;59:1111-1117.
  44. Kang SK, Seo S, Shin SA, Byun MS, Lee DY, and Kim YK, et al. Adaptive template generation for amyloid PET using a deep learning approach. Hum Brain Mapp 2018;39:3769-3778.
  45. Samarin A, Burger C, Wollenweber SD, Crook DW, Burger IA, and Schmid DT, et al. PET/MR imaging of bone lesions--implications for PET quantification from imperfect attenuation correction. Eur J Nucl Med Mol Imaging 2012;39:1154-1160.
  46. Choi H, Cheon GJ, Kim HJ, Choi SH, Lee JS, and Kim YI, et al. Segmentation-based MR attenuation correction including bones also affects quantitation in brain studies: an initial result of 18F-FP-CIT PET/MR for patients with parkinsonism. J Nucl Med 2014;55:1617-1622.
  47. Park J, Bae S, Seo S, Park S, Bang JI, and Han JH, et al. Measurement of glomerular filtration rate using quantitative SPECT/CT and deep-learning-based kidney segmentation. Sci Rep 2019;9:4223.
  48. Beam AL, and Kohane IS. Translating artificial intelligence into clinical care. JAMA 2016;316:2368-2369.
  49. He J, Baxter SL, Xu J, Xu J, Zhou X, and Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med 2019;25:30-36.
  50. Saria S, Butte A, and Sheikh A. Better medicine through machine learning: what's real, and what's artificial? PLoS Med 2018;15.
  51. Park SH, and Han K. Methodologic guide for evaluating clinical performance and effect of artificial intelligence technology for medical diagnosis and prediction. Radiology 2018;286:800-809.
  52. Redelmeier DA, and Shafir E. Medical decision making in situations that offer multiple alternatives. JAMA 1995;273:302-305.
  53. Gal Y, and Ghahramani Z. Dropout as a Bayesian approximation: representing model uncertainty in deep learning, Paper presented at: 33rd International Conference on Machine Learning, 2016 Jun 19-24, New York, USA.
  54. Wei Q, Ren Y, Hou R, Shi B, Lo JY, and Carin L. Anomaly detection for medical images based on a one-class classification, Paper presented at: SPIE Medical Imaging 2018: Computer-Aided Diagnosis, 2018 Feb 10-15, Houston, USA.
  55. Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, and Langs G. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery.
  56. Choi H, Kang H, and Lee DS, Alzheimer's Disease Neuroimaging Initiative. Predicting aging of brain metabolic topography using variational autoencoder. Front Aging Neurosci 2018;10:212.
  57. Le QV, Ranzato MA, Monga R, Devin M, Chen K, and Corrado GS, et al. Building high-level features using large scale unsupervised learning. arXiv 2011 1112.6209 [Preprint] [cited 2019 Mar 1].
  58. Bengio Y. Deep learning of representations for unsupervised and transfer learning, Paper presented at: ICML Workshop on Unsupervised and Transfer Learning 2012, 2012 Jun 26-Jul 1, Edinburgh, UK. p. 17-37.
  59. Rasmus A, Valpola H, Honkala M, Berglund M, and Raiko T. Semi-supervised learning with Ladder networks, Paper presented at: 28th International Conference on Neural Information Processing Systems, 2015 Dec 7-12, Montréal, Canada.
  60. Odena A. Semi-supervised learning with generative adversarial networks. arXiv 2016 1606.01583 [Preprint] [cited 2019 Mar 1].
  61. Choi H, and Na KJ. Integrative analysis of imaging and transcriptomic data of the immune landscape associated with tumor metabolism in lung adenocarcinoma: clinical and prognostic implications. Theranostics 2018;8:1956-1965.
  62. Shin HC, Lu L, Kim L, Seff A, Yao J, and Summers RM. Interleaved text/image deep mining on a large-scale radiology database for automated image interpretation, Paper presented at: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015 Jun 7-12, Boston, USA.
  63. Gomez L, Patel Y, Rusiñol M, Karatzas D, and Jawahar CV. Self-supervised learning of visual features through embedding images into text topic spaces, Paper presented at: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 Jul 21-26, Honolulu, USA.
  64. Pan L, Cheng C, Haberkorn U, and Dimitrakopoulou-Strauss A. Machine learning-based kinetic modeling: a robust and reproducible solution for quantitative analysis of dynamic PET data. Phys Med Biol 2017;62:3566-3581.
  65. Landau SM, Breault C, Joshi AD, Pontecorvo M, Mathis CA, and Jagust WJ, et al, Alzheimer's Disease Neuroimaging Initiative. Amyloid-β imaging with Pittsburgh compound B and florbetapir: comparing radiotracers and quantification methods. J Nucl Med 2013;54:70-77.
  66. Klunk WE, Koeppe RA, Price JC, Benzinger TL, Devous MD Sr, and Jagust WJ, et al. The Centiloid project: standardizing quantitative amyloid plaque estimation by PET. Alzheimers Dement 2015;11:1-15.