Progress in Medical Physics 2023; 34(3): 23-32
Published online September 30, 2023
https://doi.org/10.14316/pmp.2023.34.3.23
Copyright © Korean Society of Medical Physics.
Ryohei Fukui, Ryutarou Matsuura, Katsuhiro Kida, Sachiko Goto
Correspondence to: Ryohei Fukui
(rfukui@okayama-u.ac.jp)
Tel: 81-86-235-6907
Fax: 81-86-222-3717
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Purpose: Radiomics analysis uses the pixel values of lesions depicted in computed tomography (CT) and magnetic resonance imaging (MRI) images to evaluate features and to predict genetic characteristics and survival time. CT and MRI offer three-dimensional images, thus producing three-dimensional features (Features_3d) as output. However, reports have not established whether Features_3d or two-dimensional features (Features_2d) are superior. In this study, we aimed to investigate whether a difference exists in the prediction accuracy of radiomics analysis of lung cancer using Features_2d and Features_3d.
Methods: A total of 38 cases of large cell carcinoma (LCC) and 40 cases of squamous cell carcinoma (SCC) were selected for this study. Two- and three-dimensional lesion segmentations were performed. A total of 774 features were obtained. Using least absolute shrinkage and selection operator regression, seven Features_2d and six Features_3d were obtained.
Results: Linear discriminant analysis revealed that the sensitivities of Features_2d and Features_3d to LCC were 86.8% and 89.5%, respectively. For Features_2d and Features_3d, respectively, the coefficients of determination from multiple regression analysis were 0.68 and 0.70, and the areas under the receiver operating characteristic curve (AUC) were 0.93 and 0.94. The P-value for the comparison of the estimated AUCs was 0.87.
Conclusions: No difference was found in the prediction accuracy for LCC and SCC between Features_2d and Features_3d.
Keywords: Radiomics, Computed tomography, Lung cancer, Least absolute shrinkage and selection operator, Linear discriminant analysis
Evidence-based cancer treatment plans are generally determined and recommended by relevant academic societies and other scientific organizations. In recent years, attempts have been made to personalize medicine by predicting the prognosis of each patient according to blood, tissue, imaging, and genomic information, an approach referred to as precision medicine [1,2]. In precision medicine, biopsy of cancerous tissue is essential; however, it is highly invasive, and mutations with good and poor prognoses are thought to coexist in the same cancer tissue [3]. Recently, radiomics, an image analysis technique, has attracted the attention of researchers [4]. The term “radiomics” was coined from “radiology” and “-omics,” the latter referring to a science that deals with a large amount of information in a systematic manner. Radiomics analysis uses the pixel values of tissues and lesions in diagnostic radiological images, such as computed tomography (CT) and magnetic resonance imaging (MRI) images. Features such as lesion shape, histogram, and texture are obtained from these pixel values. A prediction model is then constructed using features that are highly correlated with the cancer type and genomic information. Numerous studies have reported on the utility of radiomics in lung cancer, brain tumor, and breast cancer [5-7]. Construction of a survival prediction model by analyzing the relationship between features and survival time has also been reported [8,9]. A main challenge in constructing a prediction model in radiomics is the segmentation of the radiological image to delineate a given lesion. To simplify this task, 3D Slicer, a three-dimensional (3D) image analysis software package, is often used as a segmentation tool [10]. The Cancer Imaging Archive (TCIA; Frederick National Laboratory for Cancer Research, Frederick, MD, USA), a public database of CT images, provides presegmented samples that can be used to initiate a study easily [11].
However, in practice, to improve study originality, a study is often initiated using radiological images from the researchers' own institution, and lesion segmentation is then an unavoidable task. For CT and MRI images, segmentation on only one slice of the lesion yields two-dimensional features (Features_2d). Because these modalities provide 3D images, three-dimensional features (Features_3d) can also be extracted by segmenting the lesion in three dimensions. However, 3D segmentation requires significantly more work than two-dimensional (2D) segmentation. If no difference exists in prediction model accuracy between Features_3d and Features_2d, the workload involved in starting a radiomics study may be reduced. Therefore, in this study, we aimed to investigate whether a difference exists in the accuracy of radiomics analysis of lung cancer CT using Features_2d and Features_3d.
CT images of lung cancer cases from NSCLC-Radiomics, a public non-small cell lung cancer (NSCLC) dataset of TCIA, were used. The database contains 422 patients with NSCLC. Among them, 114 were diagnosed with large cell carcinoma (LCC) and 152 with squamous cell carcinoma (SCC). Other cases, such as adenocarcinoma, were also registered; however, only LCC, as an example of non-SCC, and SCC were included in this study. By constructing prediction models for LCC and SCC, the accuracies of Features_2d and Features_3d were verified. Some cases from the LCC and SCC datasets were excluded. The exclusion criteria were as follows: patients who underwent contrast-enhanced imaging, patients with poor breath-holding, and patients with lesions surrounded by pleural effusion or pneumonia. After exclusion, 38 and 40 patients were selected for LCC and SCC, respectively. Details of the selected patients are shown in Table 1, in which the American Joint Committee on Cancer staging criteria were used [12]. These patients were imaged using CT equipment provided by Siemens Healthineers (Erlangen, Germany) or CMS Imaging (North Charleston, SC, USA). The tube voltage was 120 kV, and auto exposure control was used for all cases. The matrix size was 512×512 pixels, while the field of view (FOV), slice thickness, and slice spacing varied depending on the facility where the images were taken. Because the sampling intervals in the x, y, and z directions therefore differed among cases, isotropic voxelization was performed. Isotropic voxelization can be performed with nearest neighbor, linear, or cubic interpolation. Among these methods, cubic interpolation was employed because it is reported to allow features to be calculated while maintaining texture [13,14]. In this step, the voxel size was set to 2×2×2 mm in all cases.
Images were reconstructed using the filtered back projection method with a kernel for observing the lung fields. The pixel values of CT images are CT values, which are comparable even when the equipment and imaging conditions differ; therefore, the pixel values were used without any correction. Ethical review was not sought for this study because the image data used are publicly available.
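As a rough illustration of the isotropic voxelization step, the sketch below resamples a single intensity profile to a 2 mm interval with a cubic kernel. It is not the pipeline used in this study: the Catmull-Rom kernel, the edge replication at the borders, and the function names `catmull_rom` and `resample_1d` are assumptions chosen for brevity, and a real CT volume would be resampled along all three axes, typically by an imaging library.

```python
import math


def catmull_rom(p0, p1, p2, p3, t):
    """Cubic (Catmull-Rom) interpolation between p1 and p2 at fraction t in [0, 1]."""
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t * t * t
    )


def resample_1d(samples, spacing_mm, new_spacing_mm=2.0):
    """Resample a 1D profile from spacing_mm to new_spacing_mm (hypothetical helper)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    length_mm = (n - 1) * spacing_mm
    n_new = int(math.floor(length_mm / new_spacing_mm)) + 1
    out = []
    for k in range(n_new):
        pos = k * new_spacing_mm / spacing_mm  # position in units of the old index
        i = min(int(math.floor(pos)), n - 2)   # left neighbour of the interval
        t = pos - i                            # fractional offset inside the interval
        # Replicate the edge samples where a neighbour would fall outside the profile.
        p0 = samples[max(i - 1, 0)]
        p1 = samples[i]
        p2 = samples[i + 1]
        p3 = samples[min(i + 2, n - 1)]
        out.append(catmull_rom(p0, p1, p2, p3, t))
    return out


if __name__ == "__main__":
    # A profile sampled every 3 mm, resampled to the 2 mm interval used in the text.
    print(resample_1d([0.0, 3.0, 6.0, 9.0, 12.0], spacing_mm=3.0))
```

In this toy case a profile sampled every 3 mm over 12 mm is mapped onto seven points spaced 2 mm apart, and new points that coincide with an original sample position keep the original value.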
Table 1 Details of the selected cases
Type | Age (y) | Male (cases) | Female (cases) | Stage I (cases) | Stage II (cases) | Stage IIIa (cases) | Stage IIIb (cases) |
---|---|---|---|---|---|---|---|
LCC | 65.6±10.6 | 22 | 16 | 11 | 3 | 12 | 12 |
SCC | 70.0±10.3 | 24 | 16 | 11 | 8 | 10 | 11 |
3D Slicer (v5.2.1; Brigham and Women’s Hospital, Boston, MA, USA) was used for lesion segmentation. An example of lesion segmentation using 3D Slicer is shown in Fig. 1. One of the authors (a radiological technologist with 14 years of experience) performed the lesion segmentation, which was confirmed by another author. The 2D segmentation was performed on the slice with the largest tumor diameter, which was selected and agreed upon by all the authors. For 2D segmentation, we used the level tracing function of 3D Slicer, which automatically selects only pixels with values close to those of the lesion. The region where the level-tracing boundary best matched the tumor boundary was adopted as the segmentation (Fig. 1a). For semiautomatic 3D segmentation, the GrowCut method was used [15]. GrowCut automatically separates the tumor from its surroundings after points in and around the tumor are selected manually. If the entire tumor is not well selected, additional points can be selected to achieve whole-tumor segmentation. The accuracy of the segmentation was confirmed through a review from three directions with the co-author (Fig. 1b). These techniques provided images of lesions segmented in two and three dimensions.
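The choice of the slice with the largest tumor diameter was made by the authors by inspection. Purely as an illustration of that criterion, the sketch below picks the slice of a binary 3D mask whose labelled pixels span the greatest in-plane distance. The mask layout (a list of slices, each a list of rows of 0/1), the 2 mm pixel size, and the names `max_diameter_mm` and `widest_slice` are assumptions, not part of the study's workflow.

```python
import math


def max_diameter_mm(mask_slice, pixel_mm=2.0):
    """Largest distance between the centres of any two labelled pixels in one slice."""
    points = [
        (r, c)
        for r, row in enumerate(mask_slice)
        for c, value in enumerate(row)
        if value
    ]
    best = 0.0
    for i, (r1, c1) in enumerate(points):
        for r2, c2 in points[i + 1:]:
            best = max(best, math.hypot(r1 - r2, c1 - c2))
    return best * pixel_mm


def widest_slice(mask_volume, pixel_mm=2.0):
    """Return (slice index, diameter in mm) of the slice with the largest diameter."""
    diameters = [max_diameter_mm(s, pixel_mm) for s in mask_volume]
    index = max(range(len(diameters)), key=diameters.__getitem__)
    return index, diameters[index]


if __name__ == "__main__":
    volume = [
        [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
        [[0, 0, 0, 0], [1, 1, 1, 0], [0, 0, 0, 0]],
        [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    ]
    print(widest_slice(volume))
```

The pairwise search is quadratic in the number of labelled pixels, which is acceptable for a single lesion on a 2 mm grid but would need a convex-hull step for large masks.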
To calculate the radiomic features, PyRadiomics (Harvard Medical School, Boston, MA, USA), a Python package, was used [16]. In this study, 19 first-order, 16 shape, and 75 texture features were calculated. All calculated radiomic features are shown in Table 2. The same features were obtained for both Features_2d and Features_3d. Wavelet-transformed features were obtained by applying wavelet image processing before calculating these features. Consequently, 774 features were obtained. They were standardized using the following formula:
Table 2 All radiomic features used in this study
Category | Feature names |
---|---|
Shape | Elongation, Flatness, LeastAxisLength, MajorAxisLength, Maximum2DDiameterColumn, Maximum2DDiameterRow, Maximum2DDiameterSlice, Maximum3DDiameter, MeshVolume, MinorAxisLength, Sphericity, SurfaceArea, SurfaceVolumeRatio, VoxelVolume |
Firstorder | 10Percentile, 90Percentile, Energy, Entropy, InterquartileRange, Kurtosis, Maximum, MeanAbsoluteDeviation, Mean, Median, Minimum, Range, RobustMeanAbsoluteDeviation, RootMeanSquared, Skewness, TotalEnergy, Uniformity, Variance |
GLCM | Autocorrelation, JointAverage, ClusterProminence, ClusterShade, ClusterTendency, Contrast, Correlation, DifferenceAverage, DifferenceEntropy, DifferenceVariance, JointEnergy, JointEntropy, Imc1, Imc2, Idm, Idmn, Id, Idn, InverseVariance, MaximumProbability, SumEntropy, SumSquares |
GLSZM | GrayLevelNonUniformity, GrayLevelNonUniformityNormalized, GrayLevelVariance, HighGrayLevelZoneEmphasis, LargeAreaEmphasis, LargeAreaHighGrayLevelEmphasis, LargeAreaLowGrayLevelEmphasis, LowGrayLevelZoneEmphasis, SizeZoneNonUniformity, SizeZoneNonUniformityNormalized, SmallAreaEmphasis, SmallAreaHighGrayLevelEmphasis, SmallAreaLowGrayLevelEmphasis, ZoneEntropy, ZonePercentage, ZoneVariance |
GLRLM | GrayLevelNonUniformity, GrayLevelNonUniformityNormalized, GrayLevelVariance, HighGrayLevelRunEmphasis, LongRunEmphasis, LongRunHighGrayLevelEmphasis, LongRunLowGrayLevelEmphasis, LowGrayLevelRunEmphasis, RunEntropy, RunLengthNonUniformity, RunLengthNonUniformityNormalized, RunPercentage, RunVariance, ShortRunEmphasis, ShortRunHighGrayLevelEmphasis, ShortRunLowGrayLevelEmphasis |
GLDM | DependenceEntropy, DependenceNonUniformity, DependenceNonUniformityNormalized, DependenceVariance, GrayLevelNonUniformity, GrayLevelVariance, HighGrayLevelEmphasis, LargeDependenceEmphasis, LargeDependenceHighGrayLevelEmphasis, LargeDependenceLowGrayLevelEmphasis, LowGrayLevelEmphasis, SmallDependenceEmphasis, SmallDependenceHighGrayLevelEmphasis, SmallDependenceLowGrayLevelEmphasis |
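$$z = \frac{x - \mu}{\sigma}$$
where $x$ is the value of a feature, and $\mu$ and $\sigma$ are the mean and standard deviation of that feature, respectively (the usual z-score form of standardization).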
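Standardization of one feature across cases can be sketched as follows, assuming the usual z-score (subtract the mean, divide by the standard deviation). Whether the population or the sample standard deviation was used is not stated; the population form and the function name `standardize` are assumptions made here.

```python
import statistics


def standardize(values):
    """Return z-scores of one feature across all cases (population SD assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


if __name__ == "__main__":
    # Toy feature values for eight hypothetical cases (mean 5, SD 2).
    print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
```

After this step every feature has mean 0 and unit variance, so the L1 penalty in the subsequent LASSO regression treats features measured on very different scales evenly.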
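Least absolute shrinkage and selection operator (LASSO) regression was used to extract, from the output features, those useful for discriminating between LCC and SCC [17]. LASSO regression is a regularized linear regression method that adds the sum of the absolute values of the weights (L1 regularization term) to the least-squares cost function. In its standard form, the cost function is
$$E(\mathbf{w}) = \sum_{i=1}^{n} \left( y_i - \mathbf{w}^{\top}\mathbf{x}_i \right)^2 + \lambda \sum_{j=1}^{p} \lvert w_j \rvert$$
where $y_i$ is the target value (here, the class) of case $i$, $\mathbf{x}_i$ is its vector of standardized features, $w_j$ are the weights, and $\lambda$ is the regularization parameter (the lambda value in Table 3).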
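To show how the L1 term drives some weights to exactly zero, and hence selects features, the following is a minimal coordinate-descent sketch. It is not the solver used in this study: it minimizes the scaled objective $(1/2n)\sum_i (y_i - \mathbf{w}^{\top}\mathbf{x}_i)^2 + \lambda \sum_j \lvert w_j \rvert$ (the scaling convention used by scikit-learn), assumes centred targets with no intercept, and the function names are hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the residual that ignores feature j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n)
    return w


if __name__ == "__main__":
    # Two standardized toy features; only the first is strongly related to y.
    X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
    y = [2.05, -1.95, 1.95, -2.05]
    print(lasso_coordinate_descent(X, y, lam=0.08))
```

In this toy problem the weakly related second feature receives a weight of exactly zero, while the first is kept with a slightly shrunken weight; features with non-zero weights are the ones that survive selection.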
Table 3 Selected features using least absolute shrinkage and selection operator regression
Features_2d (lambda value=0.08) | Features_3d (lambda value=0.09) |
---|---|
wavelet-LHH_glrlm_RunVariance | original_glrlm_ShortRunHighGrayLevelEmphasis |
wavelet-LHH_gldm_LargeDependenceLowGrayLevelEmphasis | original_glszm_HighGrayLevelZoneEmphasis |
wavelet-HLL_glrlm_ShortRunEmphasis | wavelet-HLH_firstorder_Kurtosis |
wavelet-HLH_glcm_Imc2 | wavelet-HHH_glcm_Imc2 |
wavelet-HHL_firstorder_90Percentile | wavelet-HHH_gldm_DependenceVariance |
wavelet-HHL_glrlm_LongRunEmphasis | wavelet-LLL_gldm_LargeDependenceHighGrayLevelEmphasis |
wavelet-HHL_glrlm_RunVariance |
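The names in Table 3 follow the PyRadiomics pattern of image type, feature class, and feature name joined by underscores, where a wavelet image type carries a three-letter sub-band code of low-pass (L) and high-pass (H) filters. The sketch below, whose helper name `parse_feature_name` is hypothetical, splits the selected names and tallies them by class.

```python
from collections import Counter
from itertools import product

FEATURES_2D = [
    "wavelet-LHH_glrlm_RunVariance",
    "wavelet-LHH_gldm_LargeDependenceLowGrayLevelEmphasis",
    "wavelet-HLL_glrlm_ShortRunEmphasis",
    "wavelet-HLH_glcm_Imc2",
    "wavelet-HHL_firstorder_90Percentile",
    "wavelet-HHL_glrlm_LongRunEmphasis",
    "wavelet-HHL_glrlm_RunVariance",
]
FEATURES_3D = [
    "original_glrlm_ShortRunHighGrayLevelEmphasis",
    "original_glszm_HighGrayLevelZoneEmphasis",
    "wavelet-HLH_firstorder_Kurtosis",
    "wavelet-HHH_glcm_Imc2",
    "wavelet-HHH_gldm_DependenceVariance",
    "wavelet-LLL_gldm_LargeDependenceHighGrayLevelEmphasis",
]

# The eight sub-bands of a one-level 3D wavelet decomposition: LLL, LLH, ..., HHH.
WAVELET_SUB_BANDS = ["".join(letters) for letters in product("LH", repeat=3)]


def parse_feature_name(name):
    """Split '<image type>_<feature class>_<feature name>' into its parts."""
    image_type, feature_class, feature = name.split("_", 2)
    sub_band = None
    if image_type.startswith("wavelet-"):
        sub_band = image_type.split("-", 1)[1]
    return {
        "image_type": image_type,
        "sub_band": sub_band,
        "feature_class": feature_class,
        "feature": feature,
    }


def class_counts(names):
    """Count selected features per feature class."""
    return Counter(parse_feature_name(n)["feature_class"] for n in names)


if __name__ == "__main__":
    print(class_counts(FEATURES_2D))
    print(class_counts(FEATURES_3D))
```

Tallying this way makes the pattern noted in the Discussion easy to check: all seven Features_2d and four of the six Features_3d are wavelet features, and texture classes outnumber first-order features in both sets.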
Using Fisher’s linear discriminant analysis (LDA), cases were classified as LCC or SCC, with the Features_2d and Features_3d selected by LASSO regression as input [23]. Because LDA generates linear decision boundaries, its output represents the degree to which the features separate the two groups. Multiple regression analysis was performed to calculate the root mean square error (RMSE) and the coefficient of determination (R2). In this regression, 70% of the cases were used as training data (27 LCC and 28 SCC cases) and 30% as validation data (11 LCC and 12 SCC cases). The receiver operating characteristic (ROC) curve was calculated using the values of the feature with the highest regression coefficient.
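The two regression metrics named above can be computed as in the following sketch. The coding of the two classes as 0 and 1 and the function names `rmse` and `r_squared` are assumptions for illustration; the definitions themselves are the standard ones.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy example: classes coded 0 and 1, with regression outputs near each class.
    observed = [0, 0, 1, 1]
    predicted = [0.1, 0.3, 0.7, 0.9]
    print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R2 closer to 1 both indicate that the regression output lies nearer to the true class coding, which is how the training and test rows of Table 5 can be read.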
The two features with the highest regression coefficients in each of Features_2d and Features_3d are shown in scatterplots, together with the decision boundaries drawn by LDA, to show that the selected features can classify LCC and SCC (Fig. 2). In Features_2d, the features with the highest regression coefficients were “wavelet-HHL_firstorder_90Percentile” (wHHL_f_90P) and “wavelet-HHL_glrlm_RunVariance” (wHHL_RV). In Features_3d, they were “original_glszm_HighGrayLevelZoneEmphasis” (o_HGLZE) and “original_glrlm_ShortRunHighGrayLevelEmphasis” (o_SRHGLE). The classification of LCC and SCC using LDA is presented in Table 4. The sensitivity and specificity (sensitivity to SCC) of the LCC classification were 86.8% and 77.5%, respectively, for Features_2d and 89.5% and 75.0%, respectively, for Features_3d. Additionally, two images of each case classified by the prediction model are shown in Fig. 3. In Features_2d, cases with high wHHL_RV and low wHHL_f_90P were expected to be SCC. In Features_3d, cases with high o_HGLZE and o_SRHGLE were expected to be SCC. The RMSE and R2 results from the multiple regression analysis are shown in Table 5; the values were similar for Features_2d and Features_3d. Furthermore, ROC curves were created using wHHL_RV and o_SRHGLE, which had the highest regression coefficients in Features_2d and Features_3d, respectively (Fig. 4). The AUCs were 0.93 and 0.94, respectively. The AUCs were compared using the DeLong method, with a P-value of 0.87.
Table 4 Estimation of sensitivity and specificity using linear discriminant analysis
Truth | Features_2d output: LCC | Features_2d output: SCC | Features_3d output: LCC | Features_3d output: SCC |
---|---|---|---|---|
LCC | 86.8% (33/38) | 13.2% (5/38) | 89.5% (34/38) | 10.5% (4/38) |
SCC | 22.5% (9/40) | 77.5% (31/40) | 25.0% (10/40) | 75.0% (30/40) |
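How a two-class Fisher LDA turns two features into a linear decision boundary, and how the percentages in Table 4 follow from case counts, can be sketched as below. This is a minimal two-feature version with the threshold placed midway between the projected class means (equal priors assumed); the toy points and all function names are hypothetical, not the study's data or code.

```python
def mean_vector(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter_matrix(points, mean):
    """Within-class scatter of 2D points as (sxx, sxy, syy)."""
    sxx = sum((p[0] - mean[0]) ** 2 for p in points)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in points)
    syy = sum((p[1] - mean[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_fisher_lda(class_a, class_b):
    """Return (w, threshold); a point x is assigned to class A when w.x > threshold."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    axx, axy, ayy = scatter_matrix(class_a, mean_a)
    bxx, bxy, byy = scatter_matrix(class_b, mean_b)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    dx, dy = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    # w = Sw^-1 (mean_a - mean_b), using the closed-form 2x2 inverse.
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    mid = ((mean_a[0] + mean_b[0]) / 2, (mean_a[1] + mean_b[1]) / 2)
    return w, w[0] * mid[0] + w[1] * mid[1]


def predict_is_a(w, threshold, point):
    return w[0] * point[0] + w[1] * point[1] > threshold


def rate_percent(correct, total):
    """Sensitivity or specificity expressed as a percentage of cases."""
    return 100.0 * correct / total


if __name__ == "__main__":
    group_a = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
    group_b = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0), (5.0, 5.0)]
    w, threshold = fit_fisher_lda(group_a, group_b)
    print(w, threshold)
    # Reproducing the Features_2d row of Table 4 from its counts.
    print(round(rate_percent(33, 38), 1), round(rate_percent(31, 40), 1))
```

The last line recovers the 86.8% sensitivity and 77.5% specificity reported for Features_2d from the counts 33/38 and 31/40.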
Table 5 Results of the multiple regression analysis
Data | Metric | Features_2d | Features_3d |
---|---|---|---|
Train | RMSE | 0.34 | 0.30 |
Train | R2 | 0.72 | 0.75 |
Test | RMSE | 0.41 | 0.34 |
Test | R2 | 0.68 | 0.70 |
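The AUC values reported for the ROC analysis can be understood through the sketch below, which computes an AUC in two equivalent ways: as the fraction of positive-negative pairs ranked correctly, and as the trapezoidal area under the ROC points. The scores are invented toy values and the function names are hypothetical; the DeLong comparison of two correlated AUCs used in the study is not implemented here.

```python
def auc_by_ranking(pos_scores, neg_scores):
    """Probability that a positive case scores higher than a negative one (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def roc_points(pos_scores, neg_scores):
    """(false positive rate, true positive rate) pairs for every distinct threshold."""
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given points ordered by increasing x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    positives = [0.9, 0.8, 0.4]  # toy feature values for the class treated as positive
    negatives = [0.5, 0.3, 0.2]
    print(auc_by_ranking(positives, negatives))
    print(trapezoid_area(roc_points(positives, negatives)))
```

Both routes give the same value, which is why an AUC near 0.93 or 0.94 can be read as the chance that a randomly chosen case of one type has a higher feature value than a randomly chosen case of the other.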
The various examinations revealed no difference in prediction accuracy between Features_2d and Features_3d. The features selected using LASSO regression were almost completely different between Features_2d and Features_3d. What the selected features have in common is that many are wavelet-transformed features and that texture features outnumber first-order and shape features. The wavelet transform is one of the most commonly used preprocessing steps in radiomics analysis [26]. High- and low-pass filters decompose the image into high- and low-frequency components, from which features are obtained. Therefore, features that emphasize noise and edge components more strongly than those obtained without image processing can be detected. Texture features do not use the original pixel values of the image directly; instead, they transform the distribution of pixel values into a matrix and then calculate the feature values as scalar quantities. We therefore consider that the calculation of these diverse features made the prediction accuracy of Features_2d, which contains only 2D information, equal to that of Features_3d. However, because the amount of information in Features_3d is greater than that in Features_2d, we believe that the features without image processing (“original”) were also able to produce high regression coefficients. Fig. 3 shows cases in our study that could be classified as LCC or SCC and cases that could not. The histological types of lung cancer can be distinguished by features such as spiculation, notches, ground-glass density, and cavitation [27-29]. For example, cavitation has been reported to occur often in SCC and adenocarcinoma. In the analysis using Features_2d, the wHHL_RV values tended to be larger in SCC cases. Because the “variance” in wHHL_RV increases with the variation of CT values, cavitation may have contributed to this variation.
Additionally, because wHHL_f_90P increases with increasing CT value, nodules with a solid internal structure were assigned to LCC. In Features_3d, GLRLM and GLSZM features were selected; these are calculated from matrices representing runs and zones of identical gray-level values. In the analysis using Features_3d, the higher the GLRLM and GLSZM features, the more likely the case was to be considered SCC. Therefore, even though SCC is considered the type in which cavitation occurs most frequently, many continuous CT values are deemed to be present in three dimensions.
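To make concrete how a texture feature turns pixel values into a matrix and then into a scalar, the sketch below builds a gray-level run-length tally for a small 2D image along the horizontal direction only and derives three GLRLM features from it. PyRadiomics evaluates several directions and combines them; restricting the example to one direction, the toy image, and the function names are simplifications assumed here.

```python
from collections import Counter


def run_lengths_horizontal(image):
    """Count (gray level, run length) pairs along each row of a 2D image."""
    runs = Counter()
    for row in image:
        start = 0
        for k in range(1, len(row) + 1):
            # A run ends at the end of the row or where the gray level changes.
            if k == len(row) or row[k] != row[start]:
                runs[(row[start], k - start)] += 1
                start = k
    return runs


def short_run_emphasis(runs):
    total = sum(runs.values())
    return sum(count / (length * length) for (_, length), count in runs.items()) / total


def long_run_emphasis(runs):
    total = sum(runs.values())
    return sum(count * length * length for (_, length), count in runs.items()) / total


def run_variance(runs):
    """Variance of the run lengths, weighted by how often each run occurs."""
    total = sum(runs.values())
    mean = sum(count * length for (_, length), count in runs.items()) / total
    return sum(count * (length - mean) ** 2 for (_, length), count in runs.items()) / total


if __name__ == "__main__":
    image = [
        [1, 1, 1, 1],  # one long run
        [1, 2, 1, 2],  # four runs of length 1
        [3, 3, 2, 2],  # two runs of length 2
    ]
    runs = run_lengths_horizontal(image)
    print(short_run_emphasis(runs), long_run_emphasis(runs), run_variance(runs))
```

Here the image contains seven runs; a mixture of one long run with many short ones gives a large run variance, whereas an image made only of equal-length runs would give zero.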
However, the prediction accuracies of both features were unsatisfactory. The sensitivities to LCC and SCC obtained from the decision boundaries of LDA were 86.8% and 77.5% for Features_2d and 89.5% and 75.0% for Features_3d, respectively. The R2 values were approximately 0.7 for both features. We used data from only 38 LCC and 40 SCC cases, and this small number of cases resulted in lower accuracy. The image data used in this study were obtained from the TCIA dataset. Therefore, the imaging and detailed reconstruction conditions may differ among the CT images. Accuracy degradation owing to uncertainty caused by CT image acquisition conditions and other factors has also been reported [30]. In particular, slice thickness is an important factor in this study. Previous reports have indicated that slice thickness affects radiomics analysis [14]. Changing the slice thickness is believed to change the sharpness and noise characteristics of CT images, which may affect radiomics analysis. However, we believe that isotropic voxelization minimized the effect of slice thickness in this study, as well as the effects of differences in slice spacing, FOV, and noise properties between cases. Each case was reconstructed using a kernel for the lung fields; however, the frequency response is not always the same. Therefore, the reconstruction kernel might have affected the construction of the predictive model. To achieve higher accuracy, the circumstances of image data acquisition must be considered, for example, by fixing the imaging conditions at the facility. This is a limitation of this study. Furthermore, because the segmentation was performed by one author, the related results may be biased. However, the segmentation was evaluated by all the authors, and corrections were made accordingly.
We consider that the accuracy of the segmentation did not differ between the 2D and 3D segmentations because both were performed by the same author. The ROC curves clearly demonstrated that the difference between the two feature sets was minimal, and no statistically significant difference was found in accuracy between them. Therefore, no difference was observed in prediction accuracy between Features_2d and Features_3d in the radiomics analysis of LCC and SCC using CT images. The image data size used to obtain Features_2d was approximately 600 MB, whereas that used to obtain Features_3d was approximately 10 GB. Lesion segmentation can also be achieved much more quickly with the 2D method. Therefore, 2D analysis is recommended for predicting LCC and SCC using radiomics analysis.
In this study, we investigated whether a difference exists in the accuracy of radiomics analysis for differentiating between two types of lung cancer when Features_2d and Features_3d are employed. Various examinations (i.e., LDA, sensitivity, RMSE, R2, and ROC analysis) revealed no difference in differentiation accuracy between Features_2d and Features_3d. Considering the amount of segmentation work and the amount of data used in the analysis, Features_3d has many disadvantages. Features_2d yields results comparable to those of Features_3d with a smaller data size. Therefore, Features_2d is recommended for radiomics analysis to differentiate between LCC and SCC.
The authors have nothing to disclose.
The data that support the findings of this study are available on request from the corresponding author.
Conceptualization: Ryohei Fukui, Ryutarou Matsuura, Katsuhiro Kida, Sachiko Goto. Data curation: Ryohei Fukui, Katsuhiro Kida. Formal analysis: Ryohei Fukui, Ryutarou Matsuura. Investigation: Ryohei Fukui. Methodology: Ryohei Fukui, Ryutarou Matsuura, Katsuhiro Kida, Sachiko Goto. Project administration: Ryohei Fukui, Sachiko Goto. Resources: Ryohei Fukui. Software: Ryohei Fukui. Supervision: Ryohei Fukui, Sachiko Goto. Validation: Ryohei Fukui, Katsuhiro Kida. Visualization: Ryohei Fukui. Writing - original draft: Ryohei Fukui. Writing - review & editing: Ryohei Fukui, Ryutarou Matsuura, Katsuhiro Kida, Sachiko Goto.
Department of Radiological Technology, Faculty of Health Sciences, Okayama University, Okayama, Japan
Correspondence to:Ryohei Fukui
(rfukui@okayama-u.ac.jp)
Tel: 81-86-235-6907
Fax: 81-86-222-3717
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Purpose: In radiomics analysis, to evaluate features, and predict genetic characteristics and survival time, the pixel values of lesions depicted in computed tomography (CT) and magnetic resonance imaging (MRI) images are used. CT and MRI offer three-dimensional images, thus producing three-dimensional features (Features_3d) as output. However, in reports, the superiority between Features_3d and two-dimensional features (Features_2d) is distinct. In this study, we aimed to investigate whether a difference exists in the prediction accuracy of radiomics analysis of lung cancer using Features_2d and Features_3d.
Methods: A total of 38 cases of large cell carcinoma (LCC) and 40 cases of squamous cell carcinoma (SCC) were selected for this study. Two- and three-dimensional lesion segmentations were performed. A total of 774 features were obtained. Using least absolute shrinkage and selection operator regression, seven Features_2d and six Features_3d were obtained.
Results: Linear discriminant analysis revealed that the sensitivities of Features_2d and Features_3d to LCC were 86.8% and 89.5%, respectively. The coefficients of determination through multiple regression analysis and the areas under the receiver operating characteristic curve (AUC) were 0.68 and 0.70 and 0.93 and 0.94, respectively. The P-value of the estimated AUC was 0.87.
Conclusions: No difference was found in the prediction accuracy for LCC and SCC between Features_2d and Features_3d.
Keywords: Radiomics, Computed tomography, Lung cancer, Least absolute shrinkage and selection operator, Linear discriminant analysis
Evidence-based cancer treatment plans are generally determined and recommended by relevant academic societies and other scientific organizations. In recent years, attempts have been made to personalize medicine by predicting the prognosis of each patient according to blood, tissue, imaging, and genomic information, an approach referred to as precision medicine [1,2]. In precision medicine, biopsy of cancerous tissues is essential; however, it is highly invasive, and mutations with good and poor prognoses are deemed to coexist in the same cancer tissue [3]. Recently, radiomics, which is an image analysis technique, has attracted the attention of researchers [4]. The term “radiomics” was coined from “radiology” and “-omics” which refers to a science that deals with a large amount of information in a systematic manner. Radiomics analysis uses the pixel values of tissues and lesions as diagnostic radiological images, such as those of computed tomography (CT) and magnetic resonance imaging (MRI). Features, such as lesion shape, histogram, and texture, are obtained from these pixel values. A prediction model is then constructed via the radiomics analysis using features that are highly correlated with the cancer type and genomic information. Numerous studies have reported on the utility of radiomics in lung cancer, brain tumor, and breast cancer [5-7]. Construction of a survival prediction model by analyzing the relationship between features and survival time has also been reported [8,9]. A main challenge before a successful construction of a prediction model in radiomics is the segmentation of the radiological image to specify a given lesion. To simplify this task, a three-dimensional (3D) Slicer is often used as a segmentation tool [10]. The Cancer Imaging Archive (TCIA; Frederick National Laboratory for Cancer Research, Frederick, MD, USA)—a public database of CT images—provides presegmented samples that can be used to initiate a study easily [11]. 
However, in practice, to improve study originality, a study is often initiated using radiological images from the same institution, and lesion segmentation is an unavoidable task. In cases of CT and MRI image, segmentation in only one slice of the lesion can yield two-dimensional features (Features_2d). Extracting three-dimensional features (Features_3d) by performing segmentation in three dimensions is possible as these images offer 3D images. However, 3D segmentation requires significantly more work than two-dimensional (2D) segmentation. Theoretically, if no differences exist in the prediction model accuracy between Features_3d and Features_2d, the workload involved in starting a radiomics study may be reduced. Therefore, in this study, we aimed to investigate whether a difference exists in the accuracy of radiomics analysis of lung cancer CT using Features_2d and Features_3d.
CT images of lung cancer cases from non-small cell lung cancer (NSCLC)-Radiomics, a public database of TCIA, were used. The database contains 422 patients with NSCLC. Among them, 114 were diagnosed with large cell carcinoma (LCC) and 152 with squamous cell carcinoma (SCC). Other adenocarcinoma cases were also registered; however in this study, LCC as a case of non-SCC and SCC were included. By constructing prediction models for LCC and SCC, the accuracies of Features_2d and Features_3d were verified. In this study, some cases from the LCC and SCC datasets were excluded. The exclusion criteria were as follows: patients who underwent contrast-enhanced imaging, patients with poor breath-holding, and patients with lesions surrounded with pleural effusion or pneumonia. After excluding patients who satisfied the above criteria, 38, and 40 patients were selected for LCC and SCC, respectively. Details of the selected patients are shown in Table 1. The American Joint Committee on Cancer staging criteria were used in Table 1 [12]. These patients were imaged using CT equipment provided by Siemens Healthineers (Erlangen, Germany) or CMS Imaging (North Charleston, SC, USA). The tube voltage was 120 kV and auto exposure control was used for all cases. The matrix size was 512×512 pixels, while the field of view (FOV), slice thickness, and slice spacing varied depending on the facility where the images were taken. Therefore, because the sampling intervals in the x, y, and z directions differed in the cases, isotropic voxelization was performed. Isotropic voxelization includes the nearest neighbor interpolation, linear interpolation, and cubic interpolation methods. Among these methods, the cubic interpolation method was employed as isotropic voxelization is reported to be able to calculate features while maintaining texture [13,14]. In this step, the voxel size was 2×2×2 mm in all cases. 
Images were reconstructed using the filtered back projection method and the kernel to observe the lung fields. The CT values were the pixel values of the CT images. Therefore, the pixel values of the images were used without any correction because they were comparable even if the equipment and imaging conditions were different. An ethical review for the conduction of this study was not sought owing to the public availability of the image data used in this study.
Table 1 . Details of the selected cases.
Age (y) | Sex (cases) | Stage (cases) | ||||||
---|---|---|---|---|---|---|---|---|
Male | Female | I | II | IIIa | IIIb | |||
LCC | 65.6±10.6 | 22 | 16 | 11 | 3 | 12 | 12 | |
SCC | 70.0±10.3 | 24 | 16 | 11 | 8 | 10 | 11 |
The 3D Slicer (v5.2.1; Brigham and Women’s Hospital, Boston, MA, USA) was used for lesion segmentation. An example of a lesion segmentation using the 3D Slicer is shown in Fig. 1. One of the authors (a radiological technologist with 14 years of experience) performed the lesion segmentation, which was confirmed by another author. The 2D segmentation was performed on the slice with the largest tumor diameter, which was selected, and agreed by all the authors. For 2D segmentation, we used the level tracing function of the 3D Slicer, which only automatically selects the pixel values closest to the lesion. The region where the boundary of the level tracing function matched the tumor boundary the most was adopted as the segmentation (Fig. 1a). For semiautomatic 3D segmentation, the GrowCut method was used [15]. GrowCut automatically divides the tumor and its surroundings by manually selecting points in and around the tumor. If the entire tumor is not well selected, another point can be selected to achieve whole-tumor segmentation. The accuracy of the segmentation was confirmed through a review from three directions with the co-author (Fig. 1b). The aforementioned techniques provided images of lesions segmented in two and three dimensions.
To calculate the radiomic features, PyRadiomics (Harvard Medical School, Boston, MA, USA), a Python package, was used [16]. In this study, 19 first-order, 16 shape, and 75 texture features were calculated. In Table 2, all calculated radiomic features are shown. The same features were obtained for both Features_2d and Features_3d. In wavelet-transformed features, wavelet image processing was added to these features. Consequently, 774 features were obtained. They were standardized using the following formula:
Table 2. All radiomic features used in this study.
Category | Feature names |
---|---|
Shape | Elongation, Flatness, LeastAxisLength, MajorAxisLength, Maximum2DDiameterColumn, Maximum2DDiameterRow, Maximum2DDiameterSlice, Maximum3DDiameter, MeshVolume, MinorAxisLength, Sphericity, SurfaceArea, SurfaceVolumeRatio, VoxelVolume |
Firstorder | 10Percentile, 90Percentile, Energy, Entropy, InterquartileRange, Kurtosis, Maximum, MeanAbsoluteDeviation, Mean, Median, Minimum, Range, RobustMeanAbsoluteDeviation, RootMeanSquared, Skewness, TotalEnergy, Uniformity, Variance |
GLCM | Autocorrelation, JointAverage, ClusterProminence, ClusterShade, ClusterTendency, Contrast, Correlation, DifferenceAverage, DifferenceEntropy, DifferenceVariance, JointEnergy, JointEntropy, Imc1, Imc2, Idm, Idmn, Id, Idn, InverseVariance, MaximumProbability, SumEntropy, SumSquares |
GLSZM | GrayLevelNonUniformity, GrayLevelNonUniformityNormalized, GrayLevelVariance, HighGrayLevelZoneEmphasis, LargeAreaEmphasis, LargeAreaHighGrayLevelEmphasis, LargeAreaLowGrayLevelEmphasis, LowGrayLevelZoneEmphasis, SizeZoneNonUniformity, SizeZoneNonUniformityNormalized, SmallAreaEmphasis, SmallAreaHighGrayLevelEmphasis, SmallAreaLowGrayLevelEmphasis, ZoneEntropy, ZonePercentage, ZoneVariance |
GLRLM | GrayLevelNonUniformity, GrayLevelNonUniformityNormalized, GrayLevelVariance, HighGrayLevelRunEmphasis, LongRunEmphasis, LongRunHighGrayLevelEmphasis, LongRunLowGrayLevelEmphasis, LowGrayLevelRunEmphasis, RunEntropy, RunLengthNonUniformity, RunLengthNonUniformityNormalized, RunPercentage, RunVariance, ShortRunEmphasis, ShortRunHighGrayLevelEmphasis, ShortRunLowGrayLevelEmphasis |
GLDM | DependenceEntropy, DependenceNonUniformity, DependenceNonUniformityNormalized, DependenceVariance, GrayLevelNonUniformity, GrayLevelVariance, HighGrayLevelEmphasis, LargeDependenceEmphasis, LargeDependenceHighGrayLevelEmphasis, LargeDependenceLowGrayLevelEmphasis, LowGrayLevelEmphasis, SmallDependenceEmphasis, SmallDependenceHighGrayLevelEmphasis, SmallDependenceLowGrayLevelEmphasis |
Least absolute shrinkage and selection operator (LASSO) regression was used to select, from the output features, those useful for discriminating between LCC and SCC [17]. LASSO regression is a regularized linear regression method that adds the sum of the absolute values of the weights (L1 regularization term) to the least-squares cost function, as follows:
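In its usual textbook form, the LASSO cost is (1/(2n)) Σ (y − Xw)² + λ Σ |w|, and the L1 term is what drives some weights exactly to zero. The following is a minimal coordinate-descent sketch under that assumption; the 1/(2n) scaling, the function names, and the toy data are illustrative and are not the authors' implementation. The value λ = 0.08 simply echoes the value reported for Features_2d in Table 3.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; this is what sets weights exactly to zero."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * sum((y - Xw)^2) + lam * sum(|w|) by cyclic updates.

    X is a list of rows (cases) of standardized features; y holds the targets.
    No intercept is fitted, so y is assumed to be centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the residual excluding j
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return w

# Toy data: feature 0 tracks the target, feature 1 is unrelated to it.
X = [[-1.0, 0.3], [-1.0, -0.3], [1.0, 0.3], [1.0, -0.3]]
y = [-1.0, -1.0, 1.0, 1.0]
weights = lasso_coordinate_descent(X, y, lam=0.08)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy case only feature 0 keeps a nonzero weight, which is the sense in which LASSO "selects" features: those left with nonzero weights at the chosen λ are retained.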
Table 3 . Selected features using least absolute shrinkage and selection operator regression.
Features_2d (lambda value=0.08) | Features_3d (lambda value=0.09) |
---|---|
wavelet-LHH_glrlm_RunVariance | original_glrlm_ShortRunHighGrayLevelEmphasis |
wavelet-LHH_gldm_LargeDependenceLowGrayLevelEmphasis | original_glszm_HighGrayLevelZoneEmphasis |
wavelet-HLL_glrlm_ShortRunEmphasis | wavelet-HLH_firstorder_Kurtosis |
wavelet-HLH_glcm_Imc2 | wavelet-HHH_glcm_Imc2 |
wavelet-HHL_firstorder_90Percentile | wavelet-HHH_gldm_DependenceVariance |
wavelet-HHL_glrlm_LongRunEmphasis | wavelet-LLL_gldm_LargeDependenceHighGrayLevelEmphasis |
wavelet-HHL_glrlm_RunVariance |
Fisher’s linear discriminant analysis (LDA) was used to classify LCC and SCC, with the Features_2d and Features_3d selected by LASSO regression as input [23]. Because LDA generates linear decision boundaries, its output represents the degree to which the features classify the two groups. Multiple regression analysis was performed to calculate the root mean square error (RMSE) and the coefficient of determination (R2). In this regression, 70% of the cases were used as training data (27 LCC and 28 SCC cases), and 30% were used as validation data (11 LCC and 12 SCC cases). The receiver operating characteristic (ROC) curve was calculated using the values of the features with the highest regression coefficient.
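To make the idea of a linear decision boundary concrete, the sketch below fits Fisher's discriminant for two features: the projection direction is w = S_w⁻¹(μ_B − μ_A), and a case is assigned according to which side of a threshold its projection falls on. Placing the threshold at the midpoint of the projected class means (that is, equal priors), the function names, and the toy feature values are assumptions for illustration only.

```python
def mean_vector(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, mu):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix of one class."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fit_fisher_lda(class_a, class_b):
    """Return (w, threshold); projections above the threshold fall on class_b's side."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    axx, axy, ayy = scatter(class_a, mu_a)
    bxx, bxy, byy = scatter(class_b, mu_b)
    # Within-class scatter is the sum of the two class scatter matrices.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    dx, dy = mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]
    # w = Sw^-1 (mu_b - mu_a), using the closed-form inverse of a 2x2 matrix.
    w = [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]
    # Assumption: threshold at the projected midpoint of the class means.
    mid = [(mu_a[0] + mu_b[0]) / 2, (mu_a[1] + mu_b[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold

def predict(x, w, threshold):
    return "SCC" if w[0] * x[0] + w[1] * x[1] > threshold else "LCC"

# Hypothetical standardized values of two features per case.
lcc = [(-1.0, -0.5), (-0.5, -1.0), (-1.5, -1.0), (-1.0, -1.5)]
scc = [(1.0, 0.5), (0.5, 1.0), (1.5, 1.0), (1.0, 1.5)]
w, threshold = fit_fisher_lda(lcc, scc)
labels = [predict(x, w, threshold) for x in lcc + scc]
```

The line w·x = threshold is the decision boundary; with real, overlapping data some cases fall on the wrong side, which is what the sensitivity and specificity figures summarize.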
The two features with the highest regression coefficients in Features_2d and Features_3d are plotted in Fig. 2, together with the decision boundaries drawn by LDA, to show that the selected features can classify LCC and SCC. The features with the highest regression coefficients were “wavelet-HHL_firstorder_90Percentile” (wHHL_f_90P) and “wavelet-HHL_glrlm_RunVariance” (wHHL_RV) in Features_2d, and “original_glszm_HighGrayLevelZoneEmphasis” (o_HGLZE) and “original_glrlm_ShortRunHighGrayLevelEmphasis” (o_SRHGLE) in Features_3d. The classification of LCC and SCC using LDA is presented in Table 4. The sensitivity and specificity (sensitivity to SCC) of the LCC classification were 86.8% and 77.5%, respectively, in Features_2d and 89.5% and 75.0%, respectively, in Features_3d. Additionally, two images of each case classified by the prediction model are shown in Fig. 3. In Features_2d, cases with high wHHL_RV and low wHHL_f_90P were expected to be SCC. In Features_3d, cases with high o_HGLZE and o_SRHGLE were expected to be SCC. The RMSE and R2 results from the multiple regression analysis are shown in Table 5; the RMSE and R2 values were similar for Features_2d and Features_3d. Furthermore, ROC curves were created using wHHL_RV and o_SRHGLE, which had the highest regression coefficients in Features_2d and Features_3d, respectively (Fig. 4). The AUCs were 0.93 and 0.94, respectively. The AUCs were compared using the DeLong method, with a P-value of 0.87.
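The AUC of a single feature can be computed without drawing the curve: it equals the probability that a randomly chosen positive case has a higher feature value than a randomly chosen negative case (the Mann-Whitney formulation). The sketch below illustrates this with invented values; treating SCC as the positive class is an assumption, and the DeLong comparison itself is not reproduced here.

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))

# Hypothetical standardized feature values; SCC is treated as the positive class.
scc = [0.9, 0.4, 1.3, 0.7]
lcc = [-0.8, 0.5, -0.2, 0.1]
area = auc(scc, lcc)  # 15 of the 16 pairs are ranked correctly -> 0.9375
```

An AUC of 0.5 corresponds to a feature that ranks cases no better than chance, and 1.0 to a feature that separates the two groups perfectly.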
Table 4. Estimation of sensitivity and specificity using linear discriminant analysis.
Truth | Features_2d output: LCC | Features_2d output: SCC | Features_3d output: LCC | Features_3d output: SCC |
---|---|---|---|---|
LCC | 86.8% (33/38) | 13.2% (5/38) | 89.5% (34/38) | 10.5% (4/38) |
SCC | 22.5% (9/40) | 77.5% (31/40) | 25.0% (10/40) | 75.0% (30/40) |
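The percentages in Table 4 follow directly from the case counts. As a check, the short sketch below recomputes sensitivity and specificity with LCC as the positive class, as in the text; the function name is illustrative.

```python
def rates(true_positive, false_negative, true_negative, false_positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = true_positive / (true_positive + false_negative)
    specificity = true_negative / (true_negative + false_positive)
    return sensitivity, specificity

# Counts from Table 4, with LCC as the positive class.
sens_2d, spec_2d = rates(33, 5, 31, 9)
sens_3d, spec_3d = rates(34, 4, 30, 10)
print(f"2D: {sens_2d:.1%}, {spec_2d:.1%}")  # 2D: 86.8%, 77.5%
print(f"3D: {sens_3d:.1%}, {spec_3d:.1%}")  # 3D: 89.5%, 75.0%
```

The difference between the two feature sets amounts to one case in each row: one more LCC case and one fewer SCC case classified correctly with Features_3d.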
Table 5. Results of the multiple regression analysis.
Data | Metric | Features_2d | Features_3d |
---|---|---|---|
Train | RMSE | 0.34 | 0.30 |
Train | R2 | 0.72 | 0.75 |
Test | RMSE | 0.41 | 0.34 |
Test | R2 | 0.68 | 0.70 |
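For reference, the two measures in Table 5 are conventionally defined as RMSE = sqrt(mean((y − ŷ)²)) and R2 = 1 − SS_res / SS_tot. The sketch below assumes these standard definitions and a 0/1 coding of LCC and SCC as the regression target (an assumption); the predicted values are invented and do not reproduce the study's results.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical targets (0 = LCC, 1 = SCC) and regression outputs.
y_true = [0, 0, 1, 1]
y_pred = [0.1, 0.3, 0.8, 0.6]
error = rmse(y_true, y_pred)
fit = r_squared(y_true, y_pred)
```

A lower RMSE and an R2 closer to 1 both indicate a better fit; comparing the Train and Test rows shows how much of that fit carries over to cases not used for fitting.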
Various examinations revealed no difference in the prediction accuracy between Features_2d and Features_3d. The features selected using LASSO regression were almost completely different between Features_2d and Features_3d. What the selected features have in common is the presence of many wavelet-transformed features and more texture features than first-order or shape features. One of the most commonly used preprocessing steps in radiomics analysis is the wavelet transform [26]. High- and low-pass filters can be used to obtain image features that are decomposed into low- and high-frequency components. Therefore, it is possible to detect features that emphasize noise and edge components more than those calculated without image processing. Texture features do not use the original pixel values of the image; instead, they transform the distribution of pixel values into a matrix and then calculate the feature values as scalar quantities. Therefore, calculating these diverse features is considered to have made the prediction accuracy of Features_2d, which has only 2D information, equal to that of Features_3d. However, because the amount of information in Features_3d is greater than that in Features_2d, we believe that the features without image processing (“original”) were also able to produce high regression coefficients. Fig. 3 shows the cases in our study that could be classified as LCC or SCC and those that could not. The histological types of lung cancer can be distinguished by features such as spiculation, notches, ground-glass density, and cavitation [27-29]. For example, cavitation has been reported to often occur in cases of SCC and adenocarcinoma. In the analysis using Features_2d, the wHHL_RV values tended to be larger in SCC cases. Because “variance” in wHHL_RV increases with the variation of CT values, cavitation may have contributed to the variation of CT values.
Additionally, because wHHL_f_90P increases with increasing CT values, nodules with a solid internal structure were assigned as LCC. In Features_3d, GLRLM and GLSZM features were selected; these are based on matrices that represent sequences of identical gray-level values. In the analysis using Features_3d, the higher the GLRLM and GLSZM feature values, the more likely a case was to be considered SCC. Therefore, even though SCC is considered the type that most frequently shows cavitation, many runs of continuous CT values are deemed to be present in three dimensions.
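To illustrate how a texture feature turns pixel values into a matrix and then into a scalar, the sketch below builds a gray level run length matrix (GLRLM) for a single direction and computes the variance of run length from it. This is a simplified illustration: it uses one direction only and tiny invented patches, whereas radiomics software typically combines several directions and discretizes the gray levels first.

```python
from collections import Counter

def horizontal_runs(image):
    """Count runs of identical gray levels along each row.

    Returns a Counter mapping (gray_level, run_length) -> number of runs,
    i.e. a sparse run length matrix for the horizontal direction.
    """
    runs = Counter()
    for row in image:
        start = 0
        for idx in range(1, len(row) + 1):
            # A run ends at the end of the row or when the gray level changes.
            if idx == len(row) or row[idx] != row[start]:
                runs[(row[start], idx - start)] += 1
                start = idx
    return runs

def run_variance(runs):
    """Variance of run length, weighting each matrix cell by its share of runs."""
    total = sum(runs.values())
    mean_len = sum(length * count for (_, length), count in runs.items()) / total
    return sum(count * (length - mean_len) ** 2
               for (_, length), count in runs.items()) / total

# Hypothetical 3x4 patches with already discretized gray levels.
uniform = [[1, 1, 1, 1], [2, 2, 2, 2], [1, 1, 1, 1]]
mixed = [[1, 1, 1, 1], [1, 2, 1, 2], [2, 2, 1, 1]]
rv_uniform = run_variance(horizontal_runs(uniform))
rv_mixed = run_variance(horizontal_runs(mixed))
```

In the uniform patch every run has length 4, so the run variance is 0; the mixed patch contains runs of length 1, 2, and 4, so the run variance is larger (52/49, about 1.06).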
However, the prediction accuracies of both features were unsatisfactory. The sensitivities of LCC and SCC obtained from the decision boundaries of LDA were 86.8% and 77.5% for Features_2d and 89.5% and 75.0% for Features_3d, respectively. The R2 values were approximately 0.7 for both features. In this study, we used data from 38 LCC and 40 SCC cases, and the small number of cases resulted in lower accuracy. The image data used in this study were obtained from the TCIA dataset. Therefore, although all the images are CT images, the imaging and detailed reconstruction conditions may differ between cases. Accuracy degradation owing to uncertainty caused by the CT image acquisition conditions and other factors has also been reported [30]. In particular, slice thickness is an important factor in this study. Previous reports have indicated that slice thickness affects radiomics analysis [14]. Changing the slice thickness is believed to also change the sharpness and noise characteristics of CT images, which may affect radiomics analysis. However, in this study, we believe that isotropic voxelization minimized the effect of the slice thickness. Furthermore, we were also able to minimize the effects of differences in slice spacing, FOV, and noise properties between cases. Each case was reconstructed using a kernel for the lung fields; however, the frequency response is not always the same. Therefore, the reconstruction kernel used might have affected the construction of the predictive model. To achieve higher accuracy, the conditions at the time of image acquisition must be considered, for example, by fixing the imaging conditions at the facility. This is a limitation of this study. Furthermore, because the segmentation in this study was performed by one author, the related results may be biased. However, the segmentation was evaluated by all the authors, and corrections were made accordingly.
We consider that the accuracy of the segmentation did not differ between the 2D and 3D segmentations because both were performed by one author. However, the ROC curve clearly demonstrated that the difference between the two features was minimal. No statistically significant difference was found in the accuracies between the two features. Therefore, no difference in prediction accuracy was observed between Features_2d and Features_3d in the radiomics analysis of LCC and SCC using CT images. The image data size used to obtain Features_2d was approximately 600 MB, whereas that used to obtain Features_3d was approximately 10 GB. Lesion segmentation can also be achieved much more quickly with a 2D method. Therefore, a 2D analysis is recommended for predicting LCC and SCC using radiomics analysis.
In this study, we investigated whether a difference exists in the accuracy of radiomics analysis for differentiating between two types of lung cancer when Features_2d and Features_3d are employed in the analysis. Various examinations (i.e., LDA, sensitivity, RMSE, R2, and ROC analysis) revealed no difference in the differentiation accuracy between Features_2d and Features_3d. Considering the amount of segmentation work and the amount of data used in the analysis, Features_3d has many disadvantages. Features_2d yields results comparable to those of Features_3d with a smaller data size. Therefore, Features_2d is recommended for radiomics analysis to differentiate between LCC and SCC.
The authors have nothing to disclose.
The data that support the findings of this study are available on request from the corresponding author.
Conceptualization: Ryohei Fukui, Ryutarou Matsuura, Katsuhiro Kida, Sachiko Goto. Data curation: Ryohei Fukui, Katsuhiro Kida. Formal analysis: Ryohei Fukui, Ryutarou Matsuura. Investigation: Ryohei Fukui. Methodology: Ryohei Fukui, Ryutarou Matsuura, Katsuhiro Kida, Sachiko Goto. Project administration: Ryohei Fukui, Sachiko Goto. Resources: Ryohei Fukui. Software: Ryohei Fukui. Supervision: Ryohei Fukui, Sachiko Goto. Validation: Ryohei Fukui, Katsuhiro Kida. Visualization: Ryohei Fukui. Writing - original draft: Ryohei Fukui. Writing - review & editing: Ryohei Fukui, Ryutarou Matsuura, Katsuhiro Kida, Sachiko Goto.
pISSN 2508-4445
eISSN 2508-4453
Formerly ISSN 1226-5829
Frequency: Quarterly