Progress in Medical Physics 2024; 35(4): 106-115
Published online December 31, 2024
https://doi.org/10.14316/pmp.2024.35.4.106
Copyright © Korean Society of Medical Physics.
Dong Hyeok Choi1,2,3, Jin Sung Kim1,2,3, So Hyun Ahn4,5,6
1Department of Medicine, Yonsei University College of Medicine, Seoul, 2Medical Physics and Biomedical Engineering Lab (MPBEL), Yonsei University College of Medicine, Seoul, 3Department of Radiation Oncology, Yonsei Cancer Center, Heavy Ion Therapy Research, 4Ewha Medical Research Institute, School of Medicine, Ewha Womans University, Seoul, 5Department of Biomedical Engineering, Ewha Womans University College of Medicine, Seoul, 6Ewha Medical Artificial Intelligence Research Institute, Ewha Womans University College of Medicine, Seoul, Korea
Correspondence to: So Hyun Ahn (mpsohyun@ewha.ac.kr)
Tel: 82-2-6986-6305
Fax: 82-0504-158-4052
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Purpose: This study aims to develop a comprehensive preprocessing workflow for Digital Imaging and Communications in Medicine (DICOM) files to facilitate their effective use in AI-driven medical applications. With the increasing use of DICOM data for AI training and analysis, Metaverse platform integration, and 3D printing of anatomical structures, streamlined preprocessing has become essential. The workflow is designed to optimize DICOM files for diverse applications, improving their usability and accessibility for advanced medical technologies.
Methods: The proposed workflow employs a systematic approach to preprocess DICOM files for AI applications, focusing on noise reduction, normalization, segmentation, and conversion to 3D-renderable formats. These steps are integrated into a unified process to address challenges such as data variability, format incompatibilities, and high computational demands. The study incorporates real-world medical imaging datasets to evaluate the workflow’s effectiveness and adaptability for AI analysis and 3D visualization. Additionally, the workflow’s compatibility with virtual environments, such as Metaverse platforms, is assessed to ensure seamless integration.
Results: The implementation of the workflow demonstrated significant improvements in the preprocessing of DICOM files. The processed files were optimized for AI analysis, yielding enhanced model performance and accuracy in learning tasks. Furthermore, the workflow enabled the successful conversion of DICOM data into 3D-printable formats and virtual environments, supporting applications like anatomical visualization and simulation. The study highlights the workflow's ability to reduce preprocessing time and errors, making advanced medical imaging technologies more accessible.
Conclusions: This study emphasizes the critical role of effective preprocessing in maximizing the potential of DICOM data for AI-driven applications and innovative medical solutions. The proposed workflow simplifies the preprocessing of DICOM files, facilitating their integration into AI models, Metaverse platforms, and 3D printing processes. By enhancing usability and accessibility, the workflow fosters broader adoption of advanced imaging technologies in the medical field.
Keywords: DICOM preprocessing, AI in medical imaging, 3D printing, Metaverse applications
In recent years, the integration of artificial intelligence (AI) within the medical field has revolutionized the approach to diagnostics, treatment planning, and personalized medicine [1-3]. With the advancement of imaging technologies, the Digital Imaging and Communications in Medicine (DICOM) standard has emerged as a cornerstone for medical imaging data [4-6]. As the usage of DICOM files grows alongside the increasing utilization of AI algorithms for training and analysis, the need for effective preprocessing methodologies becomes crucial [7].
The preprocessing of DICOM files serves multiple purposes, facilitating the extraction of meaningful features and patterns that can be harnessed by AI systems [8]. This process is critical in ensuring data quality and consistency, which directly impacts the performance and reliability of the AI models. Furthermore, the preprocessing phase lays the foundation for various downstream applications, such as image segmentation, classification, and disease detection [9]. By transforming raw DICOM data into structured formats suitable for AI analysis, researchers and clinicians can harness machine learning algorithms to enhance clinical decision-making and patient outcomes [10-12].
Additionally, the burgeoning field of the Metaverse has opened new avenues for the utilization of medical data [13]. By importing patient data into virtual environments, healthcare professionals can engage in immersive simulations, thereby enhancing the understanding of complex anatomical structures and improving surgical planning. The integration of three-dimensional (3D) printing technologies enables the physical manifestation of these digital models, facilitating the creation of patient-specific anatomical replicas [14,15]. These innovations have shown the potential to personalize treatment approaches and enhance patient engagement in their healthcare journeys.
To fully realize the benefits of AI and related technologies in the medical field, a comprehensive and user-friendly preprocessing workflow for DICOM files is essential. This paper presents a streamlined preprocessing pipeline designed to simplify the extraction, transformation, and loading of DICOM files into formats suitable for AI applications, as well as for use in Metaverse platforms and 3D printing [16-18]. By standardizing and optimizing preprocessing steps, this work aims to empower healthcare professionals and researchers, enabling them to leverage AI and 3D technologies to improve patient care and outcomes.
In recent studies, frameworks such as MIScnn have been developed to streamline medical image segmentation and preprocessing by integrating data preprocessing, augmentation, and deep learning model training into cohesive workflows [19]. These frameworks have significantly advanced the field by introducing robust pipelines for medical imaging data. However, they primarily focus on segmentation tasks and often lack support for integrating radiotherapy (RT) structure data or generating formats compatible with emerging applications such as 3D printing and Metaverse platforms. Our study addresses these limitations by introducing a unified preprocessing pipeline that harmonizes computed tomography (CT) and RT structure data, ensuring compatibility across diverse applications. Unlike existing frameworks, our approach not only facilitates segmentation but also prepares data for advanced uses, including RT planning, surgical simulation, and patient-specific 3D modeling. This pipeline bridges the gap between imaging modalities, providing an adaptable and efficient solution for modern medical imaging applications.
The primary distinction of this study lies in the development of a comprehensive preprocessing pipeline that manages both CT and RT structure data, transforming DICOM files into AI-compatible and 3D-printable formats. Previous research has primarily focused on converting DICOM data into single formats or offering limited preprocessing functionalities. For instance, Anderson et al. [20] introduced a Python module for converting DICOM and RT structure data into formats such as NumPy arrays (NumPy Development Team) or SimpleITK images (Kitware Inc.) for AI analysis. However, its capability to consistently adjust both CT and RT structure data to standardized formats, allowing for versatile format conversions like Neuroimaging Informatics Technology Initiative (NIfTI) and Stereolithography (STL), is limited. In contrast, our study supports conversions into multiple formats, enabling the medical data to be flexibly utilized across platforms, from AI analysis to 3D printing. In addition, the research by Mamdouh et al. [21] focuses on converting 2D DICOM images into 3D models yet does not address a standardized preprocessing pipeline that integrates multimodal data, including RT structure data. Our study is differentiated by integrating both CT and RT structure data, performing standardized alignment, resizing, and pixel interval normalization across multimodal data, and ensuring high compatibility with AI and 3D printing platforms.
This study specifically aims to address two critical limitations identified in previous research: the absence of a user-friendly and efficient preprocessing pipeline capable of handling both CT and RT structure data and converting them into AI-compatible and 3D-printable formats, and the necessity for a standardized methodology that maintains spatial and structural integrity across multiple modalities. These limitations impede the broader implementation of multimodal imaging in fields such as personalized medicine and surgical planning, while also restricting data utilization in innovative platforms like the Metaverse.
The primary objective of this research is to develop a robust preprocessing workflow capable of reliably transforming raw DICOM files from CT and RT structure modalities into standardized formats suitable for various downstream applications. Through the implementation of pixel interval normalization, spatial alignment, and intensity correction, this pipeline ensures the homogenization of all imaging data, thereby addressing the challenges of cross-modality data inconsistency. This methodological approach not only facilitates AI-driven image analysis but also enhances the compatibility of the processed data with 3D printing and virtual reality environments.
This study outlines a structured approach to preprocessing and normalizing CT and RT structure data, leveraging Python for data handling and processing. The pipeline is designed to standardize image resolutions, correct spatial inconsistencies, and prepare data for diverse applications, such as RT planning, surgical simulation, and patient-specific 3D modeling. Using Python libraries such as pydicom for DICOM handling, NumPy for numerical operations, and SimpleITK for image resampling, the pipeline automates the following key steps: loading raw DICOM files, normalizing pixel intervals to a target resolution, and converting data to multiple formats, including NIfTI, STL, and polygon file format (PLY). These formats ensure compatibility with AI algorithms, 3D printing, and Metaverse platforms, enhancing their utility for segmentation, registration, and 3D visualization tasks. Fig. 1 illustrates the Python-based workflow for preprocessing CT and RT structure data, highlighting an adaptable framework that ensures data compatibility across imaging applications.
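As a minimal illustration of the loading step, the sketch below reads a CT series into a single 3D volume with SimpleITK; the directory path is a hypothetical placeholder, and the study's actual loading code is not reproduced here.

```python
import SimpleITK as sitk

# Load every slice of one CT series as a single 3D volume.
# "ct_dicom_dir" is a hypothetical path, not from the paper.
reader = sitk.ImageSeriesReader()
series_ids = reader.GetGDCMSeriesIDs("ct_dicom_dir")
file_names = reader.GetGDCMSeriesFileNames("ct_dicom_dir", series_ids[0])
reader.SetFileNames(file_names)
ct_image = reader.Execute()

# Spacing, origin, and direction are taken from the DICOM metadata.
print(ct_image.GetSize(), ct_image.GetSpacing())
```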
The CT imaging data are processed through a structured pipeline that sequentially handles the loading, normalization, and preprocessing of each modality.
To establish standardized pixel dimensions across images acquired from multiple scanners or protocols, a systematic approach for pixel spacing normalization in CT images was implemented. The primary objective was to rescale images with varying resolutions to a standardized pixel spacing, enabling direct comparison between datasets. In this process, the images were resized if the original pixel spacing deviated from the predefined standard resolution. Interpolation techniques were applied during resizing to maintain image integrity while preserving anatomical details.
Raw DICOM files were used to load CT images, from which metadata, including slice position and image dimensions, was extracted. A pixel interval normalization technique was then applied to harmonize the resolution of all slices. In this study, pixel and slice spacing normalization was performed using isotropic resampling with a target spacing of 1 mm in each dimension (x, y, and z). By normalizing the pixel and slice spacing in all three dimensions, uniformity was achieved, which enhanced the compatibility of the imaging data with AI algorithms and 3D visualization tools, thereby facilitating accurate cross-modality analysis and alignment. After normalization, additional preprocessing steps were undertaken to correct the image intensity values and prepare the data for subsequent analyses. The processed CT images were saved in various formats, including NIfTI, STL, and PLY, and were then utilized for tasks such as machine learning, segmentation, and visualization.
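A minimal sketch of this isotropic resampling step with SimpleITK is shown below; the linear interpolator and the -1000 HU fill value (air) are assumptions, as the paper does not specify them.

```python
import SimpleITK as sitk

def resample_isotropic(image, new_spacing=(1.0, 1.0, 1.0)):
    """Resample a CT volume to 1 mm isotropic spacing (illustrative sketch)."""
    old_spacing, old_size = image.GetSpacing(), image.GetSize()
    new_size = [int(round(sz * sp / nsp))
                for sz, sp, nsp in zip(old_size, old_spacing, new_spacing)]
    return sitk.Resample(
        image, new_size, sitk.Transform(),   # identity transform
        sitk.sitkLinear,                     # assumed interpolator
        image.GetOrigin(), new_spacing, image.GetDirection(),
        -1000,                               # assumed fill value (air, in HU)
        image.GetPixelID())

ct_iso = resample_isotropic(ct_image)
sitk.WriteImage(ct_iso, "ct_iso.nii.gz")     # NIfTI export
```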
The RT structure data were processed through a structured pipeline that sequentially handled spatial alignment, resizing, and preprocessing of the contour data, ensuring that the RT contours remain properly aligned and compatible with the corresponding CT images.
Discrepancies in the pixel spacing between the RT and CT images may result in misalignment. To mitigate this, the RT contours were resized using a similar approach to ensure spatial alignment with the corresponding CT images. After resizing, the contours were centered and adjusted within a standardized 512×512-pixel grid, which is the commonly used dimension in clinical CT imaging. The resizing process followed specific rules: when a resized contour exceeded the 512×512 grid dimensions, it was systematically truncated by removing excess pixels while preserving the central region of interest. Conversely, if the resized contour was smaller than the standard grid, zero-value padding was symmetrically added to all sides until it matched the 512×512 dimensions. This standardization process ensured consistent sizing across all image slices while maintaining the anatomical integrity of the central structures.
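The crop-and-pad rule described above can be expressed compactly in NumPy; fit_to_grid below is an illustrative helper written from these stated rules, not the study's actual code.

```python
import numpy as np

def fit_to_grid(mask, size=512):
    """Center-crop or zero-pad a 2D contour mask to size x size (sketch)."""
    h, w = mask.shape
    if h > size:                        # truncate excess rows, keep the center
        top = (h - size) // 2
        mask = mask[top:top + size, :]
    if w > size:                        # truncate excess columns
        left = (w - size) // 2
        mask = mask[:, left:left + size]
    pad_h, pad_w = size - mask.shape[0], size - mask.shape[1]
    if pad_h > 0 or pad_w > 0:          # symmetric zero padding when smaller
        mask = np.pad(mask,
                      ((pad_h // 2, pad_h - pad_h // 2),
                       (pad_w // 2, pad_w - pad_w // 2)),
                      mode="constant", constant_values=0)
    return mask
```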
The contour data were obtained using MIM Software (MIM Software Inc.), a commercial tool widely recognized for its robust capabilities in medical image analysis and treatment planning. In our study, we utilized optional contour processing techniques to enhance the delineation of anatomical structures. The contour processing included smoothing and edge enhancement, which were applied based on the specific requirements of the analysis. Smoothing was used to reduce noise and irregularities in the contours, ensuring a more accurate representation of the anatomical boundaries. Edge enhancement was applied to sharpen the contours, making them more distinct and aiding precise segmentation, particularly in regions with closely packed or overlapping structures.
In this study, we utilized patient data approved by the Institutional Review Board under approval number 2023-07-001-002. The patient data were collected at Ewha Womans University Mokdong Hospital and included information from 10 patients. Anatomical regions of interest, such as the breast, liver, and spleen, were segmented.
We applied our preprocessing pipeline to both the CT images and the RT structure data, ensuring proper alignment and format conversion. As a result of the preprocessing, the CT and RT structure data were converted into NIfTI, STL, and PLY file formats. For visualization, we displayed the results in the NIfTI and STL formats. Although PLY files were also generated, they appear identical to the STL files when rendered in a 3D viewer. Therefore, only the STL results are presented in this paper.
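As a hedged sketch of how such a conversion can be performed, a binary organ mask can be meshed with marching cubes and exported to STL or PLY; scikit-image and trimesh are assumed here, since the paper does not name its meshing libraries, and liver_mask is a hypothetical 3D binary array.

```python
import numpy as np
import trimesh
from skimage import measure

def mask_to_mesh(mask, spacing=(1.0, 1.0, 1.0)):
    """Extract a surface mesh from a binary organ mask (illustrative sketch)."""
    verts, faces, _, _ = measure.marching_cubes(
        mask.astype(np.uint8), level=0.5, spacing=spacing)
    return trimesh.Trimesh(vertices=verts, faces=faces)

mesh = mask_to_mesh(liver_mask)   # liver_mask: hypothetical binary volume
mesh.export("liver.stl")          # format inferred from the file extension
mesh.export("liver.ply")          # same geometry in a PLY container
```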
The proposed preprocessing pipeline for CT and RT structure data was successfully implemented, demonstrating its ability to effectively normalize and harmonize imaging data across modalities.
In our study, we analyzed data from 10 patients, processing structures on the left breast, right breast, liver, and spleen for each patient. The average total processing time was 72.78 seconds per patient. During this time, we executed all necessary preprocessing steps for both the CT and RT structures, successfully converting them into the NIfTI, STL, and PLY formats. This comprehensive approach ensured that all relevant anatomical data were adequately prepared for further applications, including AI analysis, 3D printing, and integration into the Metaverse.
The preprocessing of patient data yielded several output formats generated from both the CT images and the RT structure data, as described in the methods section. Figs. 2 and 3 illustrate the results of the preprocessed data, showcasing both the CT data and the corresponding RT structures.
Figs. 2a, c display the preprocessed CT data, while Figs. 2b, d show corresponding RT structure information for the same slices presented in Figs. 2a, c, respectively. Specifically, Fig. 2b demonstrates the segmented left and right breasts, while Fig. 2d shows the segmented liver and spleen.
Fig. 3 highlights the results of the format conversion process. Fig. 3a presents the CT data after conversion into the STL format, showing how the original CT imaging data has been rendered into a 3D model suitable for various applications, including 3D printing and visualization. In contrast, Fig. 3b illustrates the RT structure data converted into the STL format. The organs displayed in Fig. 3b, from top to bottom, are the breasts, liver, and spleen, offering a clear visual reference for how the segmented structures align within the overall anatomical context of the CT scan.
These results affirm the efficacy of our preprocessing pipeline in preparing patient imaging data for downstream tasks by converting it into multiple file formats while preserving structural and spatial integrity. Although PLY files were also generated, only the STL files are shown in this study, as both formats appear visually identical when viewed in a 3D viewer.
The preprocessed data were successfully exported in NIfTI, STL, and PLY formats. These formats were chosen based on their compatibility with AI algorithms, 3D printing, and virtual reality applications in the Metaverse. The conversion process preserved the integrity of the preprocessed data, ensuring that the output files were of high quality and ready for downstream applications. The NIfTI format facilitated the use of these images in machine learning models for tasks, such as segmentation and classification, while the STL and PLY formats provided detailed 3D models for physical printing and virtual simulations.
The preprocessing of medical imaging data for AI learning is designed to be simple and accessible, enabling even non-experts to perform it. A core principle behind the development of the proposed preprocessing pipeline is user-friendliness, allowing healthcare professionals, researchers, and non-expert users to efficiently transform raw DICOM data into formats suitable for AI analysis. By streamlining the process and minimizing the need for extensive manual intervention, the workflow allows users to quickly prepare data for machine learning algorithms without requiring specialized programming skills or advanced computational knowledge.
Moreover, the workflow demonstrated strong efficiency in processing time. The entire preprocessing of a single patient’s CT images and RT structure data, including transformation into multiple output formats such as NIfTI, STL, and PLY, takes approximately 70 seconds on average. This rapid processing time underscores the efficiency of the pipeline, making it well suited to scenarios that require preparing large datasets or processing data from multiple patients within a limited time frame.
The primary contribution of this study lies in the design of a preprocessing workflow that systematically addresses the inherent challenges of handling raw DICOM files. By implementing pixel interval normalization, spatial alignment, and intensity correction, the pipeline ensures that all imaging data are homogenized, thus facilitating cross-modality analyses such as image registration, fusion, and multimodal diagnosis. These steps are crucial for enhancing the precision of AI models, especially when working with heterogeneous datasets obtained from different imaging devices or protocols.
Moreover, preprocessing DICOM files is a crucial prerequisite for downstream applications, such as 3D segmentation, volumetric analysis, and even surgical simulation, all of which are central to personalized medicine. Accurate segmentation and delineation of anatomical structures—essential for tumor identification or organ-specific treatment planning—depend heavily on the quality of the preprocessed data. The pipeline’s ability to enhance the consistency and integrity of the data ultimately improves the reliability of AI-based diagnostic tools.
Beyond AI, this study explores the integration of preprocessed medical data with emerging technologies, such as the Metaverse and 3D printing. The Metaverse, with its immersive and interactive environment, has tremendous potential in healthcare education, virtual consultations, and surgical planning. The ability to visualize complex anatomical structures in a virtual setting enables clinicians to explore patient-specific cases in greater detail, enhancing decision-making and procedural outcomes.
3D printing takes virtual visualization a step further by enabling the creation of physical models that replicate patient anatomy. These models are particularly useful for preoperative planning, the development of customized implants, and patient education. However, the accuracy of these models is directly tied to the quality of the digital data from which they are derived. The preprocessing pipeline presented in this study ensures that the DICOM data, when converted to formats like STL or PLY, are accurate and optimized for 3D printing applications.
By combining AI, the Metaverse, and 3D printing, the proposed framework addresses the growing need for patient-specific models and virtual environments, enhancing the understanding of patient anatomy and improving clinical outcomes. This multifaceted approach paves the way for innovations that could transform personalized medicine, enabling highly tailored diagnostic and therapeutic strategies.
The versatility of our processing pipeline with various AI models is a key consideration. While our current implementation focuses on specific applications, it is essential to explore how these processed 3D images can be utilized across different AI frameworks, which may have varying requirements for input data formats and preprocessing steps. Future research should focus on evaluating the compatibility of our outputs with commonly used AI models in medical imaging, ensuring that the generated files maintain anatomical fidelity and serve as effective inputs for advanced analytical tools. By addressing these practical limitations, we can enhance the overall impact and utility of our research in real-world clinical scenarios.
Additionally, a limitation lies in the computational cost of the preprocessing workflow. Although the pipeline has been optimized for efficiency, processing large datasets—particularly those involving high-resolution 3D images—can still be time-consuming. This issue could be mitigated by leveraging parallel processing techniques or cloud-based computing solutions, which would significantly reduce the time required for preprocessing large-scale data, particularly in clinical settings where timely decision-making is critical [22].
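As one sketch of such parallelization using only Python’s standard library, patient directories could be fanned out across worker processes; preprocess_patient and dicom_root are hypothetical placeholders for the pipeline steps described above.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def preprocess_patient(patient_dir: Path) -> str:
    """Placeholder: run the full CT/RT pipeline for one patient directory."""
    # ... load series, resample to 1 mm, align contours, export NIfTI/STL/PLY ...
    return f"done: {patient_dir.name}"

if __name__ == "__main__":
    patient_dirs = sorted(Path("dicom_root").iterdir())  # hypothetical root
    with ProcessPoolExecutor(max_workers=4) as pool:     # one process per core
        for status in pool.map(preprocess_patient, patient_dirs):
            print(status)
```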
Furthermore, integrating AI models into the preprocessing pipeline could open new avenues for automation and increased efficiency. For instance, AI-driven algorithms could be trained to detect and correct artifacts, adjust intensity values based on contextual information, or even predict imaging errors using scanner metadata. Such advancements would further streamline the workflow and minimize the need for manual interventions.
In this investigation, we conducted a comprehensive analysis of 10 patient datasets, focusing on the anatomical structures of the left breast, right breast, liver, and spleen. The preprocessing pipeline demonstrated efficient performance, with a mean processing time of 72.78 seconds per patient (n=10). This duration encompassed the complete preprocessing workflow for both CT and RT structures, including their conversion into NIfTI, STL, and PLY formats. Our systematic approach ensured the thorough preparation of all relevant anatomical data for subsequent applications, including AI analysis, 3D printing, and Metaverse integration.
The future of medical imaging lies in the convergence of AI, virtual reality, and 3D printing technologies, with DICOM data serving as the linchpin that binds these technologies together. As the integration of these fields progresses, new applications and opportunities will emerge.
In our study, a primary focus was converting DICOM files into 3D model formats, specifically STL and PLY, which are crucial for both 3D printing and Metaverse environments. Raw DICOM files require preprocessing and conversion into these 3D model formats to be compatible with practical applications in virtual environments and 3D printing platforms. By converting anatomical data from DICOM images into STL and PLY formats, we enable seamless integration with 3D visualization tools. Once created in these formats, the 3D models can be used in the Metaverse for immersive virtual experiences, allowing users to interact with and explore anatomical structures, thereby enhancing educational and clinical training applications. Likewise, these 3D models can be directly used in 3D printing to produce physical replicas for surgical planning, patient education, and prosthetics.
One potential area of research that we aim to explore further is the development of real-time preprocessing algorithms that process DICOM data on the fly, facilitating instant feedback during diagnostic imaging or surgery. Real-time applications would greatly enhance the practicality of AI models and virtual simulations in the clinical environment, allowing for immediate adjustments and decision-making.
Additionally, augmented reality (AR) presents a promising avenue when integrated with the Metaverse. While the Metaverse offers a fully immersive experience, AR could overlay 3D models or medical data directly onto the patient during procedures, providing surgeons with a hybrid view that combines the real and virtual worlds. This integration could further elevate the precision of medical interventions and broaden the real-time application of AI-powered imaging in clinical scenarios. We plan to expand on these concepts in future work, providing specific technical details and examples of how 3D models in the Metaverse can enhance user interaction, visualization, and diagnostic accuracy.
The presented preprocessing pipeline effectively tackles the challenges in multimodal medical imaging, enabling the integration of AI, the Metaverse, and 3D printing in clinical practice. Future research should explore real-time preprocessing algorithms for live imaging scenarios, aiming to improve surgical accuracy and decision-making. Additionally, integrating augmented reality with Metaverse technologies could enhance the visualization of patient anatomy during procedures. Collaborating with clinical institutions to validate the pipeline across diverse patient populations is essential to ensure its adaptability and effectiveness in various clinical settings.
This research was supported by a Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government (MOTIE) (20227410100050), by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (grant number NRF-2022R1H1A2092091), and by the Technology Development Program (RS-2023-00257618) funded by the Ministry of SMEs and Startups (MSS, Korea). This study was also supported by a grant from the SME R&D project for the Start-up & Grow stage company, Ministry of SMEs and Startups (RS-2024-00426787).
The authors have nothing to disclose.
All relevant data are within the paper and its Supporting Information files.
Conceptualization: Dong Hyeok Choi, Jin Sung Kim, So Hyun Ahn. Formal analysis: Dong Hyeok Choi. Resources: Jin Sung Kim, So Hyun Ahn. Writing – original draft: Dong Hyeok Choi. Writing – review & editing: Dong Hyeok Choi, Jin Sung Kim, So Hyun Ahn.
The study was approved by the Institutional Review Board of Ewha Womans University Mokdong Hospital (IRB approval number: 2023-07-001-002).