Create Image Datastore Containing Single and Multi-File DICOM Series
Datastores are a convenient way of working with and representing collections of data that are too large to fit in memory at one time, especially in deep learning workflows. Digital Imaging and Communications in Medicine (DICOM) is a standardized medical image file format that can store volumes and image series as a single file or multiple files in a folder. This example shows how to create an image datastore containing DICOM data stored as a mix of single files and multiple files.
Medical Imaging Toolbox™ provides objects and functions that simplify this workflow. To get started, see Create Training Data for 3-D Medical Image Semantic Segmentation (Medical Imaging Toolbox).
Download Data
Download the file MedicalVolumeDICOMData.zip from the MathWorks® website and unzip it into the current example directory. This file contains three chest CT volumes stored in the DICOM file format, with each volume stored as a directory of multiple files.
zipFile = matlab.internal.examples.downloadSupportFile("medical","MedicalVolumeDICOMData.zip");
unzip(zipFile,pwd)
Gather DICOM Information
In the DICOM file format, a series corresponds to one scan, such as one MRI or CT volume. The dicomCollection function analyzes the metadata of all DICOM files in a folder and returns a table in which each row represents one series. For multi-file DICOM volumes, the function aggregates the files into a single series.
Gather the details for the DICOM series in the current example directory, which includes the multi-file chest CT volumes and a multi-frame ultrasound series stored as a single DICOM file.
dicomDir = pwd;
collection = dicomCollection(dicomDir,IncludeSubfolders=true)
collection=4×14 table
StudyDateTime SeriesDateTime PatientName PatientSex Modality Rows Columns Channels Frames StudyDescription SeriesDescription StudyInstanceUID SeriesInstanceUID Filenames
____________________ ________________________ ____________ __________ ________ ____ _______ ________ ______ ____________________________ _________________ ________________________________________________________________ __________________________________________________________________ ______________________________
s1 14-Dec-2018 08:10:05 {[14-Dec-2018 08:14:20]} "" "M" "CT" 512 512 1 176 "CT CARDIAC CALCIUM SCORING" "LUNG 30%" "1.3.6.1.4.1.9590.100.1.2.1168571428410023523212533210897210257" "1.3.6.1.4.1.9590.100.1.2.238036032333239687321629621091791352864" {176×1 string }
s2 14-Dec-2018 08:10:05 {[14-Dec-2018 08:14:20]} "" "M" "CT" 512 512 1 88 "CT CARDIAC CALCIUM SCORING" "LUNG 2.5 30%" "1.3.6.1.4.1.9590.100.1.2.1168571428410023523212533210897210257" "1.3.6.1.4.1.9590.100.1.2.270370554009590424527057468890139369500" { 88×1 string }
s3 14-Dec-2018 08:10:05 {[14-Dec-2018 08:14:20]} "" "M" "CT" 512 512 1 88 "CT CARDIAC CALCIUM SCORING" "STANDARD 30%" "1.3.6.1.4.1.9590.100.1.2.1168571428410023523212533210897210257" "1.3.6.1.4.1.9590.100.1.2.322859592438691631224656102213691253752" { 88×1 string }
s4 30-Jan-1994 11:25:01 {0×0 double } "Anonymized" "" "US" 430 600 1 10 "Echocardiogram" "PS LAX MR & AI" "999.999.3859744" "999.999.94827453" {["C:\US-PAL-8-10x-echo.dcm"]}
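Because dicomCollection returns an ordinary table, standard table indexing can narrow the collection before any conversion. The following is a minimal sketch that uses a small hand-built table so that it runs without the downloaded data. The name mockCollection is hypothetical, and the table carries only two of the columns shown above; with the real collection, the same indexing expressions should apply.

```matlab
% Hypothetical stand-in for the dicomCollection output (two columns only).
Modality = ["CT";"CT";"CT";"US"];
Frames = [176;88;88;10];
mockCollection = table(Modality,Frames,RowNames=["s1";"s2";"s3";"s4"]);

% Keep only the CT series by logical row indexing.
ctSeries = mockCollection(mockCollection.Modality == "CT",:);

% Row names (s1, s2, ...) are available through the Row dimension name.
ctRowNames = ctSeries.Row;
disp(ctRowNames)
```

The Row shorthand is the same mechanism the conversion loop relies on when it passes collection.Row{idx} to dicomreadVolume.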
Create a temporary directory to store the processed DICOM volumes.
matFileDir = fullfile(pwd,"MATFiles");
if ~exist(matFileDir,"dir")
    mkdir(matFileDir)
end
Convert DICOM Volumes to MAT Files
Find all image volumes in the dicomCollection table, and save each volume as a MAT file.
First, loop through each series in the collection.
for idx = 1:size(collection,1)
For the current series, extract the DICOM filenames from the table. If the series contains a multi-file DICOM volume, the filenames are listed as a string array.
dicomFileName = collection.Filenames{idx};
Adapt the DICOM filenames to specify a single filename for the new MAT file.
if length(dicomFileName) > 1
    matFileName = fileparts(dicomFileName(1));
    matFileName = split(matFileName,filesep);
    matFileName = replace(strtrim(matFileName(end))," ","_");
else
    [~,matFileName] = fileparts(dicomFileName);
end
matFileName = fullfile(matFileDir,matFileName);
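To see what this naming logic produces, the following sketch applies the same steps outside the loop to two hypothetical file lists: a two-file series in a folder named "Lung Scan 01" and a single file named echo.dcm. The paths are invented for illustration and do not need to exist on disk, because only string operations are involved.

```matlab
% Hypothetical file lists standing in for collection.Filenames{idx}.
folder = fullfile("data","Lung Scan 01");
multiFile = folder + filesep + ["img001.dcm";"img002.dcm"];
singleFile = fullfile("data","echo.dcm");

lists = {multiFile,singleFile};
names = strings(1,numel(lists));
for k = 1:numel(lists)
    dicomFileName = lists{k};
    if length(dicomFileName) > 1
        % Multi-file series: use the parent folder name, spaces replaced.
        folderPath = fileparts(dicomFileName(1));
        parts = split(folderPath,filesep);
        names(k) = replace(strtrim(parts(end))," ","_");
    else
        % Single file: use the filename without its extension.
        [~,names(k)] = fileparts(dicomFileName);
    end
end
disp(names)
```

A multi-file series is therefore named after its containing folder, so two series stored in folders with the same name would map to the same MAT file.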
Read the image data in the current series. Try different read functions that handle multi-file and single-file DICOM volumes.
1) Try reading the data by using the dicomreadVolume function.
- If the data is a multi-file volume, then dicomreadVolume runs successfully and returns the complete volume in a single 4-D array. The four dimensions correspond to [rows,columns,samples,slices], where samples is the number of channels per voxel. You can add this data to the datastore and skip step 2.
- If the data is contained in a single file, then dicomreadVolume does not run successfully. Move to step 2.
2) Try reading the data by using the dicomread function.
- If the data is a complete volume, dicomread returns a 4-D array. You can add this data to the datastore.
- If the data is a single 2-D image, dicomread returns a 2-D matrix or 3-D array. Skip this series and continue to the next series in the collection.
try
    data = dicomreadVolume(collection,collection.Row{idx});
catch ME
    data = dicomread(dicomFileName);
    if ndims(data)<4
        % Skip files that are not volumes
        continue;
    end
end
If the current series is a volume, write the data and the corresponding DICOM filenames to a MAT file.
save(matFileName,"data","dicomFileName");
End the loop over the series in the collection.
end
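The ndims(data)<4 test in the loop above separates volumes from single images. The following sketch illustrates that check on synthetic arrays (not DICOM data) with the array shapes described earlier. One consequence worth noting is that MATLAB drops trailing singleton dimensions, so a hypothetical one-slice grayscale volume reports only two dimensions and would be skipped as well.

```matlab
% Synthetic arrays standing in for data returned by the read functions.
vol = zeros(8,8,1,5);          % [rows,columns,samples,slices]
grayImage = zeros(8,8);        % single 2-D grayscale image
rgbImage = zeros(8,8,3);       % single 2-D color image
singleSlice = zeros(8,8,1,1);  % trailing singleton dimensions are dropped

nd = [ndims(vol) ndims(grayImage) ndims(rgbImage) ndims(singleSlice)];

% Only arrays with at least four dimensions pass the volume check.
isVolume = nd >= 4;
disp(isVolume)
```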
Create Image Datastore
Create an imageDatastore from the MAT files containing the volumetric DICOM data. Specify the ReadFcn property as the helper function matRead, which is defined at the end of this example.
imdsdicom = imageDatastore(matFileDir,FileExtensions=".mat", ...
    ReadFcn=@matRead);
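The same pattern can be tried without any DICOM data. The following sketch writes two small synthetic volumes to a temporary folder, builds an image datastore over them with a custom read function, and reads every file in turn. The folder demoDir, the file names, and the anonymous read function are all assumptions made for illustration; the anonymous function returns the variable named data rather than the first variable in the file.

```matlab
% Write two synthetic 4-D volumes to a temporary folder.
demoDir = tempname;
mkdir(demoDir)
for k = 1:2
    data = k*ones(4,4,1,3);
    save(fullfile(demoDir,"vol" + k + ".mat"),"data")
end

% Datastore over the MAT files, with a custom read function.
imds = imageDatastore(demoDir,FileExtensions=".mat", ...
    ReadFcn=@(f) getfield(load(f),"data"));

% Read each volume and record its size.
sizes = zeros(numel(imds.Files),4);
k = 0;
while hasdata(imds)
    k = k + 1;
    V = read(imds);
    sizes(k,:) = size(V,1:4);
end
disp(sizes)

% Remove the temporary folder and its contents.
rmdir(demoDir,"s")
```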
Visually Check Results
Read the first DICOM volume from the image datastore.
[V,Vinfo] = read(imdsdicom);
[~,VFileName] = fileparts(Vinfo.Filename);
Remove the singleton channel dimension by using the squeeze function, then display the volume by using the volshow function.
V = squeeze(V);
volshow(V);
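As a quick illustration of the squeeze step, the following sketch uses synthetic arrays in place of the CT data. A grayscale volume with one sample per voxel becomes a 3-D array, while a hypothetical three-channel volume has no singleton dimension and keeps its size.

```matlab
% Grayscale volume: [rows,columns,samples,slices] with one sample per voxel.
V4 = zeros(8,8,1,5);
V3 = squeeze(V4);
disp(size(V3))

% Three-channel volume: no singleton dimension, so the size is unchanged.
C4 = zeros(8,8,3,5);
C4squeezed = squeeze(C4);
disp(size(C4squeezed))
```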
Supporting Functions
The matRead function loads data from the first variable of a MAT file with filename filename.
function data = matRead(filename)
% Load the MAT file and return its first variable.
inp = load(filename);
f = fieldnames(inp);
data = inp.(f{1});
end
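Each MAT file written by this example holds two variables, data and dicomFileName, and matRead returns whichever comes first in the loaded structure. The following sketch reproduces that situation with a synthetic array and a hypothetical filename, and compares the first-field approach with reading the data field by name. Reading by name is an alternative not used in this example; it avoids any dependence on the order of the variables.

```matlab
% Save two variables, as the conversion loop does.
data = rand(4,4,1,3);
dicomFileName = "example.dcm";   % hypothetical filename
tmpFile = [tempname '.mat'];
save(tmpFile,"data","dicomFileName");

% Load the file into a structure with one field per variable.
inp = load(tmpFile);
f = fieldnames(inp);
firstVar = inp.(f{1});   % the value that matRead returns
byName = inp.data;       % explicit alternative

% Remove the temporary file.
delete(tmpFile)
```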
See Also
dicominfo | dicomread | dicomreadVolume | dicomCollection | volshow | imageDatastore
Related Topics
- Create Image Datastore Containing DICOM Images
- Preprocess Volumes for Deep Learning (Deep Learning Toolbox)
- Create Datastores for Medical Image Semantic Segmentation (Medical Imaging Toolbox)