The VAMPIRE Challenge: A Multi-Institutional Validation Study of CT Ventilation Imaging.
Med Phys. 2018 Dec 21.
Authors: Kipritidis J, Tahir BA, Cazoulat G, Hofman MS, Siva S, Callahan J, Hardcastle N, Yamamoto T, Christensen GE, Reinhardt JM, Kadoya N, Patton TJ, Gerard SE, Duarte I, Archibald-Heeren B, Byrne M, Sims R, Ramsay S, Booth JT, Eslick E, Hegi-Johnson F, Woodruff HC, Ireland RH, Wild JM, Cai J, Bayouth J, Brock K, Keall PJ
Abstract
PURPOSE: CT ventilation imaging (CTVI) is being used to achieve functional avoidance lung cancer radiation therapy in three clinical trials (NCT02528942, NCT02308709, NCT02843568). To address the need for common CTVI validation tools, we have built the Ventilation And Medical Pulmonary Image Registration Evaluation (VAMPIRE) Dataset, and present the results of the first VAMPIRE Challenge to compare relative ventilation distributions between different CTVI algorithms and other established ventilation imaging modalities.
METHODS: The VAMPIRE Dataset includes 50 pairs of 4DCT scans and corresponding clinical or experimental ventilation scans, referred to as reference ventilation images (RefVIs). The dataset includes 25 humans imaged with Galligas 4DPET/CT, 21 humans imaged with DTPA-SPECT and 4 sheep imaged with Xenon-CT. For the VAMPIRE Challenge, 16 subjects were allocated to a training group (with RefVI provided) and 34 subjects were allocated to a validation group (with RefVI blinded). Seven research groups downloaded the Challenge dataset and uploaded CTVIs based on deformable image registration (DIR) between the 4DCT inhale/exhale phases. Participants used DIR methods broadly classified into B-splines, free-form, diffeomorphisms or biomechanical modeling, with CT ventilation metrics based on the DIR evaluation of volume change, Hounsfield Unit change, or various hybrid approaches. All CTVIs were evaluated against the corresponding RefVI using the voxel-wise Spearman coefficient rS and the Dice similarity coefficients evaluated for low function lung (DSClow) and high function lung (DSChigh).
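The evaluation metrics named above can be illustrated with a minimal sketch. It is not the Challenge's evaluation code. It assumes that the CTVI and RefVI have already been registered and flattened into equal-length lists of lung voxel values, and that "low function" and "high function" lung mean the lowest and highest fraction of voxels by value. The abstract does not state the thresholds, so the 0.25 fraction and all function names here are hypothetical.

```python
from typing import List, Sequence, Set


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rS: Pearson correlation of the rank-transformed values.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def extreme_voxels(values: Sequence[float], fraction: float, high: bool) -> Set[int]:
    """Indices of the highest (or lowest) `fraction` of voxels by value."""
    n_keep = max(1, round(fraction * len(values)))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=high)
    return set(order[:n_keep])


def dice(a: Set[int], b: Set[int]) -> float:
    """Dice similarity coefficient between two sets of voxel indices."""
    return 2.0 * len(a & b) / (len(a) + len(b))


def evaluate(ctvi: Sequence[float], refvi: Sequence[float], fraction: float = 0.25):
    """Return (rS, DSClow, DSChigh); `fraction` is an assumed threshold."""
    r_s = spearman(ctvi, refvi)
    dsc_low = dice(extreme_voxels(ctvi, fraction, high=False),
                   extreme_voxels(refvi, fraction, high=False))
    dsc_high = dice(extreme_voxels(ctvi, fraction, high=True),
                    extreme_voxels(refvi, fraction, high=True))
    return r_s, dsc_low, dsc_high
```

Because rS uses every voxel while the DSC values depend only on the tails of the two distributions, the same pair of images can score well on one metric and poorly on another.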
RESULTS: A total of 37 unique combinations of DIR method and CT ventilation metric were either submitted by participants directly or derived from participant-submitted DIR motion fields using the in-house software, VESPIR. The rS and DSC results reveal a high degree of inter-algorithm and inter-subject variability among the validation subjects, with algorithm rankings changing by up to 10 positions depending on the choice of evaluation metric. The algorithm with the highest overall cross-modality correlations used a biomechanical model based DIR with a hybrid ventilation metric, achieving a median (range) of 0.49 (0.27-0.73) for rS, 0.52 (0.36-0.67) for DSClow and 0.45 (0.28-0.62) for DSChigh. All other algorithms exhibited at least one negative rS value, and/or one DSC value less than 0.5.
CONCLUSIONS: The VAMPIRE Challenge results demonstrate that the cross-modality correlation between CTVIs and the RefVIs varies not only with the choice of CTVI algorithm, but also with the choice of RefVI modality, imaging subject, and the evaluation metric used to compare relative ventilation distributions. This variability may arise from the fact that each of the different CTVI algorithms and RefVI modalities provides a distinct physiologic measurement. Ultimately this variability, coupled with the lack of a 'gold standard,' highlights the ongoing importance of further validation studies before CTVI can be widely translated from academic centers to the clinic. It is hoped that the information gleaned from the VAMPIRE Challenge can help inform future validation efforts. This article is protected by copyright. All rights reserved.
PMID: 30575051 [PubMed - as supplied by publisher]