ABSTRACT
BACKGROUND AND OBJECTIVE: Computerized pathology image analysis is an important tool in research and clinical settings that enables quantitative tissue characterization and can assist a pathologist's evaluation. The aim of our study is to systematically quantify and minimize uncertainty in the output of computer-based pathology image analysis. METHODS: Uncertainty quantification (UQ) and sensitivity analysis (SA) methods, such as Variance-Based Decomposition (VBD) and Morris One-At-a-Time (MOAT), are employed to track and quantify uncertainty in a real-world application with large Whole Slide Imaging datasets of 943 Breast Invasive Carcinoma (BRCA) and 381 Lung Squamous Cell Carcinoma (LUSC) patients. Because these studies are compute-intensive, high-performance computing systems and efficient UQ/SA methods were combined to enable efficient execution. UQ/SA highlighted the application parameters that most strongly impact the results, as well as the nuclear features that carry most of the uncertainty. Using this information, we built a method for selecting stable features that minimize application output uncertainty. RESULTS: The results show that input parameter variations significantly impact all stages (segmentation, feature computation, and survival analysis) of the use case application. We then identified and classified features according to their robustness to parameter variation; using the proposed feature selection strategy, for instance, patient grouping stability in survival analysis improved by 17% and 34% for BRCA and LUSC, respectively. CONCLUSIONS: This strategy created more robust analyses, demonstrating that SA and UQ are important methods that may increase confidence in digital pathology.
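To illustrate the kind of parameter screening described in this abstract, the following is a minimal sketch of Morris One-At-a-Time (MOAT) screening using the SALib library. The parameter names, their ranges, and the stand-in pipeline function are hypothetical placeholders and are not taken from the study; in the actual application each evaluation would be a full segmentation and feature-computation run on whole slide images.

```python
# Minimal MOAT screening sketch with SALib (illustrative; parameters are hypothetical).
import numpy as np
from SALib.sample.morris import sample as morris_sample
from SALib.analyze.morris import analyze as morris_analyze

# Hypothetical segmentation parameters and plausible ranges (assumptions).
problem = {
    "num_vars": 3,
    "names": ["otsu_ratio", "min_nucleus_size", "watershed_threshold"],
    "bounds": [[0.3, 1.3], [5.0, 40.0], [0.0, 1.0]],
}

def run_pipeline(params):
    """Stand-in for one pipeline run; returns a scalar output of interest
    (e.g., mean nuclear area). Replace with the real application call."""
    x1, x2, x3 = params
    return np.sin(x1) + 0.1 * x2 + x1 * x3  # toy response surface

# Generate MOAT trajectories and evaluate the model at each sample point.
X = morris_sample(problem, N=20, num_levels=4)
Y = np.array([run_pipeline(row) for row in X])

# mu* ranks parameters by overall influence; sigma flags non-linear or
# interaction effects.
Si = morris_analyze(problem, X, Y, num_levels=4)
for name, mu_star, sigma in zip(problem["names"], Si["mu_star"], Si["sigma"]):
    print(f"{name}: mu* = {mu_star:.3f}, sigma = {sigma:.3f}")
```

Parameters with both low mu* and low sigma would be candidates to fix at default values, while the remaining ones would be carried into the more expensive variance-based analysis.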
Subject(s)
Image Processing, Computer-Assisted; Humans; Uncertainty
ABSTRACT
Digital pathology imaging enables valuable quantitative characterizations of tissue state at the sub-cellular level. While there is a growing set of methods for analysis of whole slide tissue images, many of them are sensitive to changes in input parameters. Evaluating how analysis results are affected by variations in input parameters is important for the development of robust methods. Executing algorithm sensitivity analyses by systematically varying input parameters is an expensive task, because a single evaluation run with a moderate number of tissue images may take hours or days. Our work investigates the use of Surrogate Models (SMs) along with parallel execution to speed up parameter sensitivity analysis (SA). This approach significantly reduces the SA cost, because the SM execution is inexpensive. The evaluation of several SM strategies with two image segmentation workflows demonstrates that an SA study with SMs attains results close to an SA with real application runs (mean absolute error lower than 0.022), while the SM accelerates the SA execution by 51×. We also show that, although the number of parameters in the example workflows is high, most of the uncertainty can be associated with a few parameters. To identify the impact of variations in segmentation results on downstream analyses, we carried out a survival analysis with 387 Lung Squamous Cell Carcinoma cases. This analysis was repeated using three values for the most significant parameters identified by the SA for the two segmentation algorithms; about 600 million cell nuclei were segmented per run. The results show that the significance of the survival correlations of patient groups, assessed by a logrank test, is strongly affected by the segmentation parameter changes. This indicates that sensitivity analysis is an important tool for evaluating the stability of conclusions from image analyses.
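The surrogate-model idea described above can be sketched as follows: a modest number of expensive pipeline runs train a cheap regression surrogate, and the variance-based (Sobol) analysis is then performed on the surrogate instead of the real application. The sketch below uses scikit-learn and SALib as stand-ins; the parameter names and the toy "expensive" function are hypothetical, and the specific surrogate and SA libraries used in the study may differ.

```python
# Surrogate-assisted variance-based SA sketch (illustrative assumptions throughout).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 2,
    "names": ["blue_threshold", "min_nucleus_size"],  # hypothetical parameters
    "bounds": [[0.0, 1.0], [5.0, 40.0]],
}

def expensive_pipeline(params):
    """Placeholder for a full segmentation + feature-computation run."""
    x1, x2 = params
    return np.exp(-x1) * np.sqrt(x2)

# 1) Train the surrogate on a small number of real (expensive) runs.
rng = np.random.default_rng(0)
lows = [b[0] for b in problem["bounds"]]
highs = [b[1] for b in problem["bounds"]]
X_train = rng.uniform(lows, highs, size=(64, problem["num_vars"]))
y_train = np.array([expensive_pipeline(x) for x in X_train])
surrogate = GaussianProcessRegressor().fit(X_train, y_train)

# 2) Run the variance-based SA with many samples, evaluated on the cheap surrogate.
X_sa = saltelli.sample(problem, 1024)
Y_sa = surrogate.predict(X_sa)
Si = sobol.analyze(problem, Y_sa)
print(dict(zip(problem["names"], Si["S1"])))  # first-order Sobol indices
```

Because only the training runs touch the real application, the bulk of the SA samples cost essentially nothing, which is the source of the reported speedup.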
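The downstream survival check can likewise be sketched with a logrank test comparing two feature-derived patient groups, here using the lifelines library. The data, group labels, and column names below are invented for illustration and are not from the study's cohort.

```python
# Logrank comparison of two hypothetical patient groups (illustrative data).
import pandas as pd
from lifelines.statistics import logrank_test

# Hypothetical per-patient table: follow-up time in days, event indicator
# (1 = death observed, 0 = censored), and a feature-derived group label.
df = pd.DataFrame({
    "time":  [310, 455, 1010, 210, 883, 1022, 150, 730],
    "event": [1,   1,   0,    1,   0,   0,    1,   1],
    "group": ["A", "A", "A",  "A", "B", "B",  "B", "B"],
})

a, b = df[df.group == "A"], df[df.group == "B"]
result = logrank_test(a.time, b.time,
                      event_observed_A=a.event, event_observed_B=b.event)

# A p-value that crosses the significance threshold when segmentation
# parameters change is the kind of instability reported above.
print(f"logrank p-value: {result.p_value:.4f}")
```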