Search | BVS Regional Portal

Challenges in multi-centric generalization: phase and step recognition in Roux-en-Y gastric bypass surgery.

Lavanchy, Joël L; Ramesh, Sanat; Dall'Alba, Diego; Gonzalez, Cristians; Fiorini, Paolo; Müller-Stich, Beat P; Nett, Philipp C; Marescaux, Jacques; Mutter, Didier; Padoy, Nicolas.

Int J Comput Assist Radiol Surg ; 2024 May 18.

Article in English | MEDLINE | ID: mdl-38761319

ABSTRACT

PURPOSE: Most studies on surgical activity recognition using artificial intelligence (AI) have focused on recognizing one type of activity from small, mono-centric surgical video datasets. It remains speculative whether those models would generalize to other centers. METHODS: In this work, we introduce a large multi-centric, multi-activity dataset consisting of 140 surgical videos (MultiBypass140) of laparoscopic Roux-en-Y gastric bypass (LRYGB) surgeries performed at two medical centers: the University Hospital of Strasbourg, France (StrasBypass70) and Inselspital, Bern University Hospital, Switzerland (BernBypass70). The dataset has been fully annotated with phases and steps by two board-certified surgeons. Furthermore, we assess generalizability and benchmark different deep learning models for the task of phase and step recognition in 7 experimental studies: (1) training and evaluation on BernBypass70; (2) training and evaluation on StrasBypass70; (3) training and evaluation on the joint MultiBypass140 dataset; (4) training on BernBypass70, evaluation on StrasBypass70; (5) training on StrasBypass70, evaluation on BernBypass70; (6) training on MultiBypass140, evaluation on BernBypass70; and (7) training on MultiBypass140, evaluation on StrasBypass70. RESULTS: Model performance is markedly influenced by the training data. The worst results were obtained in experiments (4) and (5), confirming the limited generalization capabilities of models trained on mono-centric data. The use of multi-centric training data in experiments (6) and (7) improves the generalization capabilities of the models, bringing them beyond the level of independent mono-centric training and validation (experiments (1) and (2)). CONCLUSION: MultiBypass140 shows considerable variation in surgical technique and workflow of LRYGB procedures between centers. Consequently, the generalization experiments show a marked difference in model performance. These results highlight the importance of multi-centric datasets for AI model generalization to account for variance in surgical technique and workflows. The dataset and code are publicly available at https://github.com/CAMMA-public/MultiBypass140.
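The seven train/evaluate pairings listed in the abstract can be written down as a small table. The following is only a sketch for keeping the numbering straight: the class, constant, and function names are hypothetical and are not taken from the MultiBypass140 repository.

```python
from typing import NamedTuple

# Dataset names come from the abstract; everything else here is illustrative.
BERN = "BernBypass70"
STRAS = "StrasBypass70"
MULTI = "MultiBypass140"  # joint dataset covering both centers


class Experiment(NamedTuple):
    number: int
    train: str
    evaluate: str


EXPERIMENTS = [
    Experiment(1, BERN, BERN),
    Experiment(2, STRAS, STRAS),
    Experiment(3, MULTI, MULTI),
    Experiment(4, BERN, STRAS),
    Experiment(5, STRAS, BERN),
    Experiment(6, MULTI, BERN),
    Experiment(7, MULTI, STRAS),
]


def is_cross_center(exp: Experiment) -> bool:
    """True when the evaluation center was never seen during training."""
    uses_joint_data = MULTI in (exp.train, exp.evaluate)
    return not uses_joint_data and exp.train != exp.evaluate


# Experiments that test generalization to an unseen center.
cross_center = [e.number for e in EXPERIMENTS if is_cross_center(e)]
```

Under this encoding, `cross_center` selects experiments (4) and (5), the two settings the abstract reports as performing worst.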

Dissecting self-supervised learning methods for surgical computer vision.

Ramesh, Sanat; Srivastav, Vinkle; Alapatt, Deepak; Yu, Tong; Murali, Aditya; Sestini, Luca; Nwoye, Chinedu Innocent; Hamoud, Idris; Sharma, Saurav; Fleurentin, Antoine; Exarchakis, Georgios; Karargyris, Alexandros; Padoy, Nicolas.

Med Image Anal ; 88: 102844, 2023 08.

Article in English | MEDLINE | ID: mdl-37270898

ABSTRACT

The field of surgical computer vision has undergone considerable breakthroughs in recent years with the rising popularity of deep neural network-based methods. However, standard fully-supervised approaches for training such models require vast amounts of annotated data, imposing a prohibitively high cost, especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun to gain traction in the general computer vision community, represent a potential solution to these annotation costs, allowing useful representations to be learned from unlabeled data alone. Still, the effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. In this work, we address this critical need by investigating four state-of-the-art SSL methods (MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection. We examine their parameterization, then their behavior with respect to training data quantities in semi-supervised settings. Correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL (up to 7.4% on phase recognition and 20% on tool presence detection) and over state-of-the-art semi-supervised phase recognition approaches (by up to 14%). Further results obtained on a highly diverse selection of surgical datasets exhibit strong generalization properties. The code is available at https://github.com/CAMMA-public/SelfSupSurg.
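As background on what learning from unlabeled data can look like, the sketch below implements the normalized temperature-scaled cross-entropy (NT-Xent) objective used by SimCLR, one of the four methods named above. It is a plain-Python illustration and not code from the SelfSupSurg repository; the function names and the temperature value are assumptions chosen for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss; view_a[i] and view_b[i] embed two augmentations
    of the same image (a positive pair). The temperature is illustrative."""
    z = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of sample i
        # Similarities to every other embedding, excluding i itself.
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - cosine(z[i], z[positive]) / temperature
    return total / (2 * n)


# Two toy "images" with 2-D embeddings: matching views vs. swapped views.
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is lower when the two views of each sample embed close together (`aligned`) than when they do not (`swapped`), which is the signal that stands in for manual labels.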

Subject(s)

Computers; Neural Networks, Computer; Humans; Supervised Machine Learning

Weakly Supervised Temporal Convolutional Networks for Fine-Grained Surgical Activity Recognition.

Ramesh, Sanat; Dall'Alba, Diego; Gonzalez, Cristians; Yu, Tong; Mascagni, Pietro; Mutter, Didier; Marescaux, Jacques; Fiorini, Paolo; Padoy, Nicolas.

IEEE Trans Med Imaging ; 42(9): 2592-2602, 2023 09.

Article in English | MEDLINE | ID: mdl-37030859

ABSTRACT

Automatic recognition of fine-grained surgical activities, called steps, is a challenging but crucial task for intelligent intra-operative computer assistance. The development of current vision-based activity recognition methods relies heavily on a high volume of manually annotated data. These data are difficult and time-consuming to generate and require domain-specific knowledge. In this work, we propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition with fewer step-annotated videos. We introduce a step-phase dependency loss to exploit the weak supervision signal. We then employ a Single-Stage Temporal Convolutional Network (SS-TCN) with a ResNet-50 backbone, trained in an end-to-end fashion from weakly annotated videos, for temporal activity segmentation and recognition. We extensively evaluate and show the effectiveness of the proposed method on a large video dataset consisting of 40 laparoscopic gastric bypass procedures and on the public benchmark CATARACTS, containing 50 cataract surgeries.
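The abstract does not give the form of the step-phase dependency loss. One plausible way a phase label could supervise step predictions is sketched below. It assumes each phase admits a known subset of steps and penalizes probability mass placed on steps outside that subset. The hierarchy, names, and formula are all hypothetical and should not be read as the paper's definition.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical hierarchy: phase index -> step indices that may occur in it.
STEPS_IN_PHASE = {0: {0, 1}, 1: {2, 3, 4}}


def dependency_loss(step_logits, phase_label, steps_in_phase=STEPS_IN_PHASE):
    """Negative log of the step probability mass that is consistent with
    the annotated phase (assumed formulation, for illustration only)."""
    probs = softmax(step_logits)
    allowed_mass = sum(probs[s] for s in steps_in_phase[phase_label])
    return -math.log(max(allowed_mass, 1e-12))


# A frame whose step prediction strongly favors step 0.
confident_step0 = [5.0, 0.0, 0.0, 0.0, 0.0]
loss_consistent = dependency_loss(confident_step0, phase_label=0)
loss_conflicting = dependency_loss(confident_step0, phase_label=1)
```

In this sketch a frame needs only a phase label, not a step label, to contribute a training signal: `loss_conflicting` is much larger than `loss_consistent` because step 0 cannot occur in phase 1 under the assumed hierarchy.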

Subject(s)

Neural Networks, Computer; Surgery, Computer-Assisted

TRandAugment: temporal random augmentation strategy for surgical activity recognition from videos.

Ramesh, Sanat; Dall'Alba, Diego; Gonzalez, Cristians; Yu, Tong; Mascagni, Pietro; Mutter, Didier; Marescaux, Jacques; Fiorini, Paolo; Padoy, Nicolas.

Int J Comput Assist Radiol Surg ; 18(9): 1665-1672, 2023 Sep.

Article in English | MEDLINE | ID: mdl-36944845

ABSTRACT

PURPOSE: Automatic recognition of surgical activities from intraoperative surgical videos is crucial for developing intelligent support systems for computer-assisted interventions. Current state-of-the-art recognition methods are based on deep learning, where data augmentation has shown the potential to improve generalization. This has spurred work on automated and simplified augmentation strategies for image classification and object detection on datasets of still images. Extending such augmentation methods to videos is not straightforward, as the temporal dimension needs to be considered. Furthermore, surgical videos pose additional challenges as they are composed of multiple, interconnected, and long-duration activities. METHODS: This work proposes a new simplified augmentation method, called TRandAugment, specifically designed for long surgical videos, that treats each video as an assembly of temporal segments and applies consistent but random transformations to each segment. The proposed augmentation method is used to train an end-to-end spatiotemporal model consisting of a CNN (ResNet50) followed by a TCN. RESULTS: The effectiveness of the proposed method is demonstrated on two surgical video datasets, namely Bypass40 and CATARACTS, and two tasks, surgical phase and step recognition. TRandAugment adds a performance boost of 1-6% over previous state-of-the-art methods that use manually designed augmentations. CONCLUSION: This work presents a simplified and automated augmentation method for long surgical videos. The proposed method has been validated on different datasets and tasks, indicating the importance of devising temporal augmentation methods for long surgical videos.
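The idea of transformations that are consistent within a temporal segment but random across segments can be sketched in a few lines. The example below is a toy illustration and not the TRandAugment implementation. Frames are nested lists of pixel intensities, and the three operations, the equal-length segmentation, and all parameter names are assumptions.

```python
import random


def hflip(frame, _magnitude):
    """Mirror each row of the frame."""
    return [row[::-1] for row in frame]


def brighten(frame, magnitude):
    """Add a constant to every pixel, clipped at 255."""
    return [[min(255, px + magnitude) for px in row] for row in frame]


def identity(frame, _magnitude):
    return frame


TRANSFORMS = [hflip, brighten, identity]  # illustrative operation pool


def segment_bounds(num_frames, num_segments):
    """Split range(num_frames) into contiguous, near-equal segments."""
    return [(k * num_frames // num_segments,
             (k + 1) * num_frames // num_segments)
            for k in range(num_segments)]


def augment_video(frames, num_segments, max_magnitude=30, rng=random):
    """Sample one (operation, magnitude) per segment and apply it to every
    frame of that segment, so the augmentation is temporally consistent."""
    out = []
    for start, end in segment_bounds(len(frames), num_segments):
        op = rng.choice(TRANSFORMS)
        magnitude = rng.randint(0, max_magnitude)
        out.extend(op(f, magnitude) for f in frames[start:end])
    return out


# Six identical 1x2 frames, augmented as two segments of three frames each.
video = [[[10, 20]] for _ in range(6)]
augmented = augment_video(video, num_segments=2, rng=random.Random(0))
```

Because the operation and its magnitude are drawn once per segment, all frames inside a segment change in the same way, while neighboring segments may receive different transformations.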

Subject(s)

Cataract Extraction; Neural Networks, Computer; Humans; Algorithms; Cataract Extraction/methods

Multi-task temporal convolutional networks for joint recognition of surgical phases and steps in gastric bypass procedures.

Ramesh, Sanat; Dall'Alba, Diego; Gonzalez, Cristians; Yu, Tong; Mascagni, Pietro; Mutter, Didier; Marescaux, Jacques; Fiorini, Paolo; Padoy, Nicolas.

Int J Comput Assist Radiol Surg ; 16(7): 1111-1119, 2021 Jul.

Article in English | MEDLINE | ID: mdl-34013464

ABSTRACT

PURPOSE: Automatic segmentation and classification of surgical activity is crucial for providing advanced support in computer-assisted interventions and autonomous functionalities in robot-assisted surgeries. Prior works have focused on recognizing either coarse activities, such as phases, or fine-grained activities, such as gestures. This work aims at jointly recognizing two complementary levels of granularity directly from videos, namely phases and steps. METHODS: We introduce two correlated surgical activities, phases and steps, for the laparoscopic gastric bypass procedure. We propose a multi-task multi-stage temporal convolutional network (MTMS-TCN) along with a multi-task convolutional neural network (CNN) training setup to jointly predict the phases and steps and benefit from their complementarity to better evaluate the execution of the procedure. We evaluate the proposed method on a large video dataset consisting of 40 surgical procedures (Bypass40). RESULTS: We present experimental results from several baseline models for both phase and step recognition on Bypass40. The proposed MTMS-TCN method outperforms single-task methods in both phase and step recognition by 1-2% in accuracy, precision and recall. Furthermore, for step recognition, MTMS-TCN outperforms LSTM-based models by 3-6% on all metrics. CONCLUSION: In this work, we present a multi-task multi-stage temporal convolutional network for surgical activity recognition, which shows improved results compared to single-task models on a gastric bypass dataset with multi-level annotations. The proposed method shows that joint modeling of phases and steps improves the overall recognition of each type of activity.
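Joint prediction of phases and steps is commonly trained by adding one classification loss per task. The sketch below shows that pattern for a single frame with two output heads. The abstract does not state how MTMS-TCN combines its losses, so the plain sum, the `step_weight` parameter, and the function names are assumptions.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one sample, computed via a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def multitask_loss(phase_logits, step_logits, phase_label, step_label,
                   step_weight=1.0):
    """Sum of the phase loss and the (optionally weighted) step loss.
    The weighting scheme is hypothetical."""
    phase_term = cross_entropy(phase_logits, phase_label)
    step_term = cross_entropy(step_logits, step_label)
    return phase_term + step_weight * step_term


# Uninformative predictions over 2 phases and 4 steps.
baseline = multitask_loss([0.0, 0.0], [0.0, 0.0, 0.0, 0.0],
                          phase_label=0, step_label=1)
# Same frame with confident, correct predictions on both heads.
confident = multitask_loss([4.0, 0.0], [0.0, 4.0, 0.0, 0.0],
                           phase_label=0, step_label=1)
```

With uniform logits the loss equals log 2 + log 4, and it drops as both heads become confident and correct, so the gradients from both tasks flow into whatever features the two heads share.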

Subject(s)

Gastric Bypass/methods; Laparoscopy/methods; Neural Networks, Computer; Robotic Surgical Procedures/methods; Humans
