An application-dependent framework for the recognition of high-level surgical tasks in the OR

  • Authors:
  • Florent Lalys, Laurent Riffaud, David Bouget, Pierre Jannin

  • Affiliations:
  • Florent Lalys, David Bouget, Pierre Jannin: INSERM and INRIA, VisAGeS Unité/Projet and University of Rennes I, CNRS, UMR 6074, IRISA, Rennes, France
  • Laurent Riffaud: INSERM and INRIA, VisAGeS Unité/Projet and University of Rennes I, CNRS, UMR 6074, IRISA, and Department of Neurosurgery, Pontchaillou University Hospital, Rennes, France

  • Venue:
  • MICCAI'11: Proceedings of the 14th International Conference on Medical Image Computing and Computer-Assisted Intervention - Volume Part I
  • Year:
  • 2011

Abstract

Surgical process analysis and modeling is a recent and important topic aimed at introducing a new generation of computer-assisted surgical systems. Among the techniques already used to extract data from the operating room, video imaging makes it possible to assist surgeons automatically without altering the surgical routine. In this paper we propose an application-dependent framework that automatically extracts the phases of a surgery using only microscope videos as input data and that can be adapted to different surgical specialties. First, four distinct types of image-processing classifiers were implemented to extract visual cues from video frames, each dedicated to one kind of cue: cues recognizable by color were detected with a color-histogram approach; for shape-oriented cues we trained a Haar classifier; for texture-oriented cues we used a bag-of-words approach with SIFT descriptors; and for all other cues we used a classical image-classification pipeline combining feature extraction, feature selection, and supervised classification. The semantic vector extracted for each video frame was then used to classify the resulting time series with either Hidden Markov Models or Dynamic Time Warping. The framework was validated on cataract surgeries, obtaining accuracies of 95%.
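As a rough illustration of the first stage, the sketch below shows one way a color-based visual-cue detector could be built from HSV color histograms. The bin counts, the correlation threshold, and the `reference_hists` collection are illustrative assumptions, not the parameters or the exact detector described in the paper.

```python
import cv2

def hsv_histogram(frame, bins=(8, 8, 8)):
    """Normalized 3D HSV color histogram for one BGR video frame."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, list(bins),
                        [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def color_cue_present(frame, reference_hists, threshold=0.7):
    """Declare a color-based visual cue present when the frame histogram
    correlates strongly with any reference histogram learned for that cue.
    `reference_hists` and `threshold` are hypothetical placeholders."""
    h = hsv_histogram(frame)
    return any(cv2.compareHist(h, ref, cv2.HISTCMP_CORREL) >= threshold
               for ref in reference_hists)
```

A binary output like this, concatenated with the outputs of the shape, texture, and generic classifiers, would form the per-frame semantic vector mentioned in the abstract.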
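For the second stage, a minimal Dynamic Time Warping sketch is given below, assuming a nearest-neighbour matching of the query cue sequence against labelled reference surgeries with a frame-level Euclidean distance; the `labelled_references` structure and the matching scheme are assumptions for illustration, and the paper's actual DTW and HMM formulations may differ.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Dynamic Time Warping distance between two sequences of per-frame
    visual-cue vectors (each sequence is a list of NumPy arrays)."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # frame-level distance
            cost[i, j] = d + min(cost[i - 1, j],       # insertion
                                 cost[i, j - 1],       # deletion
                                 cost[i - 1, j - 1])   # match
    return cost[n, m]

def classify_surgery(query, labelled_references):
    """Assign the query video (sequence of cue vectors) the phase labels of
    its nearest reference surgery under DTW. Each reference is assumed to be
    a dict with 'cues' (cue-vector sequence) and 'phases' (phase labels)."""
    best = min(labelled_references,
               key=lambda ref: dtw_distance(query, ref["cues"]))
    return best["phases"]
```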