Surgical data using LLMs

Hugo Georgenthum
Friday 21 March 2025, 17:00-17:45
Lecture Theatre 2, Computer Laboratory, William Gates Building.

If you have a question about this talk, please contact Pietro Lio.

The automatic summarization of surgical videos is crucial for improving procedural documentation, surgical training, and post-operative analysis. This thesis presents a new method at the intersection of artificial intelligence and medicine, seeking to develop innovative machine-learning models with real-world applications in surgery. To this end, we propose a multi-modal approach that generates video summaries by drawing on recent advances in both computer vision and large language models. Specifically, the model processes surgical videos in the following three key steps. First, after the video is divided into clips, visual features are extracted at the frame level with vision transformers. The goal is to detect the tools, organs, tissues and actions performed by the surgeon. These visual features are then translated into frame captions using large language models. Second, at the video level, the emphasis is placed on temporal features. These are obtained with a ViViT-based encoder that takes as input both the clips and the frame captions extracted earlier. Analogously to the frame captions, the temporal features are converted into clip captions, which capture the overall context of each clip. Third, the clip descriptions are combined into a surgical report by an LLM specifically designed for this task. We train and evaluate our model on the CholecT50 dataset, using its frame-level instrument and action annotations across 50 laparoscopic videos. Experimental results demonstrate that our method produces coherent and contextually meaningful summaries, with 96% precision for tool detection and a BERTScore of 0.74 for temporal context extraction. This research contributes to the development of AI-assisted tools for surgical reporting and analysis.
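
The data flow described in the abstract (frames to frame captions, clips to clip captions, clip captions to a report) can be sketched roughly as below. This is only an illustrative skeleton, not the speaker's implementation: the names `Clip`, `split_into_clips` and `summarize_video`, the fixed non-overlapping clip length, and the three callables standing in for the vision transformer plus LLM, the ViViT-based encoder, and the report-writing LLM are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Placeholder type: a real pipeline would pass image tensors, not strings.
Frame = str


@dataclass
class Clip:
    """One clip with its frame-level captions and its clip-level caption."""
    frames: List[Frame]
    frame_captions: List[str]
    caption: str = ""


def split_into_clips(frames: Sequence[Frame], clip_len: int) -> List[List[Frame]]:
    """Divide the video into consecutive, non-overlapping clips (an assumption)."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [list(frames[i:i + clip_len]) for i in range(0, len(frames), clip_len)]


def summarize_video(
    frames: Sequence[Frame],
    clip_len: int,
    caption_frame: Callable[[Frame], str],
    caption_clip: Callable[[List[Frame], List[str]], str],
    write_report: Callable[[List[str]], str],
) -> str:
    """Run the three steps: frame captions, clip captions, final report."""
    clips: List[Clip] = []
    for clip_frames in split_into_clips(frames, clip_len):
        # Step 1: frame-level captions (hypothetical stand-in for ViT + LLM).
        frame_captions = [caption_frame(f) for f in clip_frames]
        clip = Clip(clip_frames, frame_captions)
        # Step 2: clip-level caption from the clip and its frame captions
        # (hypothetical stand-in for the ViViT-based encoder + LLM).
        clip.caption = caption_clip(clip.frames, clip.frame_captions)
        clips.append(clip)
    # Step 3: combine the clip captions into a single report
    # (hypothetical stand-in for the report-writing LLM).
    return write_report([c.caption for c in clips])
```

With toy stand-ins such as `caption_frame=str.upper`, the function can be exercised end to end without any model; in practice each callable would wrap a trained network.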

This talk is part of the Foundation AI series.
