Multi-view Echocardiography | Tison Lab @ UCSF

Multi-view Left-Heart — Planes Land and Stay

Multi-view Deep Learning Improves Detection of Major Cardiac Conditions from Echocardiography

Joshua P. Barrios, Minhaj U. Ansari, Jeffrey E. Olgin, Sean Abreau, Jacques Delfrate, Elodie L. Langlais, Robert Avram, Geoffrey H. Tison

Biomedical imaging captures complex three-dimensional (3D) anatomic structures with multiple two-dimensional (2D) views or projections. Across various medical imaging modalities, single views capture only partial information about 3D structures, requiring clinicians or scientists to mentally integrate complementary perspectives to form a coherent understanding of 3D structure, anatomic motion, and function.

Most existing deep learning models operate on a single input image or video at a time. They are not designed to jointly reason across multiple complementary views. Our work was motivated by the desire to enable deep learning models to explicitly integrate multiple complementary imaging views in a single predictive framework, mirroring how human experts interpret medical imaging studies.

A single echocardiogram study consists of many distinct image and video views of the heart

We developed a novel multi-view neural network architecture

We developed a multi-view deep neural network architecture designed to integrate multiple imaging videos (views) simultaneously. The framework extends video-based deep learning models by introducing dedicated cross-view convolutional blocks, allowing the neural network to learn both within-view and across-view spatiotemporal features and relationships.
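As a rough illustration of what an across-view feature can look like, the toy sketch below mixes per-view feature vectors with a small weight matrix along the view axis. This is equivalent to a convolution whose kernel spans all views at once. It is not the architecture from the paper: the function name `cross_view_mix`, the weight values, and the reduction of each view to a short feature vector are assumptions made purely for illustration.

```python
from typing import List

Vector = List[float]


def cross_view_mix(view_features: List[Vector], weights: List[List[float]]) -> List[Vector]:
    """Mix per-view features along the view axis.

    view_features[v][c]: channel c of view v (a hypothetical within-view output).
    weights[o][v]: contribution of input view v to output map o.
    Returns out[o][c] = sum over v of weights[o][v] * view_features[v][c].
    """
    n_views = len(view_features)
    n_channels = len(view_features[0])
    if any(len(f) != n_channels for f in view_features):
        raise ValueError("all views must have the same number of channels")
    if any(len(w) != n_views for w in weights):
        raise ValueError("each weight row needs one entry per view")
    return [
        [sum(w[v] * view_features[v][c] for v in range(n_views)) for c in range(n_channels)]
        for w in weights
    ]


# Two hypothetical views, each reduced to a 3-channel feature vector.
features = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]
# Illustrative weights: the first output averages the views,
# the second takes their difference.
weights = [[0.5, 0.5], [1.0, -1.0]]
mixed = cross_view_mix(features, weights)
```

In a trained network the weights would be learned, and the mixing would act on full spatiotemporal feature maps rather than short vectors. The point here is only that each output draws on every view at an intermediate stage.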

Multi-view architecture schematic (Figure 1)

We applied the multi-view neural network architecture to several demonstration tasks in cardiac ultrasound, also called echocardiography. Echocardiography is the most common cardiovascular imaging modality, and its interpretation regularly requires synthesizing complementary information across multiple views.

Each echocardiographic slice of the heart contains incomplete 2D information about the heart's 3D structure and function. By simultaneously accepting multiple views as input and integrating information across them, the multi-view neural network derives a more complete understanding of cardiac anatomy and function.

Our multi-view neural network approach applies broadly across biomedical imaging: whenever multiple imaging views capture distinct but related information about the same underlying structure or process, learning across views within a single neural network may improve performance over analyzing each view independently.

Application to echocardiography illustration

Image courtesy of Atif Qasim MD; www.echocardiographer.org

To illustrate the capabilities of the multi-view framework, we applied it to three representative diagnostic tasks in echocardiography:

Detection of ventricular abnormalities
Detection of diastolic dysfunction
Detection of valvular regurgitation

These composite demonstration tasks were selected because they require integrating information from multiple views and reflect a mix of structural, functional, and flow-related assessments.

For each task, we compared three modeling strategies:

A multi-view model that performs mid-fusion and convolutions across views
Single-view models trained on individual views
A late-fusion baseline consisting of the arithmetic average of the outputs from single-view models
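
The late-fusion baseline in the list above can be sketched in a few lines: each single-view model scores the study on its own, and the scores are averaged. The function name `late_fusion`, the view names, and the probabilities are placeholders, not values from the study.

```python
from statistics import fmean
from typing import Dict


def late_fusion(view_probs: Dict[str, float]) -> float:
    """Arithmetic mean of per-view predicted probabilities for one study."""
    if not view_probs:
        raise ValueError("need at least one single-view prediction")
    return fmean(view_probs.values())


# Hypothetical single-view outputs for one study (view names are placeholders).
study = {"view_a": 0.25, "view_b": 0.5, "view_c": 0.75}
fused = late_fusion(study)
```

Because the averaging happens after each model has already produced its output, no model sees another view's features. That is the key difference from mid-fusion.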

Across all three tasks, the multi-view framework demonstrated improved discrimination compared to single-view models. Performance gains also exceeded those achieved by late fusion, indicating that jointly learning across views provides additional benefit beyond ensembling independent predictions.
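
Discrimination is commonly summarized by the area under the ROC curve (AUROC), although this page does not state which metric was used. Under that assumption, the sketch below computes AUROC directly from its pairwise definition. The labels and scores are made up and do not represent study results.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: four studies scored by two hypothetical models.
labels = [1, 1, 0, 0]
scores_a = [0.9, 0.4, 0.5, 0.1]
scores_b = [0.9, 0.6, 0.5, 0.1]
auc_a = auroc(labels, scores_a)
auc_b = auroc(labels, scores_b)
```

The pairwise form is quadratic in the number of studies, so it suits small examples. A rank-based implementation would be the usual choice at scale.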

Importantly, these results suggest that explicitly modeling relationships between views can improve performance for tasks where diagnostically relevant information is captured across multiple distinct imaging views. While performance improvements provided by multi-view neural networks will likely depend on the target task, data availability, and view selection, these results support multi-view learning as a beneficial strategy for biomedical imaging applications.

Multi-view vs single-view model performance at UCSF

To assess generalizability, we evaluated the trained multi-view models on an external dataset acquired at the Montreal Heart Institute (MHI), Canada, using different equipment and measurement conventions. Despite these differences, performance was largely preserved.

These results suggest that our multi-view approach can generalize to an external hospital system with different demographics, acquisition devices, and practice patterns.

Multi-view vs single-view model performance at MHI

We used Grad-CAM and guided Grad-CAM techniques to visualize regions of the input videos that most strongly influenced model predictions. Across tasks and views, highlighted regions corresponded to anatomically and physiologically relevant structures, suggesting that the model learns meaningful features rather than relying on spurious cues.
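
For readers unfamiliar with Grad-CAM, the core computation is small: average each channel's gradient map to get a channel weight, take the weighted sum of the feature maps, and keep only the positive part. The sketch below shows this on 2D maps in plain Python. The name `grad_cam` and the toy inputs are assumptions. In practice the gradients come from backpropagation through the trained network, and video models add a time dimension.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Grad-CAM heatmap for one layer.

    activations[k]: feature map of channel k.
    gradients[k]: gradient of the target score with respect to activations[k].
    """
    # Channel weights: spatial average of each channel's gradient map.
    alphas = [
        sum(sum(row) for row in grad) / (len(grad) * len(grad[0]))
        for grad in gradients
    ]
    height, width = len(activations[0]), len(activations[0][0])
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    return [
        [
            max(0.0, sum(a * act[i][j] for a, act in zip(alphas, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]


# Two hypothetical 2x2 feature maps and their (made-up) gradients.
acts = [[[1.0, 2.0], [0.0, 1.0]], [[1.0, 0.0], [2.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, 0.0], [0.0, -1.0]]]
heatmap = grad_cam(acts, grads)
```

The resulting map is typically upsampled to the input resolution and overlaid on the frame. Guided Grad-CAM additionally multiplies it with a guided-backpropagation map, which is not shown here.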

This work demonstrates that a multi-view neural network architecture can integrate complementary video inputs to analyze complex three-dimensional structures, improving performance across a range of tasks and providing a general multi-view framework for other biomedical imaging applications.

Multi-view analysis may improve standardization

Multi-view analysis of biomedical imaging videos may improve standardization and consistency in settings where diagnostically relevant information is distributed across multiple views or projections.

Architecture and adaptability

The proposed multi-view neural network approach improves performance on representative tasks and can be extended to other imaging modalities, tasks, and clinical domains.

Clinical Deployment

Multi-view neural network architectures expand the capabilities of AI in biomedical imaging, laying the foundation to achieve augmented diagnosis and assisted decision support in diagnostic imaging.
