CAREER: Automated Multimodal Learning for Healthcare


Sponsoring Agency
National Science Foundation


Multimodal data fusion is a core task in machine learning and data mining. Although machine learning models, especially deep learning models, have shown superior predictive power on this task, most are manually designed based on experts’ knowledge and experience. Moreover, existing approaches reflect no consensus on the optimal way to integrate multimodal data. To address these issues, this CAREER proposal develops a new learning paradigm, automated multimodal learning, that will allow researchers to identify the optimal way to fuse multimodal data. We apply this paradigm to multimodal healthcare predictive modeling as a representative example of its advantages. We also propose to equip automated multimodal learning to handle the distinctive challenges of multimodal health data: variation in data size, noise, and missing modalities. Specifically, Aim 1 addresses the data size challenge. Building on our preliminary work, Aim 1.1 designs a foundational, general automated multimodal learning framework for the setting in which sufficient training data are available. This framework is a crucial building block of the whole project. Aim 1.2 proposes a knowledge-enhanced multi-source automated learning transfer model to handle small data. To tackle data noise, Aim 2 proposes a novel denoising automated multimodal learning framework that lets the model automatically remove noisy features, guided by the target task. Aim 3 addresses missing modalities by imputing them while simultaneously removing the noise that the imputation introduces. In Aim 4, we validate the proposed research on different multimodal fusion tasks in healthcare informatics and beyond, and gather feedback from experts to refine it.
Finally, our education and outreach plans are built around a set of activities that are tightly integrated with the proposed research. This project will shift the research direction of traditional multimodal learning and promote advances in the fields of machine learning, data mining, and healthcare informatics.