Multi-modal MAML: Revisiting Feature Fusion for Discriminative Generalization and Class Distribution

4 Nov 2025, 13:05
5m
Balcony (Conference Centre)

Poster Session

Description

Class distribution methods determine how classes are allocated across the meta-training, meta-validation, and meta-test sets. They play a critical role in the generalization ability of meta-learning algorithms such as Prototypical Networks and Model-Agnostic Meta-Learning (MAML), particularly when these models are trained from scratch on small datasets. Focusing on MAML, we hypothesize that the model fails to learn class-discriminative features on small datasets, which limits its generalization performance. To address this limitation, we propose leveraging data fusion to produce more discriminative features and thereby improve the quality of the data.
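As a minimal sketch of the class-allocation step described above (the 60/15/25 split ratio and the function name split_classes are illustrative assumptions, not the protocol used in the paper):

# Illustrative only: one simple way to allocate classes to the meta-splits.
import random

def split_classes(class_names, seed=0, ratios=(0.6, 0.15, 0.25)):
    """Randomly allocate class labels to meta-train / meta-val / meta-test."""
    rng = random.Random(seed)
    shuffled = class_names[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return (shuffled[:n_train],                 # meta-training classes
            shuffled[n_train:n_train + n_val],  # meta-validation classes
            shuffled[n_train + n_val:])         # meta-test classes

How this allocation is performed (e.g., randomly versus by semantic similarity) changes which concepts are seen during meta-training and hence the difficulty of the meta-test tasks.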
This paper introduces a novel extension to MAML, referred to as Multi-modal MAML, that incorporates multi-modality techniques by integrating two data modalities: images and text. Although previous research has emphasized the challenges of training MAML on multi-modal data, our findings indicate that performance is significantly influenced by several key factors: tensor size, the textual feature extraction technique, and the type of fusion employed. We systematically investigate these factors through experiments with three tensor sizes, three textual feature extraction methods, and two fusion strategies (intermediate and late linear fusion) to evaluate how each combination affects MAML's ability to generalize. These experiments also test whether time complexity limits MAML's ability to learn more discriminative features. Finally, we assess whether the proposed Multi-modal MAML can mitigate the impact of class distribution.
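For readers unfamiliar with the two fusion strategies, the following PyTorch sketch contrasts them on pre-extracted image and text features. The feature dimensions, module names, and the concatenation-based form of intermediate fusion are assumptions for illustration; the paper's exact architecture may differ.

# Illustrative sketch: intermediate vs. late linear fusion of image/text features.
import torch
import torch.nn as nn

class IntermediateFusion(nn.Module):
    """Concatenate image and text features, then classify the fused representation."""
    def __init__(self, img_dim, txt_dim, n_way, hidden=128):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(img_dim + txt_dim, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, n_way)

    def forward(self, img_feat, txt_feat):
        fused = self.fuse(torch.cat([img_feat, txt_feat], dim=-1))
        return self.head(fused)

class LateLinearFusion(nn.Module):
    """Score each modality with its own linear head and average the class logits."""
    def __init__(self, img_dim, txt_dim, n_way):
        super().__init__()
        self.img_head = nn.Linear(img_dim, n_way)
        self.txt_head = nn.Linear(txt_dim, n_way)

    def forward(self, img_feat, txt_feat):
        return 0.5 * (self.img_head(img_feat) + self.txt_head(txt_feat))

In a MAML setup, either module would act as the base learner whose parameters are adapted in the inner loop; the inner/outer optimization is omitted here for brevity.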

Author

Zainab Almugbel (PhD Student)
