2025 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN
Abstract
Supervised machine learning algorithms usually require sufficient labeled data to perform well. However, obtaining labels can be challenging due to monetary and time constraints. As a possible solution, recent works have proposed combining active and semi-supervised learning techniques. Active semi-supervised learning investigates methods to efficiently construct predictive models by incorporating unlabeled data, which is either labeled by a domain expert or pseudolabeled by a model. Although it has been studied for other problems, to the best of our knowledge, active semi-supervised learning has not been applied to multi-target regression, a predictive task in which multiple continuous targets must be predicted. In this work, we investigate active semi-supervised learning for multi-target regression. More specifically, we propose MASSTER (Multi-target Active Semi-Supervised Training for Regression), a novel ensemble method that identifies the most relevant instance-target pairs based on the variance in their predictions. Experiments on 8 benchmark datasets reveal that the active learning component of our method provides superior results in most cases when compared to the current state-of-the-art active learning method for multi-target regression. Further, as its semi-supervised component, our method incorporates variations of both self-learning (MASSTER-SL) and co-training (MASSTER-CT). Both variants achieved better metrics than the MASSTER-AL version in earlier epochs, when less labeled data is available, especially on smaller datasets.
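To make the idea of variance-based selection of instance-target pairs concrete, the following is a minimal sketch and not the MASSTER implementation. The abstract only states that an ensemble is used and that pairs are ranked by the variance in their predictions. Everything else here is an assumption made for illustration: the bootstrap ensemble of least-squares linear models, the helper names (`fit_linear`, `predict_linear`, `pair_variances`, `select_pairs`), and the synthetic data.

```python
import numpy as np


def fit_linear(X, Y):
    """Least-squares fit with a bias column; returns weights of shape (d + 1, t)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W


def predict_linear(W, X):
    """Predict all targets at once; returns an array of shape (n, t)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W


def pair_variances(X_lab, Y_lab, X_pool, n_members=10, seed=0):
    """Variance of ensemble predictions for every (instance, target) pair.

    Assumption: the ensemble is built from bootstrap resamples of the labeled
    set. The source does not specify how ensemble members are constructed.
    """
    rng = np.random.default_rng(seed)
    n = X_lab.shape[0]
    preds = []
    for _ in range(n_members):
        idx = rng.integers(0, n, size=n)  # bootstrap sample with replacement
        W = fit_linear(X_lab[idx], Y_lab[idx])
        preds.append(predict_linear(W, X_pool))
    # Stack to (n_members, n_pool, n_targets), then take variance over members.
    return np.var(np.stack(preds), axis=0)


def select_pairs(variances, k):
    """Return the k (instance, target) index pairs with the highest variance."""
    flat = np.argsort(variances, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat, variances.shape)
    return list(zip(rows.tolist(), cols.tolist()))


if __name__ == "__main__":
    # Synthetic data, purely illustrative: 30 labeled rows, 50 pool rows,
    # 4 features, 3 continuous targets.
    rng = np.random.default_rng(42)
    X_lab = rng.normal(size=(30, 4))
    Y_lab = X_lab @ rng.normal(size=(4, 3)) + 0.1 * rng.normal(size=(30, 3))
    X_pool = rng.normal(size=(50, 4))

    variances = pair_variances(X_lab, Y_lab, X_pool)
    # Pairs an oracle would be asked to label under this sketch.
    print(select_pairs(variances, k=5))
```

Because selection operates on pairs rather than whole instances, a query can request a single target value for an instance instead of its full target vector. Under the same assumptions, the low-variance end of the ranking would be a natural candidate pool for pseudolabeling, although the abstract does not state how MASSTER-SL and MASSTER-CT choose their pseudolabels.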