Objective. Tonic–clonic seizures (TCSs) are a significant risk factor for sudden unexpected death in epilepsy, so accurate detection is essential for effective long-term monitoring. Previous studies have shown that multimodal seizure detection systems can reliably detect TCSs over extended periods. However, the effectiveness of these data-driven systems depends heavily on the availability of reliable training data. Approach. To address this need, we propose a data selection method that identifies high-quality training samples. The method scores sample quality by learning difficulty, treating samples that are easier to learn as higher quality. We then introduce a confidence-based method to quantify the proportion of high-quality samples within the dataset. Main results. Experimental results show that our method improves the performance of a state-of-the-art TCS detection model by 11%. Significance. Using this data selection method, we develop a training pipeline that improves the training of multimodal seizure detection models.