scutr

From MaRDI portal
Software:93101



CRAN: scutr, MaRDI QID: Q93101

Balancing Multiclass Datasets for Classification Tasks

Keenan Ganz

Last update: 17 November 2023

Copyright license: MIT license, File License

Software version identifier: 0.1.2, 0.2.0

Imbalanced training datasets impede many popular classifiers. A useful way to balance training data is to combine oversampling of minority classes with undersampling of majority classes. This package implements the SCUT (SMOTE and Cluster-based Undersampling Technique) algorithm described in Agrawal et al. (2015) <doi:10.5220/0005595502260234>. The paper uses model-based clustering and synthetic oversampling to balance multiclass training datasets; the package also provides other resampling methods.
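
The combined over- and undersampling idea can be illustrated with a minimal sketch. This is not the scutr API and not the authors' implementation: the function name `balance_classes`, its parameters, and the toy data are hypothetical. It assumes every class is resampled to roughly the total row count divided by the number of classes, uses a simplified SMOTE-style interpolation for minority classes, and substitutes plain random sampling for the cluster-based undersampling step.

```python
import math
import random
from collections import defaultdict


def balance_classes(rows, labels, k=3, seed=0):
    """Resample so each class holds about len(rows) / n_classes rows.

    Hypothetical helper for illustration only; not part of scutr.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(list(row))
    target = len(rows) // len(by_class)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) > target:
            # Assumption: random sampling stands in for the
            # cluster-based undersampling that SCUT describes.
            kept = rng.sample(members, target)
        else:
            kept = list(members)
            while len(kept) < target:
                # SMOTE-style step: interpolate between a point and one
                # of its k nearest neighbours from the same class.
                base = rng.choice(members)
                others = [m for m in members if m is not base]
                if not others:
                    kept.append(list(base))  # singleton class: duplicate
                    continue
                others.sort(key=lambda m: math.dist(base, m))
                neighbour = rng.choice(others[:k])
                t = rng.random()
                kept.append([b + t * (n - b) for b, n in zip(base, neighbour)])
        out_rows.extend(kept)
        out_labels.extend([label] * len(kept))
    return out_rows, out_labels


# Toy data: 12 rows of class "a", 4 of "b", 2 of "c".
rows = [[float(i), float(i % 3)] for i in range(18)]
labels = ["a"] * 12 + ["b"] * 4 + ["c"] * 2
new_rows, new_labels = balance_classes(rows, labels)
```

With the toy data above, the majority class is cut down and the two minority classes are padded with synthetic points, so all three classes end up the same size. Because each synthetic point is interpolated between two existing rows of its class, it stays within the range those rows span.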





