Publication:
A Stacked Ensemble Deep Learning Approach For Imbalanced Multi-class Water Quality Index Prediction

dc.contributor.author: Wen Yee Wong [en_US]
dc.contributor.author: Khairunnisa Hasikin [en_US]
dc.contributor.author: Anis Salwa Mohd Khairuddin [en_US]
dc.contributor.author: Sarah Abdul Razak [en_US]
dc.contributor.author: Hanee Farzana Hizaddin [en_US]
dc.contributor.author: Mohd Istajib Mokhtar [en_US]
dc.contributor.author: Muhammad Mokhzaini Azizan [en_US]
dc.date.accessioned: 2024-05-29T02:29:17Z
dc.date.available: 2024-05-29T02:29:17Z
dc.date.issued: 2023
dc.date.submitted: 2024-2-20
dc.description: Vol.76, No.2 [en_US]
dc.description.abstract: A common difficulty in building prediction models with real-world environmental datasets is the skewed distribution of classes. There are significantly more samples for day-to-day classes, while rare events such as polluted classes are uncommon. Consequently, the limited availability of minority outcomes lowers the classifier’s overall reliability. This study assesses the capability of machine learning (ML) algorithms in tackling imbalanced water quality data based on the metrics of precision, recall, and F1 score, in order to offset accuracy figures that are misleadingly dominated by the majority classes. To this end, the performance of 10 ML algorithms is compared: AdaBoost, Support Vector Machine, Linear Discriminant Analysis, k-Nearest Neighbors, Naïve Bayes, Decision Trees, Random Forest, Extra Trees, Bagging, and the Multilayer Perceptron. This study also uses the Easy Ensemble Classifier, Balanced Bagging, and RUSBoost algorithms to evaluate multi-class imbalanced learning methods. The comparison results revealed that a high-accuracy machine learning model does not always perform well in recall and sensitivity. This paper’s stacked ensemble deep learning (SE-DL) generalization model effectively classifies the water quality index (WQI) based on 23 input variables. The proposed algorithm achieved a remarkable average of 95.69%, 94.96%, 92.92%, and 93.88% for accuracy, precision, recall, and F1 score, respectively. In addition, the proposed model is compared against two state-of-the-art classifiers, XGBoost (eXtreme Gradient Boosting) and Light Gradient Boosting Machine, using the additional performance metrics of balanced accuracy and G-mean. In these experiments, XGBoost achieved a higher balanced accuracy and G-mean. However, the SE-DL model has a better and more balanced performance in the F1 score. The SE-DL model aligns with the goal of this study to ensure the balance between accuracy and completeness for each water quality class. The proposed algorithm also achieves higher efficiency at a lower computational time than the standard Synthetic Minority Oversampling Technique (SMOTE) approach to imbalanced datasets. [en_US]
dc.identifier.citation: W. Y. Wong, K. Hasikin, A. S. M. Khairuddin, S. A. Razak, H. F. Hizaddin et al., "A stacked ensemble deep learning approach for imbalanced multi-class water quality index prediction," Computers, Materials & Continua, vol. 76, no.2, pp. 1361–1384, 2023. [en_US]
dc.identifier.doi: 10.32604/cmc.2023.038045
dc.identifier.epage: 1384
dc.identifier.issn: 1546-2226
dc.identifier.issue: 2
dc.identifier.other: 2494-31
dc.identifier.spage: 1361
dc.identifier.uri: https://www.techscience.com/cmc/v76n2/54031
dc.identifier.uri: https://www.scopus.com/record/display.uri?eid=2-s2.0-85173545973&origin=resultslist&sort=plf-f&src=s&sid=3dd9073fabd4e7c8b4c704f3645e3e2b&sot=b&sdt=b&s=TITLE-ABS-KEY%28A+Stacked+Ensemble+Deep+Learning+Approach+for+Imbalanced+Multi-Class+Water+Quality+Index+Prediction%29&sl=101&sessionSearchId=3dd9073fabd4e7c8b4c704f3645e3e2b&relpos=0
dc.identifier.uri: https://oarep.usim.edu.my/handle/123456789/10807
dc.identifier.volume: 76
dc.language.iso: en_US [en_US]
dc.publisher: Tech Science Press [en_US]
dc.relation.ispartof: Computers, Materials and Continua [en_US]
dc.subject: Water quality classification; imbalanced data; SMOTE; stacked ensemble deep learning; sensitivity analysis [en_US]
dc.title: A Stacked Ensemble Deep Learning Approach For Imbalanced Multi-class Water Quality Index Prediction [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
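
Illustrative note on the metrics named in the abstract above (precision, recall, F1 score, balanced accuracy, G-mean): the following is a minimal sketch, not the authors' code, showing how these scores can be computed for a multi-class problem and why plain accuracy can mislead on imbalanced data. It assumes macro averaging across classes and takes G-mean as the geometric mean of per-class recalls; the abstract does not state which averaging the paper used. The function names and the toy "clean"/"polluted" labels are hypothetical.

```python
from collections import Counter
from math import prod


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label present in y_true."""
    true_count = Counter(y_true)   # samples that truly belong to each class
    pred_count = Counter(y_pred)   # samples predicted as each class
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct hits per class
    scores = {}
    for label in sorted(true_count):
        # A class that is never predicted gets precision 0 by convention here.
        precision = tp[label] / pred_count[label] if pred_count[label] else 0.0
        recall = tp[label] / true_count[label]
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def summarize(y_true, y_pred):
    """Accuracy plus macro-averaged metrics (assumed averaging, see note above)."""
    scores = per_class_scores(y_true, y_pred)
    n = len(scores)
    recalls = [r for _, r, _ in scores.values()]
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "macro_precision": sum(p for p, _, _ in scores.values()) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f for _, _, f in scores.values()) / n,
        # Balanced accuracy is the unweighted mean of per-class recall.
        "balanced_accuracy": sum(recalls) / n,
        # G-mean: geometric mean of per-class recall; zero if any class is missed entirely.
        "g_mean": prod(recalls) ** (1 / n),
    }


# Toy imbalanced data (hypothetical): 8 "clean" samples, 2 "polluted" samples,
# and a degenerate classifier that always predicts the majority class.
y_true = ["clean"] * 8 + ["polluted"] * 2
y_pred = ["clean"] * 10
report = summarize(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

On this toy data the always-"clean" classifier reaches 80% accuracy while its balanced accuracy is 50% and its G-mean is 0, which is the kind of gap between accuracy and per-class performance that the abstract describes.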

Files

Original bundle
Name: A Stacked Ensemble Deep Learning Approach for Imbalanced Multi-Class Water Quality Index Prediction.pdf
Size: 1.34 MB
Format: Adobe Portable Document Format

Collections