Effective Text Classification Through Supervised Rough Set-Based Term Weighting

dc.contributor.author: Cekik, Rasim
dc.date.accessioned: 2026-01-22T19:51:37Z
dc.date.issued: 2025
dc.department: Şırnak Üniversitesi
dc.description.abstract: This research presents an approach to text mining based on rough set theory. The study uses the concept of symmetry from rough set theory to construct indiscernibility matrices and model uncertainty in data analysis, so that both the methodological structure and the solution process remain symmetric. The effective management and analysis of large-scale textual data relies heavily on automated text classification, in which term weighting plays a crucial role in determining classification performance. Supervised term weighting methods, which use class information, have emerged as the most effective approaches, but the optimal representation of class-term relationships still requires further research. This study proposes the Rough Multivariate Weighting Scheme (RMWS) and its mathematical derivative, the Square Root Rough Multivariate Weighting Scheme (SRMWS). RMWS employs rough sets to identify information-carrying documents within the document-term-class space and computes weights using alpha, beta, and gamma coefficients. It also captures how a term is distributed among classes. Comprehensive experiments were conducted on three datasets with imbalanced-multiclass, balanced-multiclass, and imbalanced-binary class structures to evaluate the model's effectiveness. The results show that RMWS and SRMWS outperform existing approaches on both balanced and imbalanced datasets, and that their performance is not affected by class imbalance or the number of classes. SRMWS is the most effective method for SVM and KNN classifiers, while RMWS achieves the best results for NB classifiers. These results show that the proposed methods significantly improve text classification performance.
dc.identifier.doi: 10.3390/sym17010090
dc.identifier.issn: 2073-8994
dc.identifier.issue: 1
dc.identifier.orcid: 0000-0002-7820-413X
dc.identifier.scopus: 2-s2.0-85215789648
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.3390/sym17010090
dc.identifier.uri: https://hdl.handle.net/11503/3410
dc.identifier.volume: 17
dc.identifier.wos: WOS:001405400000001
dc.identifier.wosquality: N/A
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.institutionauthor: Cekik, Rasim
dc.language.iso: en
dc.publisher: MDPI
dc.relation.ispartof: Symmetry-Basel
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.snmz: KA_WOS_20260122
dc.subject: text classification
dc.subject: term weighting
dc.subject: rough set
dc.subject: supervised learning
dc.subject: natural language processing
dc.title: Effective Text Classification Through Supervised Rough Set-Based Term Weighting
dc.type: Article
