A New Performance Metric to Evaluate Filter Feature Selection Methods in Text Classification

Cekik, Rasim; Kaya, Mahmut

doi:10.3897/jucs.111675

dc.contributor.author	Cekik, Rasim
dc.contributor.author	Kaya, Mahmut
dc.date.accessioned	2026-01-22T19:51:36Z
dc.date.issued	2024
dc.department	Şırnak Üniversitesi
dc.description.abstract	High dimensionality and sparsity are the primary issues in text classification. The most effective way to address them is feature selection, that is, choosing a subset of the features. The most common and effective methods for this task are filter techniques. Performance metrics such as Micro-F1, Macro-F1, and Accuracy are used to evaluate filter feature selection methods on datasets, but these metrics depend on a classification algorithm. Filter techniques, however, score each feature individually without considering the relationships between features. Under such an evaluation, the actual performance of the filter technique may not be determined, which leaves the existing metrics insufficient for testing the validity of a proposed method. This study therefore proposes a novel performance metric called Selection Error (SE) to determine the actual performance of filter techniques. The Selection Error metric allows the information value of the selected features to be analyzed more accurately than with existing metrics, without relying on a classifier. The filter approaches were evaluated on six different datasets with both the Selection Error and traditional performance metrics. The results show a strong relationship between the proposed metric and the classification performance metrics. The Selection Error aims to contribute significantly to the literature by demonstrating the success of filter feature selection methods regardless of classifier performance.
dc.description.sponsorship	Siirt University, Fund of Scientific Research Projects [2020-SİÜMÜH-036]
dc.description.sponsorship	This work was supported by Siirt University, Fund of Scientific Research Projects under grant number 2020-SİÜMÜH-036
dc.identifier.doi	10.3897/jucs.111675
dc.identifier.endpage	1005
dc.identifier.issn	0948-695X
dc.identifier.issn	0948-6968
dc.identifier.issue	7
dc.identifier.orcid	0000-0002-7846-1769
dc.identifier.orcid	0000-0002-7820-413X
dc.identifier.scopus	2-s2.0-85201573081
dc.identifier.scopusquality	Q3
dc.identifier.startpage	978
dc.identifier.uri	https://doi.org/10.3897/jucs.111675
dc.identifier.uri	https://hdl.handle.net/11503/3384
dc.identifier.volume	30
dc.identifier.wos	WOS:001301587500005
dc.identifier.wosquality	Q3
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	Graz Univ Technology, Inst Information Systems Computer Media-Iicm
dc.relation.ispartof	Journal of Universal Computer Science
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/openAccess
dc.snmz	KA_WOS_20260122
dc.subject	selection error
dc.subject	Text classification
dc.subject	feature selection
dc.subject	filtering methods
dc.subject	performance metric
dc.title	A New Performance Metric to Evaluate Filter Feature Selection Methods in Text Classification
dc.type	Article

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
