A New Filter Feature Selection Method for Text Classification

dc.contributor.authorCekik, Rasim
dc.date.accessioned2026-01-22T19:51:53Z
dc.date.issued2024
dc.departmentŞırnak Üniversitesi
dc.description.abstractMassive amounts of text data have been created on the Internet due to the widespread use of platforms such as social media. Text classification is one of the most frequently used techniques for extracting useful information from text data, and one of its most fundamental problems is high dimensionality, which greatly reduces the success of classifiers while increasing their computational cost. The most effective way to overcome this problem is to use a feature selector to choose a subset containing the most distinctive features of the entire feature space. This study presents a new filter feature selection approach for text classification called Multivariate Feature Selector (MFS). The proposed approach calculates a score for each feature based on three knowledge structures: class-based, document-based, and document-class-based. These structures are used to reveal hidden information at the class, document, and document-class levels, which enables a more precise and effective score for each term. MFS was tested on four different datasets, with micro-F1 and macro-F1 used as performance measures to assess its success in feature selection. MFS was observed to outperform the main feature selection methods in the literature. Although classification results varied with the selected feature size, MFS showed superior performance in all selected feature subspaces.
dc.identifier.doi10.1109/ACCESS.2024.3468001
dc.identifier.endpage139335
dc.identifier.issn2169-3536
dc.identifier.orcid0000-0002-7820-413X
dc.identifier.scopus2-s2.0-85205004963
dc.identifier.scopusqualityQ1
dc.identifier.startpage139316
dc.identifier.urihttps://doi.org/10.1109/ACCESS.2024.3468001
dc.identifier.urihttps://hdl.handle.net/11503/3560
dc.identifier.volume12
dc.identifier.wosWOS:001327330100001
dc.identifier.wosqualityQ2
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.institutionauthorCekik, Rasim
dc.language.isoen
dc.publisherIeee-Inst Electrical Electronics Engineers Inc
dc.relation.ispartofIEEE Access
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_WOS_20260122
dc.subjectFeature selection
dc.subjecttext classification
dc.subjectdimensionality reduction
dc.subjecttext mining
dc.titleA New Filter Feature Selection Method for Text Classification
dc.typeArticle
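
The abstract describes a filter approach: each term receives a score, and only the top-scoring subset is kept before classification. The sketch below illustrates that general pipeline only. The scoring function is a simple stand-in (the largest gap between a term's document-frequency rate inside a class and outside it), not the MFS formula, which this record does not give. The names `score_terms` and `select_top_k` and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def score_terms(docs, labels):
    """Stand-in filter score (NOT the MFS formula from the paper).

    For each term, return the largest absolute gap between the share of
    documents containing it inside a class and the share outside that class.
    """
    n_docs = len(docs)
    class_sizes = Counter(labels)
    # term -> class -> number of documents in that class containing the term
    df_in_class = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            df_in_class[term][label] += 1

    scores = {}
    for term, per_class in df_in_class.items():
        total_df = sum(per_class.values())
        best = 0.0
        for cls, size in class_sizes.items():
            inside = per_class[cls] / size
            rest = n_docs - size
            outside = (total_df - per_class[cls]) / rest if rest else 0.0
            best = max(best, abs(inside - outside))
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy corpus (hypothetical): tokenized documents and their class labels.
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "price", "team"],
]
labels = ["sport", "sport", "finance", "finance"]

scores = score_terms(docs, labels)
selected = select_top_k(scores, 3)
print(selected)
```

A filter method of this kind scores terms independently of any classifier, so the selected vocabulary can be reused across classifiers and feature sizes, which is how the abstract reports comparing methods.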
