Categorization of Turkish Text Documents Using Extreme Learning Machine
Abstract
Data are being generated at a rapidly increasing rate, a phenomenon commonly referred to as Big Data. Textual data make up a significant portion of this volume, so text processing is becoming increasingly important. Although text processing and word embedding have been studied extensively for widely used languages such as English, research on Turkish remains limited. This study therefore targets Turkish text. Four versions of the TTC-3600 dataset were used: Zemb-DS, Original-DS, F7-DS, and F5-DS. The performance of the Extreme Learning Machine (ELM) algorithm was evaluated on these datasets with different activation functions (sigmoid, hardlim, sine, tribas, radbas) and varying numbers of hidden neurons (100, 200, 400, 600, 800, 1000, 1300, 1500). In addition, the effect on model accuracy of the number of features (Top-100, Top-500, Top-1000) selected with the Distinguishing Feature Selector method was analysed. The experimental results show that ELM performance is highly sensitive to the activation function, the number of hidden neurons, and the number of features. They also show that the sigmoid and hardlim activation functions consistently yielded the highest performance across dataset versions and parameter settings. © 2024 IEEE.
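To make the evaluated setup concrete, the following is a minimal sketch of a single-hidden-layer Extreme Learning Machine with the five activation functions named in the abstract. It is not the authors' implementation. The function names (`train_elm`, `predict_elm`), the weight initialisation range, and the activation definitions (which follow the conventions commonly used in ELM toolboxes) are assumptions made for illustration. The sketch expects a dense numeric feature matrix such as one produced after feature selection; the TTC-3600 data and the Distinguishing Feature Selector step are not reproduced here.

```python
import numpy as np

# Activation functions named in the abstract. The definitions below follow
# common ELM toolbox conventions and are an assumption, not taken from the paper.
ACTIVATIONS = {
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "hardlim": lambda z: (z >= 0).astype(float),           # hard limit (step)
    "sine": np.sin,
    "tribas": lambda z: np.maximum(1.0 - np.abs(z), 0.0),  # triangular basis
    "radbas": lambda z: np.exp(-(z ** 2)),                 # radial basis
}


def train_elm(X, y, n_hidden=100, activation="sigmoid", seed=0):
    """Fit an ELM classifier and return (W, b, beta).

    X: (n_samples, n_features) float array.
    y: (n_samples,) integer class labels starting at 0.
    """
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    # Input weights and biases are drawn at random and never updated.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = ACTIVATIONS[activation](X @ W + b)   # hidden-layer output matrix
    T = np.eye(n_classes)[y]                 # one-hot target matrix
    # Output weights are the least-squares solution via the pseudoinverse.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def predict_elm(X, W, b, beta, activation="sigmoid"):
    """Predict class indices using the same activation as in training."""
    H = ACTIVATIONS[activation](X @ W + b)
    return np.argmax(H @ beta, axis=1)
```

Under these assumptions, a sweep like the one described in the abstract would call `train_elm` once for each combination of activation name and `n_hidden` value, then compare accuracy on held-out data with `predict_elm`.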









