Academic Journals Database
Disseminating quality controlled scientific knowledge

Arabic Content Classification System Using statistical Bayes classifier With Words Detection and Correction

Author(s): Abdullah Mamoun Hattab | Abdulameer Khalaf Hussein

Journal: World of Computer Science and Information Technology Journal
ISSN 2221-0741

Volume: 2;
Issue: 6;
Start page: 193;
Date: 2012;

Keywords: text mining | classification | Arabic text classification | Arabic language processing.

Automatic Arabic content classification is an important text mining task, especially given the rapid growth in the number of online Arabic documents. The system presented here enhances an implemented machine learning classification algorithm by applying an algorithm that detects and corrects non-words in Arabic text. The detection and correction algorithm is built on morphological knowledge in the form of consistent root-pattern relationships, together with some morpho-syntactic knowledge based on affixation and morpho-graphic rules, which guide the word recognition and non-word correction process. Many researchers have approached Arabic content classification from a purely morphological view, using techniques such as word root extraction and stemming (prefixes and suffixes), with varying results. This work instead considers classification from a very different angle, the syntactic approach. The paper presents the results of document classification experiments on ten different Arabic domains (Economy, History, Family Studies, Islamic, Sport, Health, Law, Stories, Astronomy, and Food articles) using a statistical methodology. The classification system showed encouraging results compared with other existing systems.
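The abstract does not give the details of the classifier, so the following is only a minimal sketch of a generic multinomial naive Bayes text classifier with add-one (Laplace) smoothing, not the authors' implementation. The function names `train` and `classify`, the toy English tokens, and the two example domains are all illustrative assumptions; the non-word detection and correction step described above is not modeled and would, presumably, run on the tokens before they reach the classifier.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Build a naive Bayes model from (tokens, label) pairs.

    Hypothetical helper: returns per-label document counts,
    per-label word counts, and the shared vocabulary.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(tokens, model):
    """Return the label with the highest log posterior for the tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        # Log prior: fraction of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothed log likelihood of the word given the label.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data (assumed for illustration; real input would be
# tokenized, corrected Arabic text from the ten domains).
training_docs = [
    (["match", "goal", "team"], "Sport"),
    (["team", "player", "goal"], "Sport"),
    (["doctor", "medicine", "health"], "Health"),
    (["health", "diet", "doctor"], "Health"),
]
model = train(training_docs)
```

With this toy model, `classify(["goal", "team"], model)` selects the Sport label and `classify(["doctor", "diet"], model)` selects the Health label. Working in log space avoids numeric underflow when documents contain many words.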
