Academic Journals Database
Disseminating quality controlled scientific knowledge

Text Categorization using Distributional Features and Semantic Equivalence

Author(s): Tirupathaiah Kommi | Srikanth Jatla

Journal: International Journal of Computer Applications
ISSN 0975-8887

Volume: 30
Issue: 7
Start page: 30
Date: 2011

Keywords: Text mining | machine learning | text categorization | distributional feature | tfidf

In text mining, text categorization, the assignment of predefined categories to text, is widely used. Previous researchers relied on the bag-of-words approach, which assigns values to words based on their occurrences in order to capture how frequently a word is used in a document. This approach has a drawback: it considers no feature of a word other than its count. This paper examines assigning additional values to a word, known as distributional features. The approach is novel; the distributional features include the position of a word's first occurrence and the compactness of its appearances. Our experimental results show that distributional features and semantic equivalence improve text categorization. The results also indicate that distributional features are especially useful when the writing style is casual and the document is long. Semantic equivalence is used to extend the equivalence rough set approach.
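
The abstract names two distributional features, position of first occurrence and compactness of appearances, but does not give their formulas. The following is a minimal Python sketch of how such features could be computed alongside the plain word count. The function name `distributional_features`, the tokenizer, the normalization by document length, and the choice of mean distance from the centroid as the compactness measure are all assumptions made for illustration, not the authors' definitions.

```python
import re
from collections import defaultdict


def distributional_features(text):
    """Return {word: (count, first_pos, spread)} for a single document.

    Hypothetical definitions (not taken from the paper):
      count     - number of occurrences (the bag-of-words value)
      first_pos - index of the first occurrence divided by document length
      spread    - mean distance of occurrences from their centroid,
                  divided by document length (smaller = more compact)
    """
    # Simple lowercase alphanumeric tokenizer; an assumption for this sketch.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    n = len(tokens)

    # Record every position at which each word appears.
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        positions[tok].append(i)

    features = {}
    for word, idxs in positions.items():
        count = len(idxs)
        first_pos = idxs[0] / n
        centroid = sum(idxs) / count
        spread = sum(abs(i - centroid) for i in idxs) / (count * n)
        features[word] = (count, first_pos, spread)
    return features


if __name__ == "__main__":
    doc = "the cat sat on the mat"
    for word, feats in distributional_features(doc).items():
        print(word, feats)
```

Under these assumptions, two words with the same count can still be told apart: one that first appears early and is scattered across the document gets a small `first_pos` and a large `spread`, while one that is confined to a single passage gets a small `spread`. These values could be combined with a count-based weight such as tf-idf before classification.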
