
Improving Business Type Classification from Twitter Posts Based on Topic Model

Author(s): Chanattha Thongsuk | Choochart Haruechaiyasak | Somkid Saelee

Journal: World of Computer Science and Information Technology Journal
ISSN 2221-0741

Volume: 1
Issue: 8
Start page: 333
Date: 2011

Keywords: Classification | topic model | Latent Dirichlet Allocation (LDA) | Twitter.

Twitter, a social networking website, has become a new advertising channel for promoting products and services through its online community. In this study, we propose a solution that recommends businesses for Twitter users to follow based on their interests. Our approach uses classification algorithms to predict users' interests by analyzing their posts. The main challenge is the short length of Twitter posts: with only a few key terms available in each post, classification is difficult. To alleviate this problem, we propose a technique that improves classification performance by expanding the term features with features from a topic model before training the classification models. The topic model is constructed as a set of topics using the Latent Dirichlet Allocation (LDA) algorithm. We propose two feature processing approaches: (1) feature transformation, i.e., using a set of topics as features, and (2) feature expansion, i.e., appending a set of topics to a set of terms. Experimental results for multi-class classification showed that the highest accuracy, 95.7%, is obtained with the feature expansion technique, an improvement of 19.1% over the Bag of Words (BOW) model. We also compared multi-class and binary classification when the classification models are built with the feature expansion approach. Binary classification yielded higher accuracy than multi-class classification by 2.3%, 3.3%, and 0.4% for the airline, food, and computer & technology businesses, respectively.
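
The two feature processing approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the vocabulary, the topic-word weights, and the function names are all hypothetical, and the topic scoring is a crude stand-in for inference with a trained LDA model. It only shows how the shape of the feature vector differs between feature transformation (topics only) and feature expansion (terms plus topics).

```python
from collections import Counter

# Hypothetical vocabulary and topic-word weights. In the setting described
# above these would come from a trained LDA model, not be written by hand.
VOCAB = ["flight", "delay", "pizza", "laptop"]
TOPIC_WORD = {
    "topic_0": {"flight": 0.6, "delay": 0.4},
    "topic_1": {"pizza": 1.0},
    "topic_2": {"laptop": 1.0},
}


def bow_features(tokens):
    """Bag of Words baseline: one term count per vocabulary entry."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in VOCAB]


def topic_features(tokens):
    """Stand-in for LDA inference (an assumption of this sketch):
    score each topic by the summed weights of the post's tokens,
    then normalize the scores so they sum to 1."""
    scores = [
        sum(weights.get(tok, 0.0) for tok in tokens)
        for weights in TOPIC_WORD.values()
    ]
    total = sum(scores)
    return [s / total if total else 0.0 for s in scores]


def transform(tokens):
    """Feature transformation: the topics replace the terms as features."""
    return topic_features(tokens)


def expand(tokens):
    """Feature expansion: the topic features are appended to the term features."""
    return bow_features(tokens) + topic_features(tokens)


post = ["flight", "delay", "again"]  # a short, already tokenized post
transformed = transform(post)        # length = number of topics
expanded = expand(post)              # length = vocabulary size + number of topics
```

Either vector could then be passed to any standard classifier. With expansion, a short post that shares few exact terms with the training data can still be matched through its topic features.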
