Academic Journals Database
Disseminating quality controlled scientific knowledge

A Framework for Building Applications Based on Hidden Topics with Short and Sparse Web Documents

Author(s): Kanimozhiveena E | D. Ramya Dorai

Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
ISSN 2278-1323

Volume: 2
Issue: 3
Start page: 984
Date: 2013

Keywords: classification | data sparseness | matching/ranking | text categorization | semantic similarity | web mining

This paper presents an approach to two major problems with web data: (1) data sparseness and (2) synonymy. It proposes a model that reduces both problems by drawing on external data from users, which is considered together with the dataset. The motivation is that a document may be highly relevant to the keywords given in the query yet contain only a few sentences, in which case it is unlikely to be classified correctly. External data is therefore used to classify such short, sparse documents, which contain few sentences and few keywords, more accurately. For advertising, ad messages and web pages are considered: the semantic similarity between them is measured for matching and ranking.
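The matching and ranking idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the external data is modeled, so the hand-written `TOPICS` dictionary below is a hypothetical stand-in for topics derived from external data, and the function names (`enrich`, `cosine`, `rank_ads`) and sample texts are assumptions made for illustration. Each short text is expanded with the labels of the topics its words belong to, and ads are then ranked against a page by cosine similarity.

```python
import math
import re
from collections import Counter

# Hypothetical topics standing in for knowledge drawn from external data.
# In a real system these would be learned from a large corpus, not hand-written.
TOPICS = {
    "topic_travel": {"hotel", "flight", "booking", "reservation", "trip"},
    "topic_finance": {"loan", "bank", "credit", "mortgage", "interest"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def enrich(tokens):
    """Append one topic label for every token that belongs to a topic."""
    extra = [name for tok in tokens
             for name, words in TOPICS.items() if tok in words]
    return tokens + extra


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_ads(page, ads):
    """Return (score, ad) pairs sorted from most to least similar to the page."""
    page_vec = Counter(enrich(tokenize(page)))
    scored = [(cosine(page_vec, Counter(enrich(tokenize(ad)))), ad)
              for ad in ads]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    page = "cheap flight and hotel deals"
    ads = ["low interest mortgage", "easy reservation for your trip"]
    for score, ad in rank_ads(page, ads):
        print(f"{score:.3f}  {ad}")
```

In this example the page and the travel ad share no literal words, so plain word overlap would score both ads at zero. The shared topic label is what lets the travel ad rank first, which is the kind of sparseness and synonymy gap the abstract describes.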