Academic Journals Database
Disseminating quality controlled scientific knowledge

New Word Vector Representation for Semantic Clustering
(French title: Une nouvelle représentation vectorielle pour la classification sémantique)

Author(s): Salma Jamoussi

Journal: Traitement Automatique des Langues
ISSN 1248-9433

Volume: 50
Issue: 3
Start page: 23
Date: 2010

Keywords: clustering | semantic concepts | word vector representation

ABSTRACT
This paper argues that meaningful semantic concepts can be obtained using clustering methods. We first define semantic measures that quantify the semantic relations between words. We then apply clustering methods to build concepts automatically. We test two well-known methods, the K-means algorithm and Kohonen maps, and then propose the use of AutoClass, a Bayesian network designed for clustering. To group the words of the vocabulary into classes, we test three vector representations of words. The first is a simple contextual representation. The second associates each word with a vector representing its similarity to every word in the vocabulary. The third is a combination of the first two.
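
The three representations named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the contextual vector is taken to be co-occurrence counts within a one-word window, the similarity measure is cosine similarity, the combination is a plain concatenation, and the small K-means routine with the hypothetical name `kmeans` is a generic textbook version, not the paper's implementation.

```python
import math

def contextual_vectors(sentences, vocab, window=1):
    """Assumed contextual representation: co-occurrence counts in a +/- window."""
    index = {w: i for i, w in enumerate(vocab)}
    vecs = {w: [0.0] * len(vocab) for w in vocab}
    for sent in sentences:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[w][index[sent[j]]] += 1.0
    return vecs

def cosine(u, v):
    """Cosine similarity (an assumed stand-in for the paper's semantic measures)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_vectors(ctx, vocab):
    """Each word is described by its similarity to every word of the vocabulary."""
    return {w: [cosine(ctx[w], ctx[o]) for o in vocab] for w in vocab}

def combined_vectors(ctx, sim, vocab):
    """Assumed combination: concatenate the contextual and similarity vectors."""
    return {w: ctx[w] + sim[w] for w in vocab}

def kmeans(points, k, iters=20):
    """Plain K-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy corpus, invented for this example only.
sentences = [
    ["book", "flight", "paris"],
    ["book", "flight", "rome"],
    ["reserve", "hotel", "paris"],
    ["reserve", "hotel", "rome"],
]
vocab = sorted({w for s in sentences for w in s})
ctx = contextual_vectors(sentences, vocab)
sim = similarity_vectors(ctx, vocab)
comb = combined_vectors(ctx, sim, vocab)
labels = kmeans([comb[w] for w in vocab], k=3)
clusters = dict(zip(vocab, labels))
```

In this toy corpus "paris" and "rome" occur in identical contexts, so their vectors coincide in all three representations and they fall into the same cluster. Kohonen maps or AutoClass could be substituted for the clustering step without changing how the vectors are built.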