Academic Journals Database
Disseminating quality controlled scientific knowledge

Sentence Clustering Using Parts-of-Speech

Author(s): Richard Khoury

Journal: International Journal of Information Engineering and Electronic Business
ISSN 2074-9023

Volume: 4
Issue: 1
Start page: 1
Date: 2012

Keywords: natural language processing | part-of-speech | clustering

Clustering algorithms are used in many Natural Language Processing (NLP) tasks, where they have proven to be popular and effective tools for discovering groups of similar linguistic items. In this exploratory paper, we propose a new clustering algorithm that automatically groups similar sentences based on their part-of-speech syntax. The algorithm generates and merges clusters using a syntactic similarity metric based on a hierarchical organization of the parts-of-speech. We demonstrate the features of this algorithm by implementing it in a question type classification system, which lets us determine whether different changes to the algorithm have a positive or negative impact.
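
To make the general idea concrete, the following is a minimal illustrative sketch of clustering sentences by part-of-speech similarity. It is not the paper's algorithm: the abstract does not specify the hierarchy, the metric, or the merging rule, so everything here is an assumption. The two-level `POS_PARENT` mapping (with Penn Treebank-style tags), the 1.0 / 0.5 / 0.0 similarity scores, the position-wise sentence comparison, the average-linkage merging, and the `threshold` value are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical two-level hierarchy: fine-grained tag -> coarse category.
# The paper's actual hierarchy is not given in the abstract.
POS_PARENT = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "JJR": "ADJ",
    "WP": "WH", "WRB": "WH", "WDT": "WH",
    "DT": "DET", "IN": "PREP",
}


def tag_similarity(a, b):
    """Score 1.0 for identical tags, 0.5 for tags sharing a parent, else 0.0."""
    if a == b:
        return 1.0
    parent_a, parent_b = POS_PARENT.get(a), POS_PARENT.get(b)
    if parent_a is not None and parent_a == parent_b:
        return 0.5
    return 0.0


def sentence_similarity(s1, s2):
    """Compare tags position by position, normalised by the longer sentence."""
    if not s1 or not s2:
        return 0.0
    total = sum(tag_similarity(a, b) for a, b in zip(s1, s2))
    return total / max(len(s1), len(s2))


def cluster_similarity(c1, c2):
    """Average pairwise sentence similarity between two clusters."""
    total = sum(sentence_similarity(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def cluster_sentences(tagged_sentences, threshold=0.6):
    """Greedily merge the most similar pair of clusters until none reach threshold."""
    clusters = [[s] for s in tagged_sentences]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_similarity(clusters[p[0]], clusters[p[1]]),
        )
        if cluster_similarity(clusters[i], clusters[j]) < threshold:
            break
        # i < j is guaranteed by combinations, so deleting j keeps i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

In this sketch, each sentence is represented as a list of tags, for example `["WP", "VBZ", "DT", "NN"]`, and `cluster_sentences` returns a list of clusters, each a list of such tag sequences. A hierarchy-aware metric of this kind gives partial credit to sentences whose tags differ only at the fine-grained level (say, `VBZ` versus `VBD`), which is one plausible reading of "a hierarchical organization of the parts-of-speech".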
