Academic Journals Database
Disseminating quality controlled scientific knowledge

Stemming and N-gram matching for term conflation in Turkish texts

Author(s): F. Çuna Ekmekçioglu | Michael F. Lynch | Peter Willett

Journal: Information Research: an international electronic journal
ISSN 1368-1613

Volume: 2
Issue: 2
Start page: 13
Date: 1996;

Keywords: free text | indexing | retrieval | information retrieval | word forms | spelling errors | alternative spellings | multi-word concepts | transliteration | affixes | abbreviations | conflation algorithm | Turkish

One of the main problems in using free text for indexing and retrieval is the variation in word forms that is likely to be encountered. The most common types of variation are spelling errors, alternative spellings, multi-word concepts, transliteration, affixes and abbreviations. One way to alleviate this problem is to use a conflation algorithm: a computational procedure designed to bring together words that are semantically related and to reduce them to a single form for retrieval purposes. In this paper, we discuss the use of conflation techniques for Turkish text databases.
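As a rough illustration of the N-gram matching named in the title, the sketch below scores two word forms by the overlap of their character bigrams using the Dice coefficient. It is a minimal example under stated assumptions, not the procedure evaluated in the paper: the function names `bigrams` and `dice`, the choice of n = 2, and the sample words are all illustrative, and the inputs are assumed to be already lowercased.

```python
def bigrams(word):
    """Return the set of adjacent character pairs in a word.

    Assumption: the word is already lowercased by the caller.
    """
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice(a, b):
    """Dice coefficient over the bigram sets of two words (0.0 to 1.0)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Words shorter than two characters have no bigrams;
        # fall back to exact comparison.
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


if __name__ == "__main__":
    # "evler" (houses) and "evlerde" (in the houses) share a stem and suffix.
    print(dice("evler", "evlerde"))
    # "okul" (school) is unrelated to "evler".
    print(dice("evler", "okul"))
```

Word pairs scoring above a chosen threshold could then be treated as variants of one form for retrieval; the threshold itself is a tuning decision that this sketch does not address.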