Academic Journals Database
Disseminating quality controlled scientific knowledge

A Hybrid Model For Phrase Chunking Employing Artificial Immunity System And Rule Based Methods

Author(s): Bindu.M.S | Sumam Mary Idicula

Journal: International Journal of Artificial Intelligence & Applications
ISSN 0976-2191

Volume: 2;
Issue: 4;
Start page: 95;
Date: 2011;

Keywords: Human immune system | Self Test | POS tagger | Detector set | Phrase tags

ABSTRACT
Natural Language Understanding (NLU), an important field of Artificial Intelligence (AI), is concerned with speech and language understanding between humans and computers. Understanding language means knowing what concept a word or phrase stands for and how to link these concepts to form a meaningful sentence. Identifying phrases, or phrase chunking, is an important step in NLU: a chunker identifies and divides sentences into syntactically correlated word groups. Question Answering (QA) systems, another important application of AI, mostly require retrieval of nouns or noun phrases as answers to users' questions. Chunking is also an important preprocessing step in full parsing. Because natural language is highly ambiguous, exact parsing of text may become very complex; this ambiguity may be partially resolved by using chunking as an intermediate step. To the best of our knowledge, no prior work or tag set is available for phrase chunking in Malayalam. To separate the chunks in a document, the document must first be labeled with part-of-speech (POS) tags. POS tagging is a difficult task in Malayalam because it is a complex, compounding language. In this paper we describe the application of an artificial immunity system (AIS) to chunking; the implemented system achieves 96% precision and 93% recall. The system was tested on corpora collected from reputed newspapers and magazines. These corpora contain documents from five domains (sports, health, agriculture, science and politics), and each document contains simple, compound and complex sentences of various levels of complexity. A POS tag set with 52 tags was developed for preparing the tagged corpus for Malayalam. The phrase tag set contains 20 phrase tags.
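
As a rough illustration of what chunking over POS tags and chunk-level precision and recall mean, the following Python sketch groups adjacent tokens into phrases with a single hand-written rule and scores the result against a reference chunking. It is not the AIS-based method of the paper: the tag names, the POS_TO_PHRASE mapping, and the functions chunk and precision_recall are all hypothetical, and the toy tag set is unrelated to the 52-tag POS set and 20-tag phrase set described above.

```python
from typing import Dict, List, Optional, Tuple

Token = Tuple[str, str]        # (word, POS tag)
Chunk = Tuple[str, int, int]   # (phrase tag, start index, end index exclusive)

# Hypothetical mapping from POS tags to phrase tags (an assumption for
# illustration only; this is not the tag set used in the paper).
POS_TO_PHRASE: Dict[str, str] = {
    "DET": "NP", "ADJ": "NP", "NOUN": "NP",
    "ADV": "VP", "VERB": "VP",
}

def chunk(tokens: List[Token]) -> List[Chunk]:
    """Group adjacent tokens whose POS tags map to the same phrase tag."""
    chunks: List[Chunk] = []
    start = 0
    current: Optional[str] = None
    for i, (_, pos) in enumerate(tokens):
        phrase = POS_TO_PHRASE.get(pos)  # None for tags outside any phrase
        if phrase != current:
            if current is not None:
                chunks.append((current, start, i))
            start, current = i, phrase
    if current is not None:
        chunks.append((current, start, len(tokens)))
    return chunks

def precision_recall(predicted: List[Chunk], gold: List[Chunk]) -> Tuple[float, float]:
    """A predicted chunk counts as correct only if tag and both boundaries match."""
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall

sentence = [("the", "DET"), ("quick", "ADJ"), ("fox", "NOUN"),
            ("quickly", "ADV"), ("jumped", "VERB"), (".", "PUNCT")]
predicted = chunk(sentence)
# Made-up reference chunking that disagrees on the verb phrase boundary.
gold = [("NP", 0, 3), ("VP", 4, 5)]
precision, recall = precision_recall(predicted, gold)
```

Under this exact-match scoring, a chunk with the right tag but a wrong boundary counts as both a false positive and a missed reference chunk, which is why precision and recall are reported separately.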
