Academic Journals Database
Disseminating quality controlled scientific knowledge

Multilingual Context Ontology Rule Enhanced Focused Web Crawler

Author(s): Mukesh Kumar | Renu Vig

Journal: Journal of Advances in Information Technology
ISSN 1798-2340

Volume: 1
Issue: 1
Start page: 21
Date: 2010

Keywords: Focused Crawler | Search Engines | Information Retrieval | Ontology | Adaptive Rules

The rapidly growing size of the World Wide Web and the increasing number of non-English resources on it pose unprecedented challenges for general-purpose crawlers and search engines. It is impossible for any search engine to index the complete Web. Focused crawlers cope with the growing size by selectively seeking out pages that are relevant to a predefined set of topics and avoiding irrelevant regions of the Web. Rather than collecting and indexing all accessible Web documents, a focused crawler analyses its crawl boundary to find the links likely to be most relevant to the crawl. This paper presents a focused crawler whose crawl strategy is based on scores calculated from context ontologies and adaptive classification rules, and which is capable of dealing with intermediate multilingual situations (situations in which the query language is the same as the target language, but the path to the target may pass through pages written in a mix of the query language and some other language). This enhances the quality of the pages retrieved, because the English meaning of a word sequence in the other language may itself be relevant to the query, or may point to pages that are highly relevant to it. Such pages should be included in the results, yet they are left untouched by existing crawlers.
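
The score-driven crawl strategy described in the abstract can be illustrated with a minimal best-first sketch. This is not the authors' algorithm: the term weights in ONTOLOGY_WEIGHTS, the in-memory TOY_WEB graph, and the scoring formula (relevance of the parent page plus relevance of the link's anchor text) are all hypothetical stand-ins chosen for illustration, and the sketch covers neither adaptive classification rules nor multilingual handling.

```python
import heapq

# Hypothetical ontology: concept terms mapped to illustrative weights.
ONTOLOGY_WEIGHTS = {"crawler": 1.0, "ontology": 0.8, "search": 0.5}

# Hypothetical in-memory "web": url -> (page text, [(anchor text, target url)]).
TOY_WEB = {
    "seed": ("search portal", [("cooking recipes", "a"), ("ontology crawler", "b")]),
    "a": ("cooking recipes", [("more recipes", "c")]),
    "b": ("ontology based crawler", [("focused crawler search", "d")]),
    "c": ("more recipes", []),
    "d": ("focused crawler search ontology", []),
}


def relevance(text):
    """Sum the weights of the ontology terms that occur in the text."""
    return sum(ONTOLOGY_WEIGHTS.get(word, 0.0) for word in text.lower().split())


def focused_crawl(seed, max_pages):
    """Best-first crawl: always expand the highest-scoring frontier link."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so scores are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = TOY_WEB[url]
        visited.append(url)
        page_score = relevance(text)
        for anchor, target in links:
            if target not in seen:
                seen.add(target)
                # Assumed scoring: parent page relevance plus anchor relevance.
                link_score = page_score + relevance(anchor)
                heapq.heappush(frontier, (-link_score, target))
    return visited


print(focused_crawl("seed", 3))
```

With a budget of three pages, the sketch follows the on-topic branch first and never fetches the off-topic pages, which is the behaviour the abstract attributes to focused crawlers in general.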