Academic Journals Database
Disseminating quality controlled scientific knowledge

An Improved Approach to perform Crawling and avoid Duplicate Web Pages

Author(s): Dhiraj Khurana

Journal: International Journal of Computer Science and Management Studies
ISSN 2231-5268

Volume: 12
Start page: 358
Date: 2012

Keywords: Crawler | Optimization | Duplicate

When a web search is performed, the results often include many duplicate web pages or websites; that is, a number of similar pages can be found on different web servers. We propose a web crawling approach to detect and avoid duplicate or near-duplicate web pages. In this work we present a keyword-prioritization-based approach to identify web pages across the web. Once such pages are identified, the web search can be optimized.
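
The abstract does not spell out how keyword prioritization is carried out, so the following is only a minimal sketch of one possible reading, not the author's method. It assumes each page is reduced to a signature made of its most frequent keywords and that two pages count as near duplicates when their signatures overlap strongly. The names top_keywords, jaccard and DuplicateFilter, the stop-word list, the signature size k and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real crawler would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "on", "for"}


def top_keywords(text, k=5):
    """Return the k most frequent non-stop-word terms of a page as a frozenset."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words)
    # Rank by descending frequency, then alphabetically, so ties break deterministically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return frozenset(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class DuplicateFilter:
    """Remembers keyword signatures of crawled pages and flags near duplicates."""

    def __init__(self, k=5, threshold=0.8):
        self.k = k                  # assumed signature size
        self.threshold = threshold  # assumed similarity cut-off
        self.seen = []              # (url, signature) pairs of pages kept so far

    def is_duplicate(self, url, text):
        signature = top_keywords(text, self.k)
        for _, other in self.seen:
            if jaccard(signature, other) >= self.threshold:
                return True
        # Only pages that are not duplicates are remembered.
        self.seen.append((url, signature))
        return False
```

In this sketch a crawler would call is_duplicate(url, text) on each fetched page and skip indexing whenever it returns True. The linear scan over stored signatures keeps the example short; it would not scale to a large crawl without some form of indexing.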