Academic Journals Database
Disseminating quality controlled scientific knowledge

Token-based method of blocking records for large data warehouse

Author(s): Jebamalar Tamilselvi J. | Saravanan V.

Journal: Advances in Information Mining
ISSN 0975-3265

Volume: 2;
Issue: 2;
Start page: 05;
Date: 2010;

Keywords: Data Warehouse | Record Linkage | Token | Blocking Records | Record Comparisons | Duplicate Data

Record linkage is a critical problem in duplicate data elimination: it is used to detect and eliminate duplicate data, which in turn increases data quality. Record linkage incurs a high computational cost because of the large number of record comparisons, so comparing records is inefficient in large data warehouses. Blocking methods group the records to minimize the number of record comparisons. This paper explains and compares the existing blocking methods and discusses the selection of a token-based blocking key for record comparisons.
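
To illustrate how blocking reduces the number of record comparisons, here is a minimal sketch of generic token-based blocking in Python. It is not the authors' method: the abstract does not describe how the paper selects its blocking key, so the tokenizer, the `min_len` threshold, the function names and the sample records are all assumptions made for illustration. Records that share at least one token fall into the same block, and only records within a block are compared.

```python
from collections import defaultdict
from itertools import combinations


def tokens(record, min_len=3):
    """Return lowercased alphanumeric tokens from all field values.

    Hypothetical tokenizer: tokens shorter than min_len are dropped.
    """
    out = set()
    for value in record.values():
        for word in str(value).lower().split():
            word = "".join(ch for ch in word if ch.isalnum())
            if len(word) >= min_len:
                out.add(word)
    return out


def candidate_pairs(records):
    """Group record ids by shared token, then pair ids within each block."""
    blocks = defaultdict(set)
    for rid, rec in records.items():
        for tok in tokens(rec):
            blocks[tok].add(rid)

    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block are ever compared.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Made-up sample data, for illustration only.
records = {
    1: {"name": "John Smith", "city": "Chennai"},
    2: {"name": "Jon Smith", "city": "Chennai"},
    3: {"name": "Mary Jones", "city": "Madurai"},
    4: {"name": "M. Jones", "city": "Madurai"},
}

pairs = candidate_pairs(records)
total = len(records) * (len(records) - 1) // 2  # all-pairs comparison count

print(f"{len(pairs)} candidate pairs instead of {total}: {sorted(pairs)}")
```

On this sample, blocking leaves two candidate pairs, (1, 2) and (3, 4), instead of the six pairs an exhaustive comparison would need. Very common tokens can produce oversized blocks, which is one reason the choice of blocking key matters.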