Academic Journals Database
Disseminating quality controlled scientific knowledge

Principles of construction of the multidimensional space of terms in the analysis of object-oriented collection of documents

Author(s): Khrunichev Robert Vyacheslavovich

Journal: Vestnik Astrahanskogo Gosudarstvennogo Tehničeskogo Universiteta. Seriâ: Upravlenie, Vyčislitelʹnaâ Tehnika i Informatika
ISSN 2072-9502

Volume: 1;
Issue: Astrakhan State Technical University, Russia;
Start page: 136;
Date: 2012;

Keywords: object-oriented collection of documents | frequency analysis of the text | data warehouse | space of terms

The paper considers the problem of information retrieval in an object-oriented collection of documents and the possibility of searching such a collection with a modified search model based on the vector model. The modification consists in applying an object-oriented glossary of terms at the text preprocessing stage, which reduces the number of terms passed to the subsequent frequency analysis. Zipf's law and Luhn's principle, applied during the frequency analysis, can further reduce the number of analyzed terms significantly. The paper shows how the multidimensional space of terms is constructed from the vectors that describe each document, and gives the principles by which these vectors are formed. It also lists the advantages of using the object-oriented glossary when constructing the space of terms: composite terms can be separated out, and as a result the document is positioned more accurately in the search results returned for a query.
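The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the glossary contents, the function names, the frequency bounds `low` and `high`, and the use of cosine similarity for comparing vectors are all assumptions made for illustration. The sketch keeps only glossary terms (matching composite terms as single units), drops terms outside a mid-frequency band in the spirit of Luhn's principle, and represents each document as a vector of term counts.

```python
import math
import re
from collections import Counter

# Hypothetical domain glossary; multi-word entries stand for composite terms.
GLOSSARY = {"data warehouse", "frequency analysis", "vector model", "term"}


def extract_terms(text, glossary):
    """Return glossary terms found in text, preferring the longest match."""
    words = re.findall(r"[a-z]+", text.lower())
    max_len = max(len(entry.split()) for entry in glossary)
    terms, i = [], 0
    while i < len(words):
        for n in range(max_len, 0, -1):
            if i + n > len(words):
                continue  # not enough words left for an n-word term
            candidate = " ".join(words[i:i + n])
            if candidate in glossary:
                terms.append(candidate)
                i += n
                break
        else:
            i += 1  # no glossary term starts at this word
    return terms


def select_terms(docs_terms, low, high):
    """Keep terms whose collection frequency lies in [low, high].

    The bounds are assumed cut-offs: very frequent and very rare
    terms are discarded, leaving the mid-frequency band.
    """
    total = Counter(t for terms in docs_terms for t in terms)
    return sorted(t for t, count in total.items() if low <= count <= high)


def to_vector(terms, axes):
    """Represent a document as term counts along the selected axes."""
    counts = Counter(terms)
    return [counts[axis] for axis in axes]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

In this sketch each selected term is one axis of the space, so a query processed by the same functions yields a vector that can be compared against every document vector with `cosine`. Because "data warehouse" is matched as one term, a document about data warehouses is not confused with one that merely mentions "data".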