
Chemical Informatics Functionality in R

Author(s): Rajarshi Guha

Journal: Journal of Statistical Software
ISSN 1548-7660

Volume: 18
Issue: 5
Date: 2007

Keywords: R | Java | CDK | PubChem | cheminformatics

ABSTRACT
The flexibility and scope of the R programming environment have made it a popular choice for statistical modeling and scientific prototyping in a number of fields. In chemistry, R provides several tools for problems related to the statistical modeling of chemical information. However, these tools share one limitation: they do not have direct access to the information available from chemical structures, such as that contained in molecular descriptors. We describe the rcdk package, which provides the R user with access to the CDK, a Java framework for cheminformatics. As a result, it is possible to read in a variety of molecular formats, calculate molecular descriptors, and evaluate fingerprints. In addition, we describe the rpubchem package, which allows access to the data in PubChem, a public repository of molecular structures and associated assay data for approximately 8 million compounds. Currently, the package allows access to structural information as well as some simple molecular properties from PubChem. In addition, it allows access to bioassay data from the PubChem FTP servers.
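
As a rough illustration of the workflow the abstract describes (reading structures, calculating descriptors, evaluating fingerprints), the following is a minimal sketch rather than code from the paper. It assumes that the rcdk and fingerprint packages and a working Java runtime are installed, and it uses function names from current rcdk releases, which may differ from those in the 2007 version. The two SMILES strings are arbitrary examples chosen here.

```r
# Sketch only: assumes rcdk (with rJava and a Java runtime) and the
# fingerprint package are installed. Function names follow current rcdk
# releases and are not taken from the original paper.
library(rcdk)

# Parse two example molecules from SMILES strings (ethanol and propan-1-ol).
mols <- parse.smiles(c("CCO", "CCCO"))

# Evaluate the CDK descriptors in the "constitutional" category.
# The result is a data frame with one row per molecule.
desc_names <- get.desc.names("constitutional")
descriptors <- eval.desc(mols, desc_names)

# Compute path-based ("standard") fingerprints for each molecule,
# then compare them with the Tanimoto coefficient.
fps <- lapply(mols, get.fingerprint, type = "standard")
similarity <- fingerprint::distance(fps[[1]], fps[[2]], method = "tanimoto")
```

Here `descriptors` can be passed directly to standard R modeling functions, which is the kind of direct access to structural information that the abstract refers to.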
