Flexible Model Selection Criterion for Multiple Regression

Author(s): Kunio Takezawa

Journal: Open Journal of Statistics
ISSN 2161-718X

Volume: 02
Issue: 04
Start page: 401
Date: 2012

Keywords: GCV | GCVf | Identification of Functional Relationship | Knowledge Discovery | Multiple Regression | Significance Level

ABSTRACT
Predictors of a multiple linear regression equation selected by GCV (Generalized Cross Validation) may include undesirable predictors that have no linear functional relationship with the target variable and are chosen only by accident. This is because GCV estimates prediction error but does not control the probability of selecting predictors that are irrelevant to the target variable. To take this possibility into account, a new statistic, “GCVf” (“f” stands for “flexible”), is proposed. GCVf is a natural generalization of GCV in which the strictness of accepting predictors is adjustable. For example, GCVf can be designed so that the probability of erroneously identifying a linear relationship is 5 percent when no predictor has a linear relationship with the target variable. Predictors selected for the multiple linear regression equation by this method are highly likely to have linear relationships with the target variable.
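
The problem the abstract describes can be illustrated with a small simulation. The sketch below is not the paper's method: it uses only the standard GCV score, GCV = (RSS / n) / (1 - p / n)^2, with p counted as the number of fitted coefficients including the intercept, and an exhaustive search over predictor subsets. The GCVf formula is not given in this abstract, so it is not reproduced here. The function names (gcv, select_by_gcv, false_selection_rate) and the default sample sizes are assumptions made for illustration.

```python
import itertools
import numpy as np


def gcv(X, y):
    """Standard GCV score for an ordinary least-squares fit with intercept.

    X has shape (n, k) and may have zero columns (intercept-only model).
    Assumes the design matrix has full column rank, so p equals its column count.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    p = design.shape[1]
    return (rss / n) / (1.0 - p / n) ** 2


def select_by_gcv(X, y):
    """Return (subset, score) minimizing GCV over all predictor subsets."""
    n, k = X.shape
    best_subset = ()
    best_score = gcv(np.empty((n, 0)), y)  # intercept-only baseline
    for size in range(1, k + 1):
        for subset in itertools.combinations(range(k), size):
            score = gcv(X[:, list(subset)], y)
            if score < best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


def false_selection_rate(n=50, k=5, trials=200, seed=0):
    """Fraction of trials in which GCV keeps at least one predictor
    even though y is generated independently of every column of X."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        X = rng.normal(size=(n, k))
        y = rng.normal(size=n)  # no relationship with X by construction
        subset, _ = select_by_gcv(X, y)
        hits += len(subset) > 0
    return hits / trials
```

Calling false_selection_rate() estimates how often plain GCV accepts at least one irrelevant predictor when none is related to the target. This is the probability that, according to the abstract, GCVf is designed to hold at a chosen level such as 5 percent.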