Academic Journals Database
Disseminating quality controlled scientific knowledge

Investigation of model fit and score scale comparability in international assessments

Author(s): Maria Elena Oliveri | Matthias von Davier

Journal: Psychological Test and Assessment Modeling
ISSN 2190-0493

Volume: 53
Issue: 3
Start page: 315
Date: 2011

Keywords: international large-scale assessments | item response theory | general diagnostic model | trends

ABSTRACT
This study used item response data from 30 countries that participated in the Programme for International Student Assessment (PISA). It compared the reduction in the proportion of item misfit achieved by alternative item response theory (IRT) models (multidimensional and multi-parameter: Rasch and two-parameter logistic, 2PL) and linking approaches (mean-mean IRT vs. Lagrangian multiplier and concurrent calibration) with that achieved by the approaches PISA currently uses for score scale calibration. The analyses were conducted with the general diagnostic model (GDM), a modeling framework that contains all IRT models used in the paper as special cases. The paper also investigated whether an alternative score scale (i.e., one that combines international parameters with a subset of country-specific parameters) improved fit relative to a scale based solely on international parameters for country score scale calibrations. Analyses were conducted using discrete mixture distribution IRT as well as multiple group (M-)IRT models. Compared with a scale that uses only international parameters, substantial improvement in fit was obtained using the concurrent calibration linking approach with the multi-group 2PL model allowing for partially unique country parameters.
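
For readers unfamiliar with the terms in the abstract, the following is a minimal illustrative sketch of the textbook 2PL item response function (with the Rasch model as the special case of unit discrimination) and of mean-mean linking between two calibrations of the same items. It is not the authors' code and does not reproduce the GDM or the PISA procedures; the function names and the toy parameter values are assumptions made only for illustration.

```python
import math
from statistics import mean


def p_correct_2pl(theta, a, b):
    """2PL probability of a correct response for ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_correct_rasch(theta, b):
    """Rasch model: the 2PL with discrimination fixed at 1."""
    return p_correct_2pl(theta, 1.0, b)


def mean_mean_linking(a_new, b_new, a_ref, b_ref):
    """Mean-mean linking constants (A, B) that map the 'new' scale onto
    the reference scale via theta_ref = A * theta_new + B, using the
    parameters of items common to both calibrations."""
    A = mean(a_new) / mean(a_ref)
    B = mean(b_ref) - A * mean(b_new)
    return A, B


def to_reference_scale(a_new, b_new, A, B):
    """Re-express 'new' item parameters on the reference scale."""
    return [a / A for a in a_new], [A * b + B for b in b_new]


# Hypothetical toy values: the 'new' calibration is the reference one
# expressed on a scale where theta_ref = 2 * theta_new + 0.5.
a_ref, b_ref = [1.0, 1.5, 0.8], [-0.5, 0.0, 1.2]
a_new = [2.0 * a for a in a_ref]
b_new = [(b - 0.5) / 2.0 for b in b_ref]

A, B = mean_mean_linking(a_new, b_new, a_ref, b_ref)
a_linked, b_linked = to_reference_scale(a_new, b_new, A, B)
```

In this toy setup the linking constants recover the scale transformation used to generate the "new" parameters, so the linked parameters coincide with the reference ones. With real data the common-item estimates differ by sampling error and possible country-specific misfit, which is the kind of discrepancy the abstract's country-specific parameters are meant to absorb.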