single-level statistical models to multilevel data typically produces
underestimated standard errors, which may result in misleading conclusions.
This study examined the impact of ignoring a multilevel data structure on the
estimation of item parameters and their standard errors in the Rasch,
two-parameter, and three-parameter logistic models of item response theory (IRT),
in order to quantify the degree of such underestimation. In addition, Lord’s
chi-square test, computed with the underestimated standard errors, was used to
test for differential item functioning (DIF) to show the impact of such
underestimation on practical applications
of IRT. The results of simulation studies showed that, in the most severe case
of multilevel data, the standard error estimate from the standard single-level
IRT models was about half of the minimal asymptotic standard error, and the
Type I error rate of Lord’s chi-square test was inflated to as much as .35. The
results of this study suggest that standard single-level IRT models may
lead to seriously misleading conclusions in the presence of multilevel data, and
therefore multilevel IRT models need to be considered as alternatives.
Keywords: Item Response Theory, Rasch, Multilevel Data, Monte Carlo Simulation