Goodness of Fit of Cumulative Logit Models for Ordinal Response Categories and Nominal Explanatory variables with Two-Factor Interaction

Veeranun Pongsapukdee, Sujin Sukgumphaphan


The power and goodness of fit of cumulative logit models for ordinal response data with a two-factor interaction term between nominal explanatory variables are investigated. The magnitudes of several goodness-of-fit statistics are calculated: the coefficients of determination (R2 analogs), the likelihood ratio statistic GM, AIC (Akaike Information Criterion; Akaike, 1973), and BIC (Bayesian Information Criterion; Schwarz, 1978). The simulations were conducted for multinomial logit models with K = 3 response categories and two random explanatory variables X1 and X2, whose joint distribution is assumed to be multinomial with probabilities π1, π2, π3, and π4 corresponding to (X1, X2) values of (0, 0), (0, 1), (1, 0), and (1, 1), respectively. Three sets of (π1, π2, π3, π4) are studied to represent different distributional shapes, with effects chosen to be possibly strong, namely β1 = log 2, β2 = log 3, and β12 = 0.0 - 0.45 (increment 0.3): (X1, X2) ~ multinomial(0.10, 0.35, 0.45, 0.10), (X1, X2) ~ multinomial(0.50, 0.30, 0.10, 0.10), and (X1, X2) ~ multinomial(0.25, 0.25, 0.25, 0.25). Four distributions of the three ordered response categories, corresponding to the (X1, X2) values, were then generated through the models with proportions (p1, p2, p3), namely Y ~ multinomial(p1, p2, p3) with (0.05, 0.20, 0.75), (0.25, 0.50, 0.25), (0.55, 0.20, 0.25), and (0.33, 0.33, 0.33), where p1, p2, and p3 are the proportions of Y = 1, 2, 3, respectively. It follows that the true model intercepts are α1 = log(p1/(p2 + p3)) and α2 = log((p1 + p2)/p3). Four sample sizes of 600, 800, 1,000, and 1,500 units were used. Each condition was replicated in 1,000 simulations using a macro program developed for and run with Minitab Release 11.
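The data-generating step of one simulated condition might be sketched as follows. This is an illustrative Python sketch, not the authors' Minitab macro. It assumes the common cumulative logit parameterization logit P(Y ≤ j | x) = αj + β1 x1 + β2 x2 + β12 x1 x2, under which the intercepts at (X1, X2) = (0, 0) are α1 = log(p1/(p2 + p3)) and α2 = log((p1 + p2)/p3); the sign convention and the function names `cumulative_intercepts` and `simulate` are assumptions made for illustration.

```python
import numpy as np

def cumulative_intercepts(p):
    """Return alpha_j = logit(P(Y <= j)) for j = 1..K-1 from baseline proportions p."""
    cum = np.cumsum(np.asarray(p, dtype=float))[:-1]
    return np.log(cum / (1.0 - cum))

def simulate(n, pi, p, beta1=np.log(2), beta2=np.log(3), beta12=0.0, seed=0):
    """Draw (x1, x2, y) from an assumed cumulative logit model with interaction."""
    rng = np.random.default_rng(seed)
    # The four (X1, X2) cells, in the order (0,0), (0,1), (1,0), (1,1).
    cells = np.array([(0, 0), (0, 1), (1, 0), (1, 1)])
    idx = rng.choice(4, size=n, p=pi)
    x1, x2 = cells[idx, 0], cells[idx, 1]
    alpha = cumulative_intercepts(p)
    eta = beta1 * x1 + beta2 * x2 + beta12 * x1 * x2
    # Assumed convention: P(Y <= j | x) = expit(alpha_j + eta).
    cum = 1.0 / (1.0 + np.exp(-(alpha[None, :] + eta[:, None])))
    # Inverse-CDF draw: y is 1 plus the number of cumulative thresholds u exceeds.
    u = rng.random(n)
    y = 1 + (u[:, None] > cum).sum(axis=1)
    return x1, x2, y

# One replicate of one condition: n = 600, asymmetric X, symmetric Y.
x1, x2, y = simulate(n=600, pi=(0.10, 0.35, 0.45, 0.10),
                     p=(0.25, 0.50, 0.25), beta12=0.45)
```

In a full study, each replicate would then be fitted with and without the interaction term and the fit statistics recorded; that step is omitted here.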

The results under the distribution conditions (X1, X2) ~ multinomial(0.10, 0.35, 0.45, 0.10) and Y ~ (0.55, 0.20, 0.25) show that all goodness-of-fit statistics perform better than under the conditions Y ~ (0.25, 0.50, 0.25) and Y ~ (0.33, 0.33, 0.33) in terms of the power of the tests and the means and standard deviations of the goodness-of-fit statistics. The results are similar when (X1, X2) ~ (0.50, 0.30, 0.10, 0.10). However, when the distributions are symmetric, such that (X1, X2) ~ (0.25, 0.25, 0.25, 0.25) and Y ~ (0.33, 0.33, 0.33), all statistics generally indicate a much improved model fit. In conclusion, large sample sizes are probably advisable in the analysis of ordinal categorical responses when the distributions of the variables are asymmetric, except when the distribution of the response categories is clearly increasing in order. In addition, model fit tends to improve when an interaction term is included and a correlation structure between the explanatory variables is evident.

Key Words: Cumulative logit models; Interaction terms; Goodness of fit; Multinomial ordinal responses.
