### On the Assessing of Fits for Simple and Multiple Logistic Models with Dichotomous Response Categories

*Veeranun Pongsapukdee*

#### Abstract

The logistic model allows for one or several explanatory variables, in which case it is called a simple or a multiple logistic model, respectively. In the usual case, the basic random variable Y is a dichotomous response. Such models are commonly used in many disciplines in health sciences research, medical sciences, and engineering settings, and are becoming increasingly popular in the behavioral and social sciences and in quality control. In this model, Y takes the value 1 with success probability P_{1} and the value 0 with failure probability (1-P_{1}). A problem arises because different statistics have been proposed for assessing the fit of these models, and it is unclear which of them is preferable.

In this article, 1,000 computer simulation experiments were generated for each condition of the probability of Y=1 (P_{1}), the calculated parameters, and the distributions of the X's, in order to evaluate the performance of several statistics used for assessing the goodness-of-fit of the models. Ten statistics were computed for each combination of base rate levels and model conditions (Table 1): the likelihood ratio statistic G_{M}; the indexes of predictive efficiency, which consist of λ_{p}, τ_{p}, and Φ_{p} (Menard, 1995); and the coefficients of determination or R^{2} analogs, which consist of R^{2}_{C} (the contingency coefficient R^{2}; Aldrich and Nelson, 1984), R^{2}_{L} (the log likelihood ratio R^{2}; McFadden, 1974; Menard, 1995), R^{2}_{M} (the geometric mean squared improvement per observation R^{2}; Maddala, 1983; Ryan, 1997), R^{2}_{N} (the adjusted geometric mean squared improvement R^{2}; Nagelkerke, 1991; Ryan, 1997), and R^{2}_{O} (the ordinary least squares R^{2}). Moreover, correlation matrices were computed to determine the magnitude (absolute values) of each statistic's independence from the base rate levels and of the association between each pair of statistics, together with the percentages of correct classification of the model (%Correct) and the type II error rates, corresponding to the percentages of power of the tests (%Accept).

The results of the simulation studies show that, for hypothesis testing of the goodness-of-fit of the models, both the %Correct (77-99%) and the %Accept (94-96%) are satisfactory and consistent. The average %Correct is around 77% when X is Exponential and approximately 99% when the X's are Bernoulli or multinomial distributed. Similarly, the average %Accept is approximately 94%. For X ~ Exponential, R^{2}_{C}, R^{2}_{M}, and R^{2}_{O} are preferable. For X ~ Bernoulli, R^{2}_{C}, R^{2}_{M}, and R^{2}_{O} are still preferable, but R^{2}_{O} outperforms the others. For (X1, X2) ~ Multinomial, the results are similar but slightly superior to those for X ~ Bernoulli. In the multinomial case, when the success probability P_{1} is high, the indexes of predictive efficiency λ_{p} and τ_{p} may be used as alternatives to R^{2}_{C}, R^{2}_{M}, and R^{2}_{O}. It is also found that the absolute values of the correlation coefficients among the R^{2} analogs increase as P_{1} increases, and that the correlations among the R^{2} analogs are higher than those between the R^{2} analogs and the indexes of predictive efficiency. Some recommendations are made for logistic models with a dichotomous response and an exponentially distributed explanatory variable. Those are the statistics R^{2}_{C}, R^{2}_{M}, R^{2}_{O}, λ_{p}
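To make the statistics named in the abstract concrete, the following is a minimal sketch of how most of them could be computed from observed responses and fitted probabilities. It is not the author's simulation code. The function name `fit_statistics`, the classification `cutoff` of 0.5, and the example data are assumptions introduced here for illustration. The formulas follow the commonly cited definitions in the referenced literature; in particular, R^{2}_{O} is taken here to be the squared correlation between Y and the fitted probabilities, and Φ_{p} is omitted.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log likelihood of responses y given probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def fit_statistics(y, p_hat, cutoff=0.5):
    """Goodness-of-fit measures for a fitted dichotomous logistic model.

    y      : observed 0/1 responses (must contain both classes)
    p_hat  : fitted probabilities of Y=1, each strictly between 0 and 1
    cutoff : assumed classification threshold (hypothetical default)
    """
    n = len(y)
    n1 = sum(y)
    n0 = n - n1
    base_rate = n1 / n

    ll_0 = log_likelihood(y, [base_rate] * n)  # intercept-only model
    ll_m = log_likelihood(y, p_hat)            # fitted model
    g_m = -2.0 * (ll_0 - ll_m)                 # likelihood ratio statistic G_M

    r2_l = g_m / (-2.0 * ll_0)                       # McFadden
    r2_m = 1.0 - math.exp(-g_m / n)                  # Maddala
    r2_n = r2_m / (1.0 - math.exp(2.0 * ll_0 / n))   # Nagelkerke
    r2_c = g_m / (g_m + n)                           # Aldrich and Nelson

    # R2_O: squared Pearson correlation between y and p_hat (assumed definition).
    mean_p = sum(p_hat) / n
    s_yp = sum((yi - base_rate) * (pi - mean_p) for yi, pi in zip(y, p_hat))
    s_yy = sum((yi - base_rate) ** 2 for yi in y)
    s_pp = sum((pi - mean_p) ** 2 for pi in p_hat)
    r2_o = s_yp ** 2 / (s_yy * s_pp)

    # Indexes of predictive efficiency: 1 - (model errors / expected errors).
    predicted = [1 if pi >= cutoff else 0 for pi in p_hat]
    errors_model = sum(1 for yi, ci in zip(y, predicted) if yi != ci)
    errors_lambda = n - max(n0, n1)   # always predict the modal category
    errors_tau = 2.0 * n0 * n1 / n    # random assignment preserving the margins
    lambda_p = 1.0 - errors_model / errors_lambda
    tau_p = 1.0 - errors_model / errors_tau

    return {
        "G_M": g_m, "R2_L": r2_l, "R2_M": r2_m, "R2_N": r2_n,
        "R2_C": r2_c, "R2_O": r2_o,
        "lambda_p": lambda_p, "tau_p": tau_p,
        "pct_correct": 100.0 * (n - errors_model) / n,
    }


# Illustrative data only; these are not values from the simulation study.
y_example = [0, 0, 0, 1, 0, 1, 1, 1]
p_example = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
for name, value in fit_statistics(y_example, p_example).items():
    print(f"{name}: {value:.4f}")
```

In a simulation along the lines described above, a function like this would be called once per generated data set, and the resulting values correlated with the base rate P_{1} across replications. The fitted probabilities would come from a maximum likelihood fit, which this sketch does not perform.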