-
J Ment Health Policy Econ · Mar 2002
Comparative StudyA comparison of methods to handle skew distributed cost variables in the analysis of the resource consumption in schizophrenia treatment.
- Reinhold Kilian, Herbert Matschinger, Walter Löeffler, Christiane Roick, and Matthias C Angermeyer.
- Universit t Leipzig, Klinik und Poliklinik f r Psychiatrie,Johannissalle 20, D-04317 Leipzig, Germany, kilr@medizin.uni-leipzig.de
- J Ment Health Policy Econ. 2002 Mar 1; 5 (1): 21-31.
BackgroundTransformation of the dependent cost variable is often used to solve the problems of heteroscedasticity and skewness in linear ordinary least square regression of health service cost data. However, transformation may cause difficulties in the interpretation of regression coefficients and the retransformation of predicted values.Aims Of The StudyThe study compares the advantages and disadvantages of different methods to estimate regression based cost functions using data on the annual costs of schizophrenia treatment.MethodsAnnual costs of psychiatric service use and clinical and socio-demographic characteristics of the patients were assessed for a sample of 254 patients with a diagnosis of schizophrenia (ICD-10 F 20.0) living in Leipzig. The clinical characteristics of the participants were assessed by means of the BPRS 4.0, the GAF, and the CAN for service needs. Quality of life was measured by WHOQOL-BREF. A linear OLS regression model with non-parametric standard errors, a log-transformed OLS model and a generalized linear model with a log-link and a gamma distribution were used to estimate service costs. For the estimation of robust non-parametric standard errors, the variance estimator by White and a bootstrap estimator based on 2000 replications were employed. Models were evaluated by the comparison of the R2 and the root mean squared error (RMSE). RMSE of the log-transformed OLS model was computed with three different methods of bias-correction. The 95% confidence intervals for the differences between the RMSE were computed by means of bootstrapping. A split-sample-cross-validation procedure was used to forecast the costs for the one half of the sample on the basis of a regression equation computed for the other half of the sample.ResultsAll three methods showed significant positive influences of psychiatric symptoms and met psychiatric service needs on service costs. Only the log- transformed OLS model showed a significant negative impact of age, and only the GLM shows a significant negative influences of employment status and partnership on costs. All three models provided a R2 of about.31. The Residuals of the linear OLS model revealed significant deviances from normality and homoscedasticity. The residuals of the log-transformed model are normally distributed but still heteroscedastic. The linear OLS model provided the lowest prediction error and the best forecast of the dependent cost variable. The log-transformed model provided the lowest RMSE if the heteroscedastic bias correction was used. The RMSE of the GLM with a log link and a gamma distribution was higher than those of the linear OLS model and the log-transformed OLS model. The difference between the RMSE of the linear OLS model and that of the log-transformed OLS model without bias correction was significant at the 95% level. As result of the cross-validation procedure, the linear OLS model provided the lowest RMSE followed by the log-transformed OLS model with a heteroscedastic bias correction. The GLM showed the weakest model fit again. None of the differences between the RMSE resulting form the cross- validation procedure were found to be significant.DiscussionThe comparison of the fit indices of the different regression models revealed that the linear OLS model provided a better fit than the log-transformed model and the GLM, but the differences between the models RMSE were not significant. Due to the small number of cases in the study the lack of significance does not sufficiently proof that the differences between the RSME for the different models are zero and the superiority of the linear OLS model can not be generalized. The lack of significant differences among the alternative estimators may reflect a lack of sample size adequate to detect important differences among the estimators employed. Further studies with larger case number are necessary to confirm the results.ImplicationsSpecification of an adequate regression models requires a careful examination of the characteristics of the data. Estimation of standard errors and confidence intervals by nonparametric methods which are robust against deviations from the normal distribution and the homoscedasticity of residuals are suitable alternatives to the transformation of the skew distributed dependent variable. Further studies with more adequate case numbers are needed to confirm the results.
Notes
Knowledge, pearl, summary or comment to share?You can also include formatting, links, images and footnotes in your notes
- Simple formatting can be added to notes, such as
*italics*
,_underline_
or**bold**
. - Superscript can be denoted by
<sup>text</sup>
and subscript<sub>text</sub>
. - Numbered or bulleted lists can be created using either numbered lines
1. 2. 3.
, hyphens-
or asterisks*
. - Links can be included with:
[my link to pubmed](http://pubmed.com)
- Images can be included with:
![alt text](https://bestmedicaljournal.com/study_graph.jpg "Image Title Text")
- For footnotes use
[^1](This is a footnote.)
inline. - Or use an inline reference
[^1]
to refer to a longer footnote elseweher in the document[^1]: This is a long footnote.
.