• Statistics in medicine · Jun 2000

    Modern psychometric methods for detection of differential item functioning: application to cognitive assessment measures.

    • J A Teresi, M Kleinman, and K Ocepek-Welikson.
    • Columbia University, Stroud Center, New York, NY 10032, USA. Teresimeas@aol.com
    • Stat Med. 2000 Jun 15;19(11-12):1651-83.

    Abstract

    Cognitive screening tests and items have been found to perform differently across groups that differ in terms of education, ethnicity and race. Despite the profound implications that such bias holds for studies in the epidemiology of dementia, little research has been conducted in this area. Using the methods of modern psychometric theory (in addition to those of classical test theory), we examined the performance of the Attention subscale of the Mattis Dementia Rating Scale. Several item response theory models, including the two- and three-parameter dichotomous response logistic model, as well as a polytomous response model were compared. (Log-likelihood ratio tests showed that the three-parameter model was not an improvement over the two-parameter model.) Data were collected as part of the ten-study National Institute on Aging Collaborative investigation of special dementia care in institutional settings. The subscale KR-20 estimate for this sample was 0.92. IRT model-based reliability estimates, provided at several points along the latent attribute, ranged from 0.65 to 0.97; the measure was least precise at the less disabled tail of the distribution. Most items performed in similar fashion across education groups; the item characteristic curves were almost identical, indicating little or no differential item functioning (DIF). However, four items were problematic. One item (digit span backwards) demonstrated a large error term in the confirmatory factor analysis; item-fit chi-square statistics developed using BIMAIN confirm this result for the IRT models. Further, the discrimination parameter for that item was low for all education subgroups. Generally, persons with the highest education had a greater probability of passing the item for most levels of theta. Model-based tests of DIF using MULTILOG identified three other items with significant, albeit small, DIF.
    One item, for example, showed non-uniform DIF in that at the impaired tail of the latent distribution, persons with higher education had a higher probability of correctly responding to the item than did lower education groups, but at less impaired levels, they had a lower probability of a correct response than did lower education groups. Another method of detection identified this item as having DIF (unsigned area statistic=3.05, p<0.01, and 2.96, p<0.01). On average, across the entire score range, the lower education group's probability of answering the item correctly was 0.11 higher than the higher education group's probability. A cross-validation with larger subgroups confirmed the overall result of little DIF for this measure. The methods used for detecting differential item functioning (which may, in turn, be indicative of bias) were applied to a neuropsychological subtest. These methods have been used previously to examine bias in screening measures across education and ethnic and racial subgroups. In addition to the important epidemiological applications of ensuring that screening measures and neuropsychological tests used in diagnoses are free of bias so that more culture-fair classifications will result, these methods are also useful for the examination of site differences in large multi-site clinical trials. It is recommended that these methods receive wider attention in the medical statistical literature.

    Copyright 2000 John Wiley & Sons, Ltd.
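
    The non-uniform DIF described in the abstract (item characteristic curves that cross, so the advantaged group changes along theta) can be illustrated with a small sketch. This is a generic illustration only, not the computation used in the paper: the item parameters below are hypothetical, the integration range of -4 to 4 is an assumption, and the raw area between curves computed here is not the standardized area statistic reported in the abstract.

```python
import math


def icc(theta, a, b, c=0.0):
    """Logistic item characteristic curve.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    With c == 0 this is the two-parameter model; otherwise three-parameter.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def area_between(params_g1, params_g2, lo=-4.0, hi=4.0, steps=2000, signed=False):
    """Trapezoidal area between two groups' ICCs over [lo, hi].

    signed=False integrates |P1 - P2| (unsigned area);
    signed=True integrates P1 - P2, so crossing curves can cancel out.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = lo + i * h
        diff = icc(theta, *params_g1) - icc(theta, *params_g2)
        if not signed:
            diff = abs(diff)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * diff * h
    return total


# Hypothetical (a, b) values: same difficulty, different discrimination,
# so the two curves cross at theta = 0 (non-uniform DIF).
group_low_edu = (0.8, 0.0)
group_high_edu = (1.6, 0.0)

unsigned_area = area_between(group_high_edu, group_low_edu)
signed_area = area_between(group_high_edu, group_low_edu, signed=True)
```

    With these made-up parameters the signed area is close to zero because the differences on either side of the crossing point cancel, while the unsigned area stays clearly positive. That contrast is why an unsigned measure is the natural choice when curves cross.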
