- Naomi Gafni, Avital Moshinsky, Orit Eisenberg, David Zeigler, and Amitai Ziv.
- National Institute for Testing and Evaluation (NITE), Jerusalem, Israel. naomi@nite.org.il
- Med Educ. 2012 Mar 1; 46 (3): 277-88.
Context: Assessment centres used in evaluating the non-cognitive attributes of medical school candidates must generate scores that reflect as accurate a measurement as possible of these attributes. Thus far, reliability coefficients for such centres have been based on limited samples and individual administrations, without reference to the error of variance that may result from retesting, or from the existence of multiple centres designed to measure the same attributes.

Methods: The National Institute for Testing and Evaluation in Israel has developed and administered two assessment centres: MOR is used by two medical schools and one dental school, and MIRKAM by another medical school. Each centre comprises eight or nine behavioural stations, a standardised biographical questionnaire, and a judgement and decision-making questionnaire. We calculated generalisability coefficients for each centre's eight or nine stations by year, composite reliability coefficients for the overall assessment centres, test-retest correlation coefficients for repeaters, and a correlation coefficient between the centres.

Results: Between 2006 and 2009, 2662 and 2023 examinees participated in MOR and MIRKAM, respectively; 1479 of these participated in both. The average generalisability coefficients for the stations were 0.69 for MOR and 0.67 for MIRKAM. The composite reliability coefficients for the full centres (behavioural stations plus questionnaires) were 0.79 and 0.76 for MOR and MIRKAM, respectively. The correlations for repeaters, corrected for restriction of range, were 0.59 and 0.43 for MOR and MIRKAM stations, respectively, and 0.72 and 0.65 for the full MOR and MIRKAM assessments, respectively. The correlation between scores on the MOR and MIRKAM stations was 0.56 (0.75 for the overall score).

Discussion: The minimal reliability desirable for high-stakes decision making (0.80) was obtained only for 14 or 15 stations with questionnaires. Nevertheless, the values obtained are considerably higher than reliability coefficients for single interviews. The questionnaires contribute significantly to the accuracy of the measurement. These reliability measures constitute an upper threshold for measures of validity.

© Blackwell Publishing Ltd 2012.
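As a rough illustration of why more stations are needed to reach 0.80, the sketch below applies the Spearman-Brown prophecy formula to the station-only coefficients reported above. This is an assumption made for illustration: the abstract does not say which projection the authors used, the formula treats added stations as parallel to the existing ones, and it ignores the contribution of the questionnaires, so its output should not be read as reproducing the paper's figure of 14 or 15 stations. The function names are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when a test is lengthened by length_factor.

    Assumes the added components are parallel to the existing ones.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability: float, target: float) -> float:
    """Lengthening factor required to move from reliability to target."""
    return target * (1 - reliability) / (reliability * (1 - target))


# Station-only generalisability coefficients taken from the abstract.
station_coefficients = {"MOR": 0.69, "MIRKAM": 0.67}

for centre, coefficient in station_coefficients.items():
    factor = length_factor_needed(coefficient, 0.80)
    print(f"{centre}: lengthen the station set by a factor of about {factor:.2f}")
```

Under these simplifying assumptions, both centres would need to lengthen their station sets by a factor of nearly two, which is broadly in line with the abstract's point that eight or nine stations alone fall short of 0.80.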