Joint Probability Of Agreement
Consequently, the joint probability of agreement will remain high even in the absence of any "true" agreement between the raters. A useful interrater reliability (IRR) coefficient should (a) be close to 0 when there is no "true" agreement and (b) increase as the "true" agreement rate improves. Most chance-adjusted agreement coefficients achieve the first objective; the second objective, however, is not met by many well-known chance-corrected measures.

Kang Y, Steis MR, Kolanowski AM, D, Prabhu VV (2016) Measuring agreement between healthcare survey instruments using mutual information. BMC Med Inform Decis Mak 16(1):99

A kappa analysis was conducted to assess the extent to which coders consistently assigned categorical depression ratings to the subjects in the study. The marginal distributions of the depression ratings did not indicate prevalence or bias problems, suggesting that Cohen's (1960) kappa was an appropriate index of IRR (Di Eugenio & Glass, 2004). Kappa was computed for each coder pair and the results were then averaged to provide a single index of IRR (Light, 1971). The resulting kappa indicated substantial agreement, κ = 0.68 (Landis & Koch, 1977), and was consistent with previously published IRR estimates obtained from coding similar constructs in earlier studies. The kappa analysis showed that coders had substantial agreement in depression ratings, although the variable of interest contained a modest amount of error variance due to the coders' subjective ratings. This measurement error slightly reduces statistical power for subsequent analyses, but the ratings were deemed adequate for use in the hypothesis tests of the present study.
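The pairwise-then-average procedure described above (Cohen's kappa per coder pair, averaged following Light, 1971) can be sketched in plain Python; the function names are illustrative, not from any particular library:

```python
from collections import Counter
from itertools import combinations

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's (1960) kappa for two raters assigning nominal categories."""
    n = len(ratings_a)
    # Observed agreement: proportion of subjects both raters labelled identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement expected from each rater's marginal category frequencies.
    marg_a, marg_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(marg_a[c] * marg_b.get(c, 0) for c in marg_a) / n ** 2
    # Kappa: observed agreement corrected for agreement expected by chance.
    return (p_o - p_e) / (1 - p_e)

def lights_kappa(ratings_by_coder):
    """Light's (1971) index: mean of Cohen's kappa over all coder pairs."""
    pairs = list(combinations(ratings_by_coder, 2))
    return sum(cohens_kappa(a, b) for a, b in pairs) / len(pairs)
```

This also illustrates objective (a) above: two raters who each independently label 90% of cases positive would agree about 82% of the time by chance alone (0.9 × 0.9 + 0.1 × 0.1 = 0.82), yet kappa for such data would be near 0 because the chance term p_e absorbs that agreement.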
Cohen (1968) proposed an alternative, weighted kappa that allows researchers to penalize disagreements differently depending on their magnitude. Weighted kappa is typically used for categorical data with an ordinal structure, for example a rating system that categorizes the presence of a particular attribute as high, medium, or low. In such a system, a subject rated high by one coder and low by the other should lower the IRR estimate more than a subject rated high by one coder and medium by the other. Norman and Streiner (2008) show that weighted kappa with quadratic weights for ordinal scales is identical to a single-measures, two-way mixed ICC, and that the two may be used interchangeably. This interchangeability is a particular advantage when three or more coders are used in a study, because ICCs can accommodate three or more coders, whereas weighted kappa can accommodate only two (Norman & Streiner, 2008).
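A minimal sketch of Cohen's (1968) weighted kappa with the quadratic weights discussed above, assuming the caller supplies the category labels in ordinal order (the function name and example categories are illustrative):

```python
def weighted_kappa(ratings_a, ratings_b, categories):
    """Cohen's (1968) weighted kappa with quadratic weights for ordinal data."""
    k, n = len(categories), len(ratings_a)
    idx = {c: i for i, c in enumerate(categories)}  # ordinal position of each label
    # Quadratic disagreement weights: 0 on the diagonal, growing with the squared
    # distance between categories (high vs. low costs 4x as much as high vs. medium).
    w = [[(i - j) ** 2 / (k - 1) ** 2 for j in range(k)] for i in range(k)]
    # Joint distribution of the two coders' ratings.
    p = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        p[idx[a]][idx[b]] += 1 / n
    marg_a = [sum(row) for row in p]
    marg_b = [sum(p[i][j] for i in range(k)) for j in range(k)]
    # Observed vs. chance-expected weighted disagreement.
    d_o = sum(w[i][j] * p[i][j] for i in range(k) for j in range(k))
    d_e = sum(w[i][j] * marg_a[i] * marg_b[j] for i in range(k) for j in range(k))
    return 1 - d_o / d_e
```

With the quadratic weights shown here, a high/low disagreement contributes four times as much weighted disagreement as a high/medium one, which is what produces the numerical equivalence with the ICC that Norman and Streiner describe.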