Discussion
Many authors have outlined problems with the use of
“traditional” measures of diagnostic performance6,16,25-27. These problems relate both to the biases that
plague studies evaluating diagnostic tests and to the metrics
themselves28. In this paper, we focus on the latter.
In particular, we focus on the measurement of diagnostic accuracy rather
than on the impact of diagnostic tests on health outcomes; the latter
depends on downstream effects of testing, such as the choice of
treatment, and is not considered here.
With regard to diagnostic accuracy, it has been argued6,8,29,30 that information theory, and
particularly MI, has theoretical and practical advantages over the
traditional measures for assessing the performance of a diagnostic test.
Notably, MI and RMI explicitly quantify the amount of
diagnostic uncertainty a test reduces. Such a direct measure can readily
be used to evaluate test performance not only by trained researchers but
also by any EBM-literate practitioner. Here, we summarized the
advantages of MI over traditional measures and demonstrated how MI can be
meta-analyzed using two cases from the literature.
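To make this concrete, the sketch below computes MI and RMI for a single 2x2 study from sensitivity, specificity, and prevalence. The function name and example values are ours, chosen for illustration; the formulas follow the derivations in the Appendix.

```python
import math

def mi_rmi(sens, spec, prev):
    """Mutual information I(D,T) in bits, and RMI, for a binary test."""
    # Joint probabilities of the four cells of the 2x2 table
    joints = {
        ("T+", "D+"): sens * prev,
        ("T-", "D+"): (1 - sens) * prev,
        ("T+", "D-"): (1 - spec) * (1 - prev),
        ("T-", "D-"): spec * (1 - prev),
    }
    # Marginal probabilities of the test result and of the disease
    p_t = {"T+": joints[("T+", "D+")] + joints[("T+", "D-")],
           "T-": joints[("T-", "D+")] + joints[("T-", "D-")]}
    p_d = {"D+": prev, "D-": 1 - prev}
    # Pretest uncertainty H(D), then I(D,T) and RMI = I(D,T)/H(D)
    h_d = -sum(p * math.log2(p) for p in p_d.values())
    mi = sum(p * math.log2(p / (p_t[t] * p_d[d]))
             for (t, d), p in joints.items() if p > 0)
    return mi, mi / h_d

# A test with 90% sensitivity and 80% specificity at 20% prevalence
# removes about 0.253 of the 0.722 bits of pretest uncertainty (~35%)
print(mi_rmi(0.9, 0.8, 0.2))
```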
The MI meta-analysis results presented in both cases show the
superiority of MI and RMI over other metrics in conveying arguably the
most useful clinical indicator of diagnostic test performance, namely
the amount of diagnostic uncertainty reduced by the test. Clearly, the
decision to administer a diagnostic test also involves other ethical and
personal considerations. However, for the EBM community and
evidence-synthesis practitioners, reduction of uncertainty is of utmost
importance. In terms of derivation, MI is easily computed and
meta-analyzed. In addition, although we have not emphasized it here, MI
has particular advantages over other metrics in the analysis of tests
with continuous measurements, such as PSA or blood pressure. Analyzing
such tests with traditional metrics requires dichotomizing the test
results, discarding useful information31. MI, on the other hand, can be
computed for both discrete and continuous variables32.
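For the continuous case, the k-nearest-neighbor estimator described by Ross32 is implemented in scikit-learn's mutual_info_classif. The sketch below is illustrative only: the simulated marker distribution, sample size, and parameter choices are our assumptions, not from the paper.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)

# Simulated continuous test at 20% prevalence: marker values are
# shifted upward in the diseased group, mimicking a PSA-like measurement
disease = rng.binomial(1, 0.2, size=5000)
marker = rng.normal(loc=1.5 * disease, scale=1.0)

# kNN-based MI estimate (Ross 2014): no dichotomization of the marker
# is needed; scikit-learn reports nats, so divide by ln(2) to get bits
mi_nats = mutual_info_classif(marker.reshape(-1, 1), disease,
                              n_neighbors=3, random_state=0)[0]
print(mi_nats / np.log(2), "bits")
```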
One limitation of MI is its dependence on prevalence: although this
dependence has theoretical advantages, it introduces heterogeneity into
meta-analyses. To address this problem, we propose meta-analyzing RMI
instead of MI, but at this time we know of no derivation of a standard
error for RMI. A further step for the field of research synthesis of
diagnostic test performance may therefore be the development of robust
meta-analytic techniques for RMI.
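For MI itself, where a variance estimate is available (see the Appendix), per-study estimates can be pooled with standard inverse-variance weighting. A minimal fixed-effect sketch follows; the study values are made up for illustration.

```python
import math

# Hypothetical per-study MI estimates (bits) and variances, e.g. from
# the delta-method formula for Var(I(D,T)) given in the Appendix
studies = [(0.31, 0.0016), (0.25, 0.0009), (0.28, 0.0025)]

# Fixed-effect inverse-variance pooling: weight each study by 1/variance
weights = [1 / var for _, var in studies]
pooled = sum(w * mi for (mi, _), w in zip(studies, weights)) / sum(weights)
se = math.sqrt(1 / sum(weights))

print(f"pooled MI = {pooled:.3f} bits, "
      f"95% CI ({pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f})")
```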
In summary, we believe that MI is the most meaningful measure for both
decision makers and EBM researchers, as it provides an intuitive,
easy-to-understand metric that quantifies the information content of
diagnostic tests. We therefore argue that the field of evidence-based
diagnostics should adopt MI as its primary metric.
References
1. Sox HC, Blatt MA, Higgins MC, Marton KI. Medical Decision Making.
Boston: Butterworths; 1988.
2. Leeflang MM, Deeks JJ, Takwoingi Y, Macaskill P. Cochrane diagnostic
test accuracy reviews. Syst Rev 2013;2:82.
3. Leeflang MM, Deeks JJ, Gatsonis C, Bossuyt PM, Cochrane Diagnostic
Test Accuracy Working Group. Systematic reviews of diagnostic test
accuracy. Ann Intern Med 2008;149:889-97.
4. Shannon CE, Weaver W. The mathematical theory of communication.
Urbana: The University of Illinois Press; 1962.
5. Shannon CE. A mathematical theory of communication. Bell Syst Tech J
1948;27:379-423, 623-56.
6. Benish WA. Intuitive and axiomatic arguments for quantifying
diagnostic test performance in units of information. Methods Inf Med
2009;48:552-7.
7. Somoza E, Mossman D. Comparing and optimizing diagnostic tests: an
information-theoretical approach. Med Decis Making 1992;12:179-88.
8. Benish WA. Mutual information as an index of diagnostic test
performance. Methods Inf Med 2003;42:260-4.
9. Mossman D, Somoza E. Diagnostic tests and information theory. J
Neuropsychiatry Clin Neurosci 1992;4:95-8.
10. Somoza E, Soutullo-Esperon L, Mossman D. Evaluation and optimization
of diagnostic tests using receiver operating characteristic analysis and
information theory. Int J Biomed Comput 1989;24:153-89.
11. Benish WA. The use of information graphs to evaluate and compare
diagnostic tests. Methods Inf Med 2002;41:114-8.
12. Nelson GW, O’Brien SJ. Using mutual information to measure the
impact of multiple genetic factors on AIDS. J Acquir Immune Defic Syndr
2006;42:347-54.
13. Meyer CR, Boes JL, Kim B, et al. Demonstration of accuracy and
clinical versatility of mutual information for automatic multimodality
image fusion using affine and thin-plate spline warped geometric
deformations. Med Image Anal 1997;1:195-206.
14. Diamond GA, Hirsch M, Forrester JS, et al. Application of
information theory to clinical diagnostic testing. The
electrocardiographic stress test. Circulation 1981;63:915-21.
15. Cover TM, Thomas JA. Elements of information theory. Hoboken, NJ:
John Wiley & Sons; 2012.
16. Hughes G. Application of Information Theory to Epidemiology. St.
Paul, MN: American Phytopathological Society; 2012.
17. Hughes G, McRoberts N. The structure of diagnostic information.
Australas Plant Pathol 2014:1-20.
18. Djulbegovic B, Hozo I, Abdomerovic I, Hozo S. Diagnostic entropy as
a function of therapeutic benefit/risk ratio. Med Hypotheses
1995;45:503-9.
19. Djulbegovic B, Glasziou P, Chalmers I. The importance of randomised
vs non-randomised trials. Lancet 2019;394:634-5.
20. Deeks JJ, Altman DG, Bradburn MJ. Statistical methods for examining
heterogeneity and combining results from several studies in
meta-analysis. In: Egger M, Davey Smith G, Altman DG, eds. Systematic
Reviews in Health Care: Meta-Analysis in Context. 2nd ed. London: BMJ
Books; 2001:285-312.
21. Roulston MS. Estimating the errors on measured entropy and mutual
information. Physica D: Nonlinear Phenomena 1999;125:285-94.
22. Deeks JJ. Systematic reviews in health care: Systematic reviews of
evaluations of diagnostic and screening tests. BMJ 2001;323:157-62.
23. Smith-Bindman R, Kerlikowske K, Feldstein VA, et al. Endovaginal
ultrasound to exclude endometrial cancer and other endometrial
abnormalities. JAMA 1998;280:1510-7.
24. Menke J, Larsen J. Meta-analysis: Accuracy of contrast-enhanced
magnetic resonance angiography for assessing steno-occlusions in
peripheral arterial disease. Ann Intern Med 2010;153:325-34.
25. Knottnerus JA. The evidence base of clinical diagnosis. London: BMJ
Books; 2002.
26. Hilden J. The area under the ROC curve and its competitors. Med
Decis Making 1991;11:95-101.
27. Lee WC, Hsiao CK. Alternative summary indices for the receiver
operating characteristic curve. Epidemiology 1996;7:605-11.
28. Bossuyt PM, Reitsma JB, Bruns DE, et al. The STARD Statement for
reporting of studies of diagnostic accuracy: explanation and
elaboration. Clin Chem 2003;49:7-18.
29. Benish WA. Relative entropy as a measure of diagnostic information.
Med Decis Making 1999;19:202-6.
30. Wu Y, Alagoz O, Ayvaci MU, et al. A comprehensive methodology for
determining the most informative mammographic features. J Digit Imaging
2013;26:941-7.
31. Shapiro DE. The interpretation of diagnostic tests. Stat Methods Med
Res 1999;8:113-34.
32. Ross BC. Mutual information between discrete and continuous data
sets. PLoS One 2014;9:e87357.
Appendix - Unabridged derivations of MI, RMI and Var(MI)
Entropy is expressed as:
\begin{equation}
H\left(D\right)=-\left(P(D+)\log_{2}{P(D+)}+\left(1-P(D+)\right)\log_{2}\left(1-P(D+)\right)\right)\nonumber
\end{equation}
The uncertainty due to the diagnostic test is:
\begin{equation}
H\left(T\right)=-\left(P(T+)\log_{2}{P(T+)}+\left(1-P(T+)\right)\log_{2}\left(1-P(T+)\right)\right)\nonumber
\end{equation}
The mutual information is computed as:
\begin{equation}
I\left(D,T\right)=H\left(D\right)+H\left(T\right)-H\left(D,T\right)=H\left(D\right)-H\left(D\middle|T\right)\nonumber
\end{equation}
The relative mutual information is computed as:
\begin{equation}
I_{R}\left(D,T\right)=\frac{I\left(D,T\right)}{H\left(D\right)}=1-\frac{H(D|T)}{H(D)}\nonumber
\end{equation}
In terms of sensitivity and specificity, mutual information is derived
as:
\begin{equation}
I\left(D,T\right)=P\left(T+\middle|D+\right)P(D+)\log_{2}\frac{P\left(T+\middle|D+\right)}{P(T+)}+\left(1-P\left(T+\middle|D+\right)\right)P(D+)\log_{2}\frac{1-P\left(T+\middle|D+\right)}{P(T-)}+\left(1-P\left(T-\middle|D-\right)\right)\left(1-P(D+)\right)\log_{2}\frac{1-P\left(T-\middle|D-\right)}{P(T+)}+P\left(T-\middle|D-\right)\left(1-P(D+)\right)\log_{2}\frac{P\left(T-\middle|D-\right)}{P(T-)}\nonumber
\end{equation}
where the marginal test probabilities are $P(T+)=P\left(T+\middle|D+\right)P(D+)+\left(1-P\left(T-\middle|D-\right)\right)\left(1-P(D+)\right)$ and $P(T-)=\left(1-P\left(T+\middle|D+\right)\right)P(D+)+P\left(T-\middle|D-\right)\left(1-P(D+)\right)$.
The variance of mutual information is computed as:
\begin{equation}
\text{Var}\left(H\left(D\right)\right)=\left[\left(\log_{2}{P(D+)}+H\left(D\right)\right)^{2}+\left(\log_{2}\left(1-P(D+)\right)+H\left(D\right)\right)^{2}\right]\frac{P(D+)\left(1-P(D+)\right)}{N}\nonumber
\end{equation}
and:
\begin{equation}
\text{Var}\left(I\left(D,T\right)\right)=\sum_{t\in\{T+,\,T-\}}\ \sum_{d\in\{D+,\,D-\}}\left(\log_{2}{P(d)}+\log_{2}{P(t)}-\log_{2}{P(t,d)}+I\left(D,T\right)\right)^{2}\frac{P(t,d)\left(1-P(t,d)\right)}{N}\nonumber
\end{equation}
where $P(t,d)$ denotes the joint probability of each cell of the $2\times2$ table (e.g., $P(T+,D+)=P\left(T+\middle|D+\right)P(D+)$) and $P(t)$, $P(d)$ the corresponding marginal probabilities, as above.
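As a numerical check of the formulas above (the values are illustrative assumptions): for a test with sensitivity $P(T+|D+)=0.9$, specificity $P(T-|D-)=0.8$, and prevalence $P(D+)=0.2$, the joint probabilities are $P(T+,D+)=0.18$, $P(T-,D+)=0.02$, $P(T+,D-)=0.16$, and $P(T-,D-)=0.64$, with marginals $P(T+)=0.34$ and $P(T-)=0.66$. Then:
\begin{equation}
H(D)=-\left(0.2\log_{2}0.2+0.8\log_{2}0.8\right)\approx0.722\ \text{bits},\nonumber
\end{equation}
\begin{equation}
I\left(D,T\right)=0.18\log_{2}\frac{0.18}{0.34\times0.2}+0.02\log_{2}\frac{0.02}{0.66\times0.2}+0.16\log_{2}\frac{0.16}{0.34\times0.8}+0.64\log_{2}\frac{0.64}{0.66\times0.8}\approx0.253\ \text{bits},\nonumber
\end{equation}
so that $I_{R}(D,T)\approx0.253/0.722\approx0.35$; that is, the test removes roughly 35% of the pretest diagnostic uncertainty.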