Fred Reyers, MMedVet (KLD), Reg. Specialist Clinical Pathologist (S Africa)
Introduction
For the practicing clinician, most daily activities revolve around, or are based upon, deciding what is wrong with the patient (diagnosis) and deciding what to do about it (clinical decisions). Although we intuitively know that these diagnostic and clinical decisions are unlikely to be 100% correct and/or logical, very few of us have ever stopped to ponder the actual degree of inaccuracy. Furthermore, because we appear to be unaware that there is a problem, let alone the extent of the problem, we do not actively seek ways of improving our skills, if such methods exist.
In this paper we will define this thing called a diagnosis. We will then examine the literature to discover how good we (biomedical scientists) appear to be at making diagnoses. We will examine what is known about the development of diagnostic ability and how it changes with time. We will look at different diagnostic methods and examine whether these change with time and increasing expertise. We will then address the question of whether we can improve our diagnostic ability or speed up the changes in diagnostic methods. Finally, we will have a quick look at another major, and growing, field of study: decision analysis.
What Is a Diagnosis?
The diagnostic process is one of identifying disease in the patient by its characteristic signs, symptoms and test findings (laboratory, imaging, electro-diagnostics, etc.).
A diagnosis identifies both the disease present (using terminology accepted in the discipline) as well as the agent/process responsible, where possible.
A diagnosis specifies the above in a clear, succinct form.
Possibly, a diagnosis is best defined by statements that are not diagnoses but are often put forward as diagnoses.
The following are NOT diagnoses:
- A presenting complaint, such as weight loss
- A clinical sign/finding, such as anisocoria
- A pathophysiologic process, such as haemolytic anaemia
- A patho-anatomical description, such as hepatic centrilobular necrosis
- A "syndrome" (OP terminology), such as small intestinal diarrhoea
So, a diagnosis is MORE than any of these.
How good are we at making diagnoses?
Pose yourself the question: "What proportion of the diagnoses that I record on patient cards would stand up to the most intense diagnostic workup including autopsy and/or histopathology?"
Select from 100%, 90%, 80%, 70%, 60%, 50%, 40%.
What does the literature say?
Medical literature
As can be seen from the tabulation below, the error rate is surprisingly high and surprisingly constant over the years. This was also reported by Goldberg et al (2002).
| Author | Year | Cases (n) | % Discordance | Comment |
| Cabot | 1912 | 3000 | 40 | |
| Fowler | 1977 | 1000 | 36 | |
| Carvalho | 1991 | 910 | 36 | |
| Kajiwara | 1993 | 997 | 34 to 40 | |
| Burton | 1998 | 1105 | 44 | 6% tumour deaths attributable to error |
| Attems | 2004 | 1594 | 47.5 | |
| Grade | 2004 | 4828 | | Agreement improved with years |
One may also say, with justification: "I accept that there are some diagnoses that I'm uncertain of, but there are many that I would stake my life on." That may be an expensive wager, but there is certainly some truth in it, as shown in the table below.
| Cases (60 studies) | Certainty assigned by clinician | % Discordance |
| 9 248 | Fairly certain | 16 |
| 3 694 | Probable | 33 |
| 1 282 | Uncertain | 50 |
| 14 617 | TOTAL | 25 |
Shojania K, Burton E, McDonald K, et al. The Autopsy as an Outcome and Performance Measure. Evidence Report/Technology Assessment No. 58 (University of California at San Francisco-Stanford Evidence-based Practice Center). Agency for Healthcare Research and Quality; October 2002.
Then you may say, based on experience: "In some organ systems it is just much more difficult to arrive at a diagnosis than in others." Again, you will be right, but studies of this nature still come up with some frightening surprises.
The two studies that the author analysed reveal that there is a marked disease/organ-system variation but also, that there is a marked institutional variation. One of the interesting, but possibly puzzling findings in the second study, is that there was a further 32% discordance between the clinical record and the death certificate.
Examining clinical record against necropsy reports
| Disease | Clinical diagnosis | Confirmed at autopsy | % Discordance | Diagnoses at autopsy, missed clinically | Missed as % of clinically correct diag. |
| Pulm TB | 15 | 7 | 53 | 7 | 100 |
| Pulm emboli | 79 | 44 | 44 | 99 | 282 |
| Myocard infarct | 256 | 198 | 23 | 51 | 26 |
| Cirrhosis | 27 | 22 | 19 | 13 | 59 |
| Acute GIT/Abd | 13 | 9 | 31 | 1 | 11 |
| TOTAL | 390 | 280 | 28 | 171 | 61 |
Cameron HM, McGoogan E, Clarke J, Wilson BA. Trends in hospital necropsy rates: Scotland 1961-74. BMJ 1977;1:1577-80.
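As a side note on the arithmetic, the derived columns in these tables appear to follow directly from the raw counts: % discordance as (clinical - confirmed)/clinical x 100, and "missed as %" as missed/confirmed x 100. A minimal sketch (my own illustrative function names, not from any of the cited studies):

```python
def discordance_pct(clinical, confirmed):
    """% of clinically diagnosed cases not confirmed at autopsy."""
    return (clinical - confirmed) / clinical * 100

def missed_pct(missed, confirmed):
    """Diagnoses missed clinically, as a % of the clinically correct ones."""
    return missed / confirmed * 100

# Myocardial infarct row from the table above:
print(round(discordance_pct(256, 198)))  # 23, matching the table
print(round(missed_pct(51, 198)))        # 26, matching the table
```

Applied across the rows, these formulas reproduce most of the tabulated percentages to rounding.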
Examining death certificates against necropsy reports
| Disease | Clinical diagnosis | Confirmed at autopsy | % Discordance | Diagnoses at autopsy, missed clinically | Missed as % of clinically correct diag. |
| Respiratory | 177 | 91 | 49 | 89 | 98 |
| Cardio-Vasc | 71 | 46 | 36 | 119 | 257 |
| Gastro-Intest | 49 | 26 | 47 | 33 | 127 |
| Uro-Genital | 34 | 9 | 74 | 9 | 100 |
| Neurological | 70 | 43 | 39 | 5 | 12 |
| Others | 47 | 25 | 47 | 13 | 52 |
| TOTAL | 448 | 240 | 46 | 268 | 112 |
Sington JD, Cotterell BJ. Analysis of death certificates in 440 hospital deaths: a comparison with necropsy findings. J Clin Pathol 2002;55:499-502 (Aylesbury, UK).
A few interesting side-issues arise from a look at these studies. One of these is that there has been a marked decline, over the years, in the proportion of cases on which necropsies are performed in medical practice. One reason advanced was that with the ever-present possibility of litigation, it is better not to expose your clinical diagnosis to the test of necropsy.
And so we conclude, with a fair degree of confidence and possibly a little "smugness" that our medical colleagues are not doing too well in the diagnosis stakes. From these data, it seems a fairly sound practice to ask for a second and possibly a third opinion before we accept the diagnosis proclaimed by our doctors. We might even develop a sense of superiority.
So, the question arises, how well does veterinary diagnosis-making compare with medical diagnosis-making?
The first startling finding is that the veterinary profession has hardly ever asked the question and has certainly hardly ever tried to answer it. Whereas the medical literature produces a plethora of studies (the recent AHRQ {Agency for Healthcare Research and Quality} survey {2002} found 225 English-language and 34 foreign-language studies published), there is only ONE article in the electronic veterinary database (i.e., published since 1968). Fortunately, this single article, published in 2004, was produced by one of the Ivy-League universities (from a veterinary perspective), namely, the Davis campus of the University of California, and probably represents current "best practice". After all, if some "third-rate" institution had produced it, we could have claimed exemption.
| Section of Vet Hospital | Cases (n) | % Discordance |
| Emergency Critical Care | 139 | 45 |
| Internal Medicine | 272 | 44 |
| Neurology | 100 | 35 |
| Surgery | 33 | 27 |
| Cardiology | 19 | 21 |
| Oncology | 41 | 15 |
| TOTAL | 604 | 39 |
Kent MS, et al. Concurrence between clinical and pathological diagnoses in a veterinary medical teaching hospital: 623 cases (1989 and 1999). JAVMA 2004;224(3):403-406 (UC Davis).
Is There a Problem with the Accuracy of Clinical Diagnoses?
From the above data the obvious answer is patently yes. The problem is not just limited to the medical fraternity but most certainly exists in veterinary medicine too. The problem differs somewhat between sub-disciplines but appears to apply, to a greater or lesser extent, across the board.
So, the questions that we should address next are:
"Do we just live with it?" Or
"Is it possible to improve/make progress?"
At this juncture it is probably appropriate to quote George Bernard Shaw "The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man." (Man & Superman III).
A question that can be asked is: "Do these data apply to everyone?" If not, then is it possible that there are "expert" experts? And, is it possible to improve your expertise?
To address these questions we should have a brief look at a field of research shared by biomedical science, psychology and education: the study of how diagnoses are made or, alternatively, biomedical problem solving.
The Diagnostic Process
Components of the diagnostic process
There are probably many classifications but it would appear that there is consensus that there are three principal components.
1. Data gathering involves:
- Non-clinical/physical information: the taking of a history and the collection of information pertinent to the case that may be environmental and/or circumstantial.
- The accumulation of information about the patient's clinical status (signs and symptoms).
2. Data interpretation appears to be clearly divided into two activities, namely:
- Understanding the data, which implies epidemiological, clinical and medical knowledge, and
- Understanding disease, which implies knowledge of pathophysiology as well as of disease-causing organisms.
3. Establishing the diagnosis is then a process of using the gathered and interpreted data to:
- Propose a provisional diagnostic hypothesis (and there appear to be a few ways of skinning that cat [pardon the pun]), and
- Rule out alternative hypotheses.
Sources of error
Using this diagnostic process model, investigators have accumulated erroneous diagnoses and analysed the component/sub-component of the process where the error appeared to have its origin, with the following results.
1. The gathering of non-clinical (historical etc.) data is considered by most diagnosticians to be the single most important component: 60 to 80% of the success of the diagnostic process is attributed to this phase. It is therefore really surprising that neither the time spent on this process (within reasonable limits, and allowing that it happens at all) nor the volume of data gathered appears to bear any significant relationship to the incidence of errors. However, the omission of diagnostically relevant data pertinent to that specific case (and therein lies the trick) does promote diagnostic failure.
2. Lack of understanding of the data (both medical and pathophysiological) leads to diagnostic failure but, apparently more importantly, ignoring diagnostically relevant data (i.e., having gathered it but then not using it) leads to "premature closure" and consequent diagnostic failure.
3. In the generation of diagnostic hypotheses, generating hypotheses inappropriate to the available data (logic failure) leads to diagnostic errors, and the early elimination (discarding) of reasonable diagnostic hypotheses (either as principal or alternate), another form of premature closure, leads to diagnostic failure. Another term for such an error is "idiolepsis": making up your mind what the diagnosis is and thereafter ignoring evidence to the contrary.
The concept of premature closure implies making a diagnosis (establishing a diagnostic hypothesis) when reasonable alternate hypotheses are potentially available but have not been allowed to see the light of day, either because some critical data is missing OR because the clinician is ignoring (possibly not recognizing) available critical data. Of all the diagnostic errors that the authors detected, premature closure, as a concept, was the single most common, explaining fully 91% of all the incorrect diagnoses.
Does our diagnostic ability change with time?
Various investigators have pronounced on the different diagnostic strategies or emphasis employed by
1. Novice diagnosticians (undergraduate students),
2. Fully-schooled diagnosticians (interns or post-graduate students), and
3. Experienced diagnosticians (variously including qualified physicians or specialists).
There appears to be a general consensus (although there are dissenting voices) that the experienced diagnosticians:
Generate fewer diagnostic hypotheses.
Are more likely to include the correct diagnostic hypothesis in their list (so can fall back to a list of alternates that is likely to be successful).
Select the most appropriate diagnostic procedures (laboratory, imaging, electrodiagnostics etc) to rule-in/out competing hypotheses.
It appears that these "indicators of expertise" are related to the degree of diagnostic success as well as the speed at which the correct diagnosis is reached.
The extent to which these "indicators of expertise" are displayed, appears to improve with the student's progression through the medical training programme.
Is there something special about medical diagnostics, or is it all simply problem solving?
Again, the literature is not unanimous (and it is voluminous).
The evidence on one point, however, is fairly clear. The expertise involved in making a diagnosis is NOT the possession of a generic problem-solving skill. Training in problem solving per se does not translate into competence as a medical diagnostician.
Furthermore, it appears that problem solving skills are not exportable from one discipline to another.
Even within medicine, experts in one discipline may struggle to outperform novices (and certainly interns) in an unfamiliar medical discipline.
Are there different diagnostic methods (within medicine)?
Most authors recognise two basic groups of methods:
1. Backward reasoning (deductive), also referred to as the "hypothetico-deductive" method. It appears that novices tend to use this method principally. The process is assumed to work as follows:
Look at the mass of data.
Think of as many differential diagnoses as the data allow, and then go back to the data time and again to try to accommodate everything.
This tends to drown the novice in data and competing hypotheses.
2. Forward reasoning (inductive): is considered the "province" of the experts who integrate the data to form a few rational hypotheses and then employ a little backward reasoning only to consider inconsistencies and alternatives.
3. A third method has been recognised by some authors who suggest that intermediates (post-grads or interns) use a mixture of both backward and forward reasoning with a strong emphasis on disease process/pathophysiology algorithms or heuristics to achieve and explain the diagnosis. It is believed that experts revert to this mixed mode when faced by difficult cases or when working outside their field of expertise.
As I read the literature, these concepts are beginning to wear thin and the evidence is fairly clear that these models are an outmoded oversimplification. Apparently, the forward vs backward issue is still central to computer-based/artificial-intelligence approaches.
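For what it is worth in that AI context, the two directions of reasoning can be sketched over a toy rule base. Everything below (the rules, the finding names) is invented purely for illustration and implies no real diagnostic system:

```python
# Toy rule base for illustration only: each rule is (premises, conclusion).
RULES = [
    ({"microcytosis", "stunted_growth"}, "suspect_portosystemic_shunt"),
    ({"pu_pd", "non_azotemic"}, "suspect_hepatic_dysfunction"),
    ({"suspect_portosystemic_shunt", "suspect_hepatic_dysfunction"},
     "work_up_liver_function"),
]

def forward_chain(findings):
    """Expert-style forward reasoning: integrate data until nothing new follows."""
    known = set(findings)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, findings):
    """Novice-style backward reasoning: start from a hypothesis, work back to the data."""
    if goal in findings:
        return True
    return any(
        all(backward_chain(p, findings) for p in premises)
        for premises, conclusion in RULES
        if conclusion == goal
    )

data = {"microcytosis", "stunted_growth", "pu_pd", "non_azotemic"}
print(forward_chain(data))                             # derives all three conclusions
print(backward_chain("work_up_liver_function", data))  # True
```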
For the audience to whom this paper is addressed, the issue is probably not how to get from novice to intermediate, where these concepts remain central to the teaching paradigm, but how to get from intermediate to expert and possibly from expert to super-expert. That being the case, we need to examine what the expert appears to do, because if we can discover that, we might be able to suggest ways of "thinking" or "training" to climb the ladder of expertise.
What is it that allows experts to succeed at clinical reasoning?
1. Experts are able to recognise non-diagnostic (non-pertinent) data (not always simply "normal" data) almost instantaneously and appear to be able to ignore these data almost subconsciously. When quizzed or interviewed about a case that they have solved, they are often unable to recall the non-pertinent data. Novices, on the other hand, will usually be able to outperform the expert on the recall of such data. This means that the expert has less data (but pertinent data) to "juggle" around.
2. Experts devote time and attention on inconsistencies (recognising them as such) and this appears to allow them to identify good rational alternative hypotheses. By contrast, novices try to shoehorn inconsistencies into their favoured hypothesis.
3. Experts are able to integrate historical, environmental and circumstantial data (some of which can be described as "enabling" data) with the clinical picture and thereby reduce the number of possible hypotheses.
4. Experts are presentation-sequence driven. They seem to be able to use the presentation course and sequence to rapidly access the more likely diagnostic hypotheses that would fit such a pattern. It is of particular interest that when case data are randomised experts tend to lose much of their diagnostic advantage over non-experts.
5. Experts appear not to think in pathophysiological terms (unlike intermediates). They under-perform on post-case quizzes recalling or verbalising pathophysiology applying to the case. However, they do know the pathophysiology. They appear to integrate pathophysiology into the clinical signs and clusters of signs, subconsciously. They seem to carry pathophysiologic concepts "inside" these clinical data.
6. Experts appear to use "illness scripts" in solving clinical diagnostic problems. There is some controversy about this issue but, currently, the consensus is that there is such a thing as an illness script (more about that below). The difference of opinion seems to involve whether there are "experience-based" ready-made scripts in memory, from which the expert simply picks the appropriate one after a quick intra-cranial search, or whether there are generic script models in memory that the expert "fleshes out" with data from the case. These scripts allow the expert to reach a diagnosis (or select the most appropriate diagnostic hypothesis) extremely rapidly, without having to go through a large amount of mental gymnastics.
The illness script concept
History
While the forward and backward reasoning and hypothetico-deductive models were being used to create insight (and some confusion) into diagnostic thinking, and to guide developments in biomedical teaching as well as artificial intelligence, there were some "unreasonable men", to use Shaw's phrase, developing an alternate approach both to teaching (problem-based learning, PBL) and to understanding medical expertise (scripts). Howard Barrows, at the Southern Illinois University School of Medicine in Springfield, Illinois, led a group of "alternate thinkers" including Paul Feltovich (Urbana, Illinois), Henk Boshuizen at the University of Maastricht in the Netherlands, Geoff Norman at McMaster University, Ontario, Canada (later joined by Henk Schmidt, now at Maastricht) and the late David Maddison at the University of Newcastle, Australia.
In a series of studies and publications Barrows, Feltovich, Schmidt and Boshuizen, particularly, developed the concept of "illness scripts" and how these appear to facilitate medical problem solving by experts.
Schematic representation
The illness script can be represented as a short tabulation of typical/commonly encountered findings in a particular disease, arranged in a logical (possibly temporal) order with default values/scores for the most important criteria.
A possible script for Congenital Porto-Systemic Shunting
| Script Attribute (may be listed in temporal/chronological order) | Default Item Criteria (degree of abnormality or severity) | Relationships | Prevalence |
| Complaint "seizures" | Transient fits, head-pressing, stargazing | Time of day, meals | Often first sign |
| Complaint "runt/poor doer", stunted | Marked to severe | Despite normal food intake | Almost invariably |
| Course chronic | Moderate | No acute signs | Invariably |
| Breed | Yorkie, Min Schnauzer | | Common |
| Age | Less than 6 months | Breed, runt | Almost invariably |
| PU/PD | Moderate | Non-azotemic | Common |
| GIT signs | Moderate vomit, anorexia | | Intermittent |
| Microcytosis | Moderate | No hypochromia; acantho./targeting | Common but incidental finding |
| Urate uroliths/crystals | Mild | Non-Dalmatian | Common but incidental finding |
Script formation
The script appears to exist as a knowledge network in the expert's memory (long term). It is thought to be created when a clinician experiences a patient with a particular disease (possibly initially based on things that struck the clinician). It is possible that as more cases of this disease are seen, the script becomes more and more sophisticated (esp. in terms of the default values and the attribute chronology).
Script implementation
The clinician experiences the patient in terms of the presenting data. This usually includes the presenting complaint (which may even have been transmitted telephonically when the patient was scheduled) as one of the first issues as well as the history.
By the time the clinician has got half-way (if that far) through the examination, some of the script attributes will have "triggered" a sub-conscious script search and certain candidate scripts seem to be brought from deep memory into a sort of "Random Access" (to use a computer term) memory.
It is possible that several scripts with similar early script attributes (early from a case-evaluation point of view) are mobilised simultaneously, and those that fail to give further expected attributes (or that give attributes inconsistent with the script) and acceptable default values will be dropped.
The more "hits" (script attributes present), the more certain the diagnosis.
Script items that do not conform to default values alert the clinician to the possibility of the script's inapplicability and, if there are a number of non-default values, the script is rejected.
There appears to be a "when these are present, that is sufficient" logic that operates, and this may explain why scripts can allow very rapid diagnostic conclusions, with a possible control process that checks whether any of the defaults are badly violated.
How much of this is sub-conscious and how much conscious problem solving does not appear to have been clearly identified. It can be seen from the above that this is very similar to the hypothetico-deductive problem solving strategy that the script concept seeks to depose. The more sub-conscious, the less hypothetico-deductive.
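To make the mechanism concrete, here is a minimal sketch of script matching as just described: counting "hits" and dropping a script whose default values are badly violated. The data structure, attribute names and violation threshold are all assumptions of mine, not taken from the script literature:

```python
from dataclasses import dataclass, field

@dataclass
class IllnessScript:
    disease: str
    # attribute name -> expected (default) value, e.g. "age" -> "under_6_months"
    defaults: dict = field(default_factory=dict)

    def match(self, case):
        """Return (hits, violations) of this script against the case findings."""
        hits = sum(1 for attr, expected in self.defaults.items()
                   if case.get(attr) == expected)
        violations = sum(1 for attr, expected in self.defaults.items()
                         if attr in case and case[attr] != expected)
        return hits, violations

def select_script(scripts, case, max_violations=1):
    """Mobilise candidate scripts, drop badly violated ones, rank by hits."""
    candidates = [(s, *s.match(case)) for s in scripts]
    viable = [(s, hits) for s, hits, viol in candidates if viol <= max_violations]
    return max(viable, key=lambda sv: sv[1], default=None)

pss = IllnessScript("congenital porto-systemic shunt", {
    "age": "under_6_months",
    "growth": "stunted",
    "red_cells": "microcytic",
    "course": "chronic",
})
case = {"age": "under_6_months", "growth": "stunted", "red_cells": "microcytic"}
print(select_script([pss], case))  # pss with 3 hits; 'course' is merely missing
```

The "when these are present, that is sufficient" logic corresponds to ranking by hits; the control process corresponds to the violation cut-off.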
Acquiring expert clinical problem solving skills
The Problem Oriented Medical Record (POMR) system, that has been the hallmark of biomedical education for many decades now, was developed and introduced into medical teaching by Lawrence Weed, who simply extended a neat hospital record keeping system into a teaching tool. In veterinary education, Osborne was the advocate of its use in teaching.
It appears that the POMR system is not suited to the creation of expertise. Whether it is essential to the creation of a post-novice status is a point that will be hotly debated as most biomedical curricula are designed around the system and changing that would be very disruptive.
If the illness script is the tool that experts use, then it makes sense to find ways and means of acquiring good script networks.
Features of scripts that have implication for their acquisition
Currently, script development is something that "just happens" as clinicians are exposed to patients.
It is likely that scripts, as such, cannot be transmitted from one expert's mind into another's by some form of rote learning. They have to be constructed by the individual, and are possibly unique in their detail for each individual.
However, there is every reason to believe that the process of script building (and possibly script implementation) can be facilitated by creating a "script-positive" learning environment.
Scripts represent elaborated and organised knowledge.
It should be possible to construct efficient and well-structured knowledge bases.
Scripts appear to be constructed onto/from pre-existing knowledge networks in memory.
Scripts may develop through a process of becoming aware of explicit disease features when exposed to a case with such a disease (i.e., they need some explication of the case).
This explication, and thus the possibility of building good scripts, may be enhanced by discussing cases with one's peers or instructors.
Studies have shown that when experts discuss cases with their peers they very often render the case in the form of a script, but often do not mention all the script attributes, only the outstanding (interesting) features, apparently "knowing" that the listeners already have most of the necessary script information in their own heads. Consequently, this would have to be a conscious, deliberate process.
Proposal for improving, accelerating and/or acquiring good scripts
With each case that the clinician experiences, a script tabulation (similar to the one shown above) should be constructed and filed with the records.
From time to time (and preferably on a regular basis) clinicians should discuss their more interesting cases with peers or instructors with an emphasis on:
Rendering the case in script format
Explicating the reasons for the script attributes
Pontificating on the default values
Conclusion
It is hoped that, by highlighting the very fallible nature of clinical diagnoses and outlining some of the studies on expert clinical reasoning, readers will have been encouraged to seek ways and means of improving their own clinical problem solving skills.
The illness script concept appears to present an answer to those who seek to improve their skills.
Decision analysis
It is probably not obvious (but commonly experienced) that there are times when decisions (particularly therapeutic and prognostic) are made without the benefit of an accurate, verified final diagnosis. In trauma medicine, for instance, the process of triage is a very good example. True, a degree of diagnosis is made, but this may be limited to a pathophysiological concept rather than a true diagnosis. An example would be "Pneumothorax: do the following". The actual cause of the pneumothorax and/or the underlying disease process is, at that stage, not as important as applying life-saving measures.
One can extend the logic on which the pneumothorax diagnosis is based, to other cases where the immediate health implications may not be that serious. In veterinary medicine, in particular, financial considerations play a much more important role than they do in human medicine and it is quite rational to ask the question: "Do I need to make a diagnosis in order to take the right action?"
The field of decision analysis is involved in such questions.
Decisions about what to do with a patient are based on some or other probability/odds estimation of the likelihood that a certain process (or disease) is present. To a large extent this is dependent on a good grasp of situational analysis and epidemiological insights, and provides what could be called a "pre-test likelihood" (sometimes expressed as an odds ratio). A good illustration would be deciding what action to take when presented with a dog that has severe neurological signs reminiscent of rabies. The reaction when confronted by a mature dog with a known solid vaccination history (i.e., has had many shots) from an urban apartment environment would be quite different to the reaction to a young farm dog with unknown vaccination status from a rural farm in KwaZulu-Natal. In both cases there is a presumptive diagnosis (even if only transient) of rabies. There is not even a flicker of thought of confirming the diagnosis (brain smears and histopathology are not well tolerated). The pre-test probability in the first case is extremely low and in the second, quite high.
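For reference, the standard textbook conversion between the probability and odds forms mentioned above, together with the likelihood-ratio update that a diagnostic test would supply (general evidence-based-medicine formulas, not taken from this paper):

```latex
% Probability <-> odds conversion and likelihood-ratio (LR) update
\text{pre-test odds} = \frac{p}{1-p}, \qquad
\text{post-test odds} = \text{pre-test odds} \times LR, \qquad
p_{\text{post}} = \frac{\text{post-test odds}}{1 + \text{post-test odds}}
```

Here p is the pre-test probability and LR the likelihood ratio of a test or finding. For example, a pre-test probability of 5% gives odds of about 0.053; a test with LR = 10 raises this to post-test odds of 0.53, i.e., a post-test probability of roughly 35%.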
The rabies example also illustrates the next step/phase in medical decision analysis, namely, evaluating the risk of harm (to the patient and, in this instance, possibly also to third parties) and balancing that against the possibility of improvement or benefit (and those are not necessarily the same) if a certain procedure (such as pursuing a diagnosis) is carried out.
A management procedure or device has been proposed that allows a more rational approach to decision analysis and it is called the action threshold principle. Briefly, the decision whether to conduct further diagnostic tests or implement a therapeutic intervention is based on a balance between risk of harm and potential benefit and can be illustrated by the table below.
| Harm/Risk of treatment | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
| Very High | Dx | Dx | Dx | Dx | Dx | Dx | Dx/Rx | ?? | Rx/Dx | Rx |
| High | Dx | Dx | Dx | Dx | Dx | Dx/Rx | ?? | Rx/Dx | Rx | Rx |
| Medium | Dx | Dx | Dx | Dx/Rx | ?? | Rx/Dx | Rx | Rx | Rx | Rx |
| Low | Dx | Dx/Rx | ?? | Rx/Dx | Rx | Rx | Rx | Rx | Rx | Rx |
(Columns: % Certainty of Diagnosis)
Dx: Pursue/spend effort on the diagnosis. Do not apply any treatment yet.
Dx/Rx: Continue to pursue the diagnosis and judiciously apply some treatment.
??: Tough call.
Rx/Dx: Apply treatment while pursuing the diagnosis (but possibly less vigorously).
Rx: Apply treatment. Probably not worth spending more time on diagnosis.
The risk axis (vertical, left-hand side) ranges from low to very high and applies to the risk/harm of applying treatment. The diagnostic probability (certainty) axis (horizontal) ranges from 10% to 100%.
Depending on where the clinician finds him/herself in this grid, the action to be taken is determined by the interplay between the two issues of treatment risk and need for diagnostic certainty and is illustrated by the abbreviations within each box.
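As a minimal sketch of how this grid could be applied programmatically, the encoding below simply transcribes the table; the function and level names are my own:

```python
# Action-threshold grid: rows are treatment-risk levels, columns run from
# 10% to 100% diagnostic certainty in steps of 10, as in the table above.
GRID = {
    "very high": ["Dx", "Dx", "Dx", "Dx", "Dx", "Dx", "Dx/Rx", "??", "Rx/Dx", "Rx"],
    "high":      ["Dx", "Dx", "Dx", "Dx", "Dx", "Dx/Rx", "??", "Rx/Dx", "Rx", "Rx"],
    "medium":    ["Dx", "Dx", "Dx", "Dx/Rx", "??", "Rx/Dx", "Rx", "Rx", "Rx", "Rx"],
    "low":       ["Dx", "Dx/Rx", "??", "Rx/Dx", "Rx", "Rx", "Rx", "Rx", "Rx", "Rx"],
}

def recommended_action(risk, certainty_pct):
    """Look up the action for a treatment-risk level and % diagnostic certainty."""
    column = min(int(certainty_pct // 10), 10) - 1  # 10% -> col 0, 100% -> col 9
    return GRID[risk.lower()][max(column, 0)]

print(recommended_action("Medium", 50))     # '??'    (tough call)
print(recommended_action("Low", 80))        # 'Rx'
print(recommended_action("Very High", 90))  # 'Rx/Dx'
```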