TO THE EDITOR:

"No man can judge what is good evidence on any particular subject, unless he knows that subject well"George Eliot, "Middlemarch"

Clinical practice guidelines are an important tool for improving health care decision-making and outcomes.1 In an unpublished 2021 survey of 454 practicing hematologists and hematopathologists conducted for the American Society of Hematology (ASH), 91% of respondents indicated that guideline panels should comprise individuals reputed to be experts. Thus, the successful production of guidelines, and trust in the finished product, depend on recruiting panelists with a variety of expertise.2-7 The optimal method of identifying experts to serve on guideline panels, however, has not been established.

There is no agreed-upon method to identify individuals with expertise. Theories of medical expertise typically hold that expertise requires the successful integration of content-based basic scientific and clinical knowledge essential for clinical problem-solving.8-10 This integration is challenging to measure, and experts have instead been identified using criteria that judge activities and experiences that can lead to the development of expertise. Among these is the belief that someone who has spent many hours in deliberate practice, popularly referred to as the "10 000-hour rule," can attain the status of an expert.11,12 However, prior work has indicated that experience itself is a poor surrogate marker because the number of years in practice does not reliably correlate with favorable clinical outcomes.13 Other surrogates, such as reputation (a reflection of popularity), education (a minimally expected level of competence), or title (titles remain even if skills deteriorate), are unreliable markers of expertise.14,15

Theories of medical expertise indicate that both content and methodological expertise should be incorporated into guideline panels. Domain knowledge and experience are generally viewed as necessary components of expertise but are not deemed sufficient to attain the status of expert.7 In addition to content knowledge, the methodological skills required to generate trustworthy guidelines include competence in applying evidence-based medicine (EBM) principles,1 as advanced through systems for rating the certainty of evidence and the strength of recommendations.16

According to Weiss and Shanteau, the cornerstone of expertise is judgmental competence.14,17-20 On this view, the judgments of panelists with varying background expertise are expected to provide complementary perspectives. Paired with EBM principles, these perspectives should foster the development of robust guidelines. The judgments of content experts, however, have been challenged as an obstacle to developing trustworthy guidelines because conflicts of interest, as well as the vested interests of professional societies, can result in inherent bias among some panelists.1,21,22 As a result, some have suggested that guidelines should be developed only by panelists who are methodological experts, which would eliminate judgments on content entirely.1,21,22 However, this suggestion risks stripping panels of the critical content knowledge required to contextualize guideline recommendations and make them useful to practitioners and the public. It also raises the risk of inaccurate recommendations, as methodologists who lack clinical understanding are prone to making faulty judgments.16,21-28

Over several decades, objective methods to define expertise have been developed. Weiss and Shanteau have argued that, given the vagaries of traditional methods of identifying experts, expertise needs to be assessed empirically. This proposed empirical measurement, in the absence of a gold standard, has been dubbed the CWS (Cochran, Weiss, and Shanteau) index (Figure 1).14,17-20 The CWS index represents a continuum, and it can be assessed only when experts are compared with each other. This makes it infeasible to apply at the inception of guideline panel development and suggests that expertise cannot be identified a priori.
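To make the index concrete, the following is a minimal sketch, in Python, of how a CWS ratio could be computed for a single judge who rates the same set of cases more than once, following the discrimination-over-inconsistency definition summarized in the Figure 1 legend. The function name, the toy ratings, and the variance-based estimates are illustrative assumptions, not the ASH panel analysis.

```python
import numpy as np

def cws_index(ratings: np.ndarray) -> float:
    """Toy CWS (Cochran-Weiss-Shanteau) ratio for one judge.

    `ratings` is a cases x replications array: each row holds the judge's
    repeated ratings of the same case. Discrimination is estimated as the
    variance of the per-case mean ratings (how differently the cases are
    judged); inconsistency is the average within-case variance across
    replications (how much judgments of the same case drift).
    """
    case_means = ratings.mean(axis=1)
    discrimination = case_means.var(ddof=1)             # between-case variability
    inconsistency = ratings.var(axis=1, ddof=1).mean()  # within-case variability
    return discrimination / inconsistency

# Example: one judge rates 5 cases twice (eg, strength-of-recommendation scores).
judge = np.array([[7, 8], [2, 2], [5, 6], [9, 9], [3, 4]], dtype=float)
print(round(cws_index(judge), 2))  # higher value = more discriminating and consistent
```

Because the resulting number is meaningful only when compared with the ratios of other judges who rated the same cases, the sketch also illustrates why the index cannot be applied before a panel has been assembled.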

Figure 1.

Comparison of the CWS index across 4 different guideline panels. The CWS index was used to measure expertise in guideline panels. CWS is defined as a ratio between discrimination (the expert's differential evaluation of the various stimuli within a given set) and consistency (the reproducibility of the judge's evaluation of the same stimuli over time); in the formula, consistency is represented by its complement, inconsistency. The higher the CWS ratio, the greater the expert's performance.14,17-20 This means that expertise is relative to one's peers. It is, therefore, challenging, if not impossible, to identify individuals in advance of a guideline panel without a "ground truth" or gold standard against which to compare these potential experts' knowledge.14,17-20 The participants were asked to make their judgments about the strength of recommendations (SoR) before vs immediately after the meeting deliberation. Although the CWS index for panel 2 was statistically significantly higher than that of the other panels (P = .019), no statistically significant difference was detected in judgments among panelists according to their role on the panel, possibly because of the small number of panelists (n = 45). (The analysis was performed using a linear mixed-effects model to account for clustering of panelists within guideline panels; n = 45, based on unpublished data using ASH and other society guidelines.) The figure shows large variation in judgments among panel members, making an argument for using the CWS index to identify the best people to serve on a panel. The results also demonstrate that judgments differ significantly among panels, as though the panels relied on different types of knowledge.2,3 The limitations of this system include the absence of an absolute value or cutoff above which we label someone an expert. In addition, as noted in the main text, the system cannot be applied a priori, making its practical application difficult.


This brief review of theoretical and empirical knowledge on expertise has the following implications for selecting individuals to serve on guidelines panels:

Firstly, there is no universally accepted definition of expertise and no well-defined, validated approach for the selection of guideline panelists.7,14 As a result, it is usual practice to let those in a domain define that domain’s experts.7 Narrow measures of physician experience (ie, publications, tenure, and career stage) may be used to identify panelists; however, this should be done with the recognition that it is an imperfect practice. Reducing reliance on these proxy measures would allow for a broader array of individuals to participate in guideline development, enhancing panel diversity while adhering to methodological EBM approaches. As domain knowledge and relevant experience are required but not sufficient for expertise, it is usual practice that persons with adequate training and reputations are nominated as panelists.

Secondly, expertise is not universal: the same individuals may display expert competence in some settings and not in others. Expert competence depends on the task characteristics.14,17-20 For example, acute leukemia specialists with advanced knowledge in that discipline would not be considered to serve on nonleukemia panels. Having providers from different disciplines (eg, hematologists, hematopathologists, and palliative care providers serving on a leukemia panel) is expected to provide complementary perspectives and enhance the quality of the guidelines. Correctly matching an individual's task expertise with the scope of the guideline panel is a critical component of panel development.

Thirdly, experts display various psychological and cognitive strategies to support their decisions. These are difficult to measure but generally include the use of dynamic feedback; reliance on decision aids, such as summaries of evidence generated by systematic reviews; a tendency to decompose complex decision problems; and revisiting solutions to the problems at hand.7 Most importantly, when a task displays suitable characteristics, experts usually exhibit accurate and reliable judgments. Such tasks tend to be similar, stable, and predictable or repetitive over time; they often can be clearly articulated and decomposed, and they are suitable for objective analysis or can be solved using decision aids.7 An "unaided expert may be an oxymoron since competent experts will adopt whatever aids are needed to assist their decision making."7

The roles of the guideline chair and cochair are essential to the success of guideline development. Research on guideline panels' decision-making demonstrates that the chair and cochairs significantly guide the conversation during meetings. In a survey of voting members from guideline panels, the chairs and cochairs constituted <10% of the panel members yet accounted for >50% of the discussion.29 A separate analysis reported similar results, with the chair, cochair, and methodologist initiating and receiving >50% of all communication and 42% of communication occurring between these individuals.29

The use of systematic reviews, evidence tables, and/or decision aids is a prerequisite for developing evidence-based guidelines, thus creating an environment favorable for exercising accurate and reliable expertise at both the domain and methodological levels. Indeed, empirical data show that expert panelists exhibit important features of high-ability participants, namely, following instructions that require cognitive effort and suppressing the influence of other factors and prior beliefs, although nonreproducible and biased judgments can still occur.30 The latter may be because most panelists receive little training in EBM or Grading of Recommendations Assessment, Development, and Evaluation practices and instead rely on chairs and cochairs (who are typically methodologists) to guide the process. An unfortunate consequence, as outlined earlier, is that these individuals dominate the process.

We can attempt to identify individuals to serve on guideline panels by adhering to the theoretical considerations of expertise, particularly domain knowledge and relevant experience. For organizations such as ASH, we can expect that large numbers of people would meet these criteria. How, though, should panelists be selected from a large pool of qualified candidates? At present, most panelists are selected using a word-of-mouth approach. This approach is pragmatic and efficient, yet it risks excluding less prominent content experts and reduces diversity, equity, and inclusion; for these reasons, it is suboptimal. Solicitation of public nominations, by contrast, assures a diverse panel of individuals who meet conflict-of-interest criteria and wish to serve on the panels.

When an empirical or evidence-based approach is not possible, a transparent and explicit process may foster trust and increase confidence in the panel selection process. For example, recognizing the influential role of the chairs and cochairs in the development of guidelines, the UK National Institute for Health and Care Excellence publicly advertises the chair and other panel positions with a detailed job/person profile.4 It should be emphasized that the chair's responsibility is to serve as the chair of the committee rather than as a purveyor of topic or methodological knowledge. All professional organizations can adopt this approach and ask their members to nominate (or self-nominate) a chair and participants from an eligible pool of members.

We can also adopt good practices from survey research to help us select qualified individuals fairly and without bias. Many societies keep directories that include their members' areas of expertise. Using a lottery to choose guideline panelists from the eligible membership pool represents a transparent mechanism for ensuring fairness in the process.31 For example, we can use a directory to identify the target population from which panelists will be drawn (eg, all acute myeloid leukemia experts who are ASH members). Next, we identify the sampling frame (ie, adult acute myeloid leukemia experts who are ASH members, potentially enriched with additional experts who may not be ASH members). Finally, we use randomization techniques to generate an unbiased selection of a final sample of 10 or 20 individuals to be invited to serve on the panel. Randomization can be stratified to ensure that panels are diverse and representative. The assumption here is that, by focusing on a fair process, we will also be able to identify competent individuals, who are hypothesized to exist among the members of a professional society.
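As an illustration of the lottery described above, the following is a minimal sketch, in Python, of stratified random selection from a hypothetical member directory. The directory fields, the stratification variable, and the per-stratum quota are assumptions made for the example and do not represent ASH's actual directory or procedures.

```python
import random

# Hypothetical sampling frame drawn from a society directory: each record
# lists a member and an attribute used for stratification.
sampling_frame = [
    {"name": "Member A", "career_stage": "early"},
    {"name": "Member B", "career_stage": "early"},
    {"name": "Member C", "career_stage": "mid"},
    {"name": "Member D", "career_stage": "mid"},
    {"name": "Member E", "career_stage": "senior"},
    {"name": "Member F", "career_stage": "senior"},
]

def stratified_lottery(frame, stratum_key, per_stratum, seed=2023):
    """Randomly draw `per_stratum` members from each stratum.

    Stratification keeps the invited panel diverse and representative;
    the random draw keeps the selection transparent and unbiased.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be audited and reproduced
    strata = {}
    for member in frame:
        strata.setdefault(member[stratum_key], []).append(member)
    invited = []
    for group in strata.values():
        invited.extend(rng.sample(group, min(per_stratum, len(group))))
    return invited

for member in stratified_lottery(sampling_frame, "career_stage", per_stratum=1):
    print(member["name"], "-", member["career_stage"])
```

Publishing the sampling frame, the strata, and the random seed (or drawing lots in a publicly witnessed way) is what makes such a selection auditable and, therefore, fair.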

Recognizing that trusted experts are required and that a well-established method for identifying these individuals does not exist, the ASH Committee on Quality forms an ad hoc group to oversee the development of new guideline panels. A public recruitment process can be used to assure the generation of a large, diverse pool of candidates, who are then vetted to assess their interest, availability, and eligibility to serve on the panel. Through an iterative process, a final panel of ∼25 individuals can be developed and sent to the ASH Committee on Quality for approval.

Guideline developers and end users place a high value on panelists' expertise; however, it can be challenging to determine whether actual panelists are genuine experts. Expertise cannot be identified a priori. Nevertheless, proxy markers of productivity, reputation, and experience, in combination with a transparent and explicit guideline development process, can make it more likely that the individuals on guideline panels are truly experts, thus increasing the likelihood that their recommendations are robust and trustworthy.

Acknowledgments: This work was supported in part by grant R01HS024917 from the Agency for Healthcare Research and Quality (B.D.).

The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.

Contribution: M.B., R.M., R.K., D.S., and B.D. developed the initial draft; M.B. and B.D. wrote the manuscript; and all authors reviewed, edited, and approved the final version.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Benjamin Djulbegovic, Division of Hematology/Oncology, Department of Medicine, Medical University of South Carolina, 39 Sabin St MSC 635, Charleston, SC 29425; e-mail: djulbegov@musc.edu.

1. Graham R, Mancher M, Wolman DM, Greenfield S, Steinberg E, eds. Clinical Practice Guidelines We Can Trust. National Academies Press; 2011.

2. Wieringa S, Dreesens D, Forland F, et al. Different knowledge, different styles of reasoning: a challenge for guideline development. BMJ Evid Based Med. 2018;23(3):87-91.

3. Zuiderent-Jerak T, Forland F, Macbeth F. Guidelines should reflect all knowledge, not just clinical trials. BMJ. 2012;345:e6702.

4. Kunz R, Fretheim A, Cluzeau F, et al. Guideline group composition and group processes. Proc Am Thorac Soc. 2012;9(5):229-233.

5. Schünemann HJ, Wiercioch W, Etxeandia I, et al. Guidelines 2.0: systematic development of a comprehensive checklist for a successful guideline enterprise. Can Med Assoc J. 2014;186(3):E123-E142.

6. Qaseem A, Forland F, Macbeth F, et al. Guidelines International Network: toward international standards for clinical practice guidelines. Ann Intern Med. 2012;156(7):525-531.

7. Shanteau J. Competence in experts: the role of task characteristics. Organ Behav Hum Decis Process. 1992;53(2):252-266.

8. Violato C. Assessing Competence in Medicine & Other Health Professions. CRC Press; 2019.

9. Polanyi M. Personal Knowledge: Towards a Post-Critical Philosophy. University of Chicago Press; 1962.

10. Mercier H, Sperber D. The Enigma of Reason. Harvard University Press; 2017.

11. Frederickson DS. Sorting out the doctor's bag. Control Clin Trials. 1980;1(3):263-267.

12. Abdulnour R-EE, Parsons AS, Muller D, Drazen J, Rubin EJ, Rencic J. Deliberate practice at the virtual bedside to improve clinical reasoning. N Engl J Med. 2022;386(20):1946-1947.

13. Tsugawa Y, Newhouse JP, Zaslavsky AM, Blumenthal DM, Jena AB. Physician age and outcomes in elderly patients in hospital in the US: observational study. BMJ. 2017;357:j1797.

14. Weiss DJ, Shanteau J. Who's the best? A relativistic view of expertise. Appl Cognit Psychol. 2014;28(4):447-457.

15. Weiss DJ, Shanteau J, Harries P. People who judge people. J Behav Decis Making. 2006;19(5):441-454.

16. Djulbegovic B, Guyatt GH. Progress in evidence-based medicine: a quarter century on. Lancet. 2017;390(10092):415-423.

17. Weiss DJ, Shanteau J. Empirical assessment of expertise. Hum Factors. 2003;45(1):104-116.

18. Weiss D, Shanteau J. Do judgments alone provide sufficient information to determine the expertise of the judge who made them? Paper presented at: 11th International Symposium on Aviation Psychology; 2000; Columbus, OH. Accessed 6 May 2023. https://www.academia.edu/2734422/Do_judgments_alone_provide_sufficient_information_to_determine_the_expertise_of_the_judge_who_made_them.

19. Shanteau J, Weiss DJ, Thomas RP, Pounds JC. Performance-based assessment of expertise: how to decide if someone is an expert or not. Eur J Oper Res. 2002;136(2):253-263.

20. Weiss DJ, Shanteau J. Decloaking the privileged expert. J Manag Organ. 2012;18(3):1276-1304.

21. Ioannidis JPA. Professional societies should abstain from authorship of guidelines and disease definition statements. Circ Cardiovasc Qual Outcomes. 2018;11(10):e004889.

22. Gøtzsche PC, Ioannidis JPA. Content area experts as authors: helpful or harmful for systematic reviews and meta-analyses? BMJ. 2012;345:e7031.

23. Kwo PY, Shiffman ML, Bernstein DE. The Cochrane review conclusion for hepatitis C DAA therapies is wrong. Am J Gastroenterol. 2018;113(1):2-4.

24. Djulbegovic B, Guyatt GH, Ashcroft RE. Epistemologic inquiries in evidence-based medicine. Cancer Control. 2009;16(2):158-168.

25. Ericsson KA, Charness N. Expert performance: its structure and acquisition. Am Psychol. 1994;49(8):725-747.

26. Ericsson KA, Pool R. Peak: Secrets From the New Science of Expertise. Mariner Books; 2017.

27. Chalmers J. Guidelines under fire again! Hypertension. 2017;70(2):238-239.

28. Messerli FH, Hofstetter L, Agabiti-Rosei E, et al. Expertise: no longer a sine qua non for guideline authors? J Hypertens. 2017;35(8):1564-1566.

29. Djulbegovic B, Hozo I, Li SA, Razavi M, Cuker A, Guyatt G. Certainty of evidence and intervention's benefits and harms are key determinants of guidelines' recommendations. J Clin Epidemiol. 2021;136:1-9.

30. Djulbegovic B, Reljic T, Elqayam S, et al. Structured decision-making drives guidelines panels' recommendations "for" but not "against" health interventions. J Clin Epidemiol. 2019;110:23-33.

31. Silverman WA, Chalmers I. Casting and drawing lots: a time-honoured way of dealing with uncertainty and for ensuring fairness. BMJ. 2001;323(7327):1467-1468.

Author notes

Data are available on request from the corresponding author, Benjamin Djulbegovic (djulbegov@musc.edu).