Abstract 1978

In 2005, an NIH consensus conference was held to better define methods for research in chronic GVHD (cGVHD). Provisional definitions of response categories for individual organs and overall cGVHD disease activity were proposed: complete response (CR), partial response (PR), stable disease (SD) and progressive disease (PD). These response criteria were designed to improve consistency in documenting disease activity across centers and to allow less biased response assessments in clinical trials by comparing enrollment and follow-up measures rather than relying on clinician perceptions of change. In this study, we compared the proposed response criteria with clinician-reported changes in organ-specific and overall responses. Good agreement would suggest that the proposed response criteria mirror clinician judgments of whether patients are responding to treatment.

Methods: Patients ≥ 2 years of age diagnosed with cGVHD requiring systemic treatment ≤ 3 years after transplantation were eligible and were assessed every 3–6 months. At each visit, clinicians reported the following: organ-specific measures (used to calculate the NIH organ response for skin, mouth, eye and overall), perception of change in organ and overall involvement (completely gone = CR; very much or moderately improved = PR; a little better, stable, or a little worse = SD; moderately or very much worse = PD), and overall aggregate response (CR, PR, SD, PD). Kappa statistics were used to assess agreement between these measures, with 0.21–0.40 considered fair agreement.

Results: As of September 2010, 290 patients who had at least one follow-up visit 3 or 6 months beyond enrollment were included, with a median age of 51 years (range, 2–79). Based on NIH overall response criteria, 24 (8%) had CR, 83 (29%) had PR, 25 (9%) had SD, and 158 (54%) had PD, for an overall CR+PR rate of 37%. In contrast, clinicians reported that 31 (11%) had CR, 171 (59%) had PR, 30 (10%) had SD and 56 (19%) had PD, for an overall CR+PR rate of 70%. For organ-specific comparisons, agreement between the proposed NIH response measures and clinician-reported changes in skin, mouth and eye was fair. For overall response, agreement between the calculated NIH response and both clinician-reported overall change and clinician-reported response status was also fair (Table).

Conclusions: For both organ-specific and overall comparisons, the proposed NIH response criteria do not agree well with responses determined by clinicians. These data suggest that conclusions from prior literature reporting high overall CR+PR rates based on clinician judgment would not be supported if the current NIH response criteria had been used to measure response. Additional studies are needed to validate candidate response criteria through correlation with a robust, objective and informative gold standard.
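The mapping from clinician perception of change to response category, and the CR+PR arithmetic reported in the Results, can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code: the dictionary keys are paraphrases of the answer options named in the Methods, the function name `response_rate` is hypothetical, and the denominator is assumed to be the sum of the four category counts as reported.

```python
# Illustrative mapping of clinician perception-of-change answers to response
# categories, following the grouping described in the Methods. The exact
# answer wording used on the study forms is an assumption.
PERCEPTION_TO_RESPONSE = {
    "completely gone": "CR",
    "very much improved": "PR",
    "moderately improved": "PR",
    "a little better": "SD",
    "stable": "SD",
    "a little worse": "SD",
    "moderately worse": "PD",
    "very much worse": "PD",
}


def response_rate(counts):
    """Return the CR+PR percentage for a dict of category counts.

    Assumption: the denominator is the sum of the four reported counts.
    """
    total = sum(counts.values())
    return 100.0 * (counts["CR"] + counts["PR"]) / total


# Counts as reported in the Results.
nih_counts = {"CR": 24, "PR": 83, "SD": 25, "PD": 158}
clinician_counts = {"CR": 31, "PR": 171, "SD": 30, "PD": 56}

nih_rate = response_rate(nih_counts)
clinician_rate = response_rate(clinician_counts)
print(f"NIH CR+PR: {nih_rate:.0f}%  clinician CR+PR: {clinician_rate:.0f}%")
```

Rounded to whole percentages, the two rates match the 37% and 70% stated in the Results.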

Table. Calculated NIH and clinician-reported response rates in specific organs and overall

| Organ | Response measure | N | NI | CR | PR | SD | PD | Kappa with NIH response |
|---|---|---|---|---|---|---|---|---|
| Skin | Calculated NIH skin response | 286 | 35% | 22% | 7% | 15% | 21% | |
| | Clinician reported skin change | 286 | 29% | 17% | 17% | 32% | 5% | 0.39*/0.43** |
| Mouth | Calculated NIH mouth response | 287 | 20% | 15% | 7% | 45% | 13% | |
| | Clinician reported mouth change | 287 | 20% | 15% | 29% | 33% | 4% | 0.28*/0.35** |
| Eye | Calculated NIH eye response | 168 | 40% | 10% | 4% | 26% | 19% | |
| | Clinician reported eye change | 168 | 44% | 2% | 10% | 39% | 5% | 0.29*/0.26** |
| Overall | Calculated NIH overall response | 288 | — | 8% | 29% | 9% | 54% | |
| | Clinician reported overall change | 285 | — | 7% | 41% | 45% | 8% | 0.24** |
| | Clinician reported response status | 288 | — | 11% | 59% | 10% | 19% | 0.20** |

NI, not involved; CR, complete response/completely resolved; PR, partial response/moderately better, very much better; SD, stable disease/a little better/stable/a little worse; PD, progressive disease/moderately worse, very much worse

* Simple kappa, including all patients

** Weighted kappa, limited to patients with involvement by both measures at enrollment
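To make the two agreement statistics in the footnotes concrete, the following is a minimal Python sketch of Cohen's simple and weighted kappa for paired categorical ratings. It is an illustration under stated assumptions rather than the authors' method: the abstract does not say which weighting scheme was used, so linear disagreement weights are assumed here, and the function name `cohen_kappa` and the example ratings are hypothetical.

```python
from collections import Counter

# Response categories ordered from best to worst outcome.
CATEGORIES = ["CR", "PR", "SD", "PD"]


def cohen_kappa(rater_a, rater_b, categories=CATEGORIES, weighted=False):
    """Cohen's kappa for two paired lists of category labels.

    With weighted=False every disagreement counts equally (simple kappa).
    With weighted=True, disagreements are weighted linearly by how many
    ordered categories apart they are (an assumed weighting scheme).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")

    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)

    # Joint and marginal counts of the two sets of ratings.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    row = Counter(index[a] for a in rater_a)
    col = Counter(index[b] for b in rater_b)

    def disagreement(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * joint[(i, j)] / n for i, j in cells)
    expected = sum(
        disagreement(i, j) * row[i] * col[j] / (n * n) for i, j in cells
    )
    if expected == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1.0 - observed / expected


# Hypothetical ratings for four patients (not study data).
nih = ["CR", "PR", "SD", "PD"]
clinician = ["CR", "PR", "PD", "SD"]

simple = cohen_kappa(nih, clinician)
linear = cohen_kappa(nih, clinician, weighted=True)
print(f"simple kappa = {simple:.2f}, weighted kappa = {linear:.2f}")
```

In this toy example the two disagreements are between adjacent categories, so the weighted statistic penalizes them less than the simple one does. This is the usual reason for preferring a weighted kappa on an ordered scale such as CR, PR, SD, PD.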

Disclosures:

No relevant conflicts of interest to declare.
