Abstract 5097

Method

The goal of this research is to assess intra-reader variability in identifying centroblast (CB) cells in digitized H&E-stained Follicular Lymphoma (FL) cases. We enrolled three board-certified hematopathologists experienced in FL grading to complete two reading sessions on 51 high-power field (HPF, 40× magnification) images. These images were selected randomly from a database of 500 HPF images collected from 17 different patients. The dataset comprises FL cases of different grades, with all three grades (1, 2, and 3) represented. Each pathologist was asked to grade the same set of images (51 images, 3 per patient) on different days, with a minimum interval of two months between readings. In the second reading, the order of the images was also randomized. In each reading session, the pathologists examined the digital images and recorded the spatial coordinates of CBs using in-house marking software that allowed them to mark CB cells using only a computer mouse.
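The in-house marking software is not described in detail here; the following is a minimal sketch, assuming a generic matplotlib-based image viewer, of how per-image CB coordinates could be captured from mouse clicks and saved for later analysis. All function and file names are illustrative and are not those of the actual tool.

```python
# Minimal sketch of a CB-marking workflow (hypothetical; not the in-house tool).
import csv
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

def mark_centroblasts(image_path, output_csv):
    """Display one HPF image and record the (x, y) coordinates of mouse clicks."""
    coords = []
    img = mpimg.imread(image_path)
    fig, ax = plt.subplots()
    ax.imshow(img)
    ax.set_title("Click on each centroblast; close the window when done")

    def on_click(event):
        # Ignore clicks that land outside the image axes (e.g., on the toolbar).
        if event.inaxes is ax and event.xdata is not None:
            coords.append((event.xdata, event.ydata))
            ax.plot(event.xdata, event.ydata, "r+")  # mark the clicked location
            fig.canvas.draw_idle()

    fig.canvas.mpl_connect("button_press_event", on_click)
    plt.show()  # blocks until the reader closes the window

    # Save the recorded coordinates for the agreement analysis.
    with open(output_csv, "w", newline="") as f:
        csv.writer(f).writerows(coords)
    return coords
```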

Experimental Results

The results from each reading session were analyzed in terms of FL grade, which was determined by averaging the CB counts across a patient's three images and assigning a grade using the standard WHO guidelines: Grade I = 0–5, Grade II = 6–15, Grade III = >15 centroblasts/image. Two different kappa statistics were used to measure agreement. First, we used a weighted kappa to measure intra-reader agreement on the three-level grade; we then computed a simple kappa measuring agreement on a two-level diagnosis: Grade I or II (no chemoprevention assigned) versus Grade III (chemoprevention assigned).
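As an illustration of this analysis, the sketch below assigns a WHO grade from a patient's mean CB count and computes both kappa statistics with scikit-learn. The per-patient counts, the linear weighting scheme for the weighted kappa, and the handling of non-integer mean counts at the grade boundaries are assumptions made for the example, since they are not specified above.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def who_grade(cb_counts_per_image):
    """Assign an FL grade from the mean CB count over one patient's HPF images."""
    mean_count = np.mean(cb_counts_per_image)
    if mean_count <= 5:     # boundary handling for non-integer means is assumed
        return 1            # Grade I: 0-5 CB/image
    if mean_count <= 15:
        return 2            # Grade II: 6-15 CB/image
    return 3                # Grade III: >15 CB/image

# Hypothetical per-patient CB counts (3 HPF images each) from two reading sessions.
session1_counts = [[3, 4, 5], [12, 10, 14], [20, 18, 22]]
session2_counts = [[2, 5, 6], [8, 9, 11], [19, 21, 25]]

grades_r1 = [who_grade(c) for c in session1_counts]
grades_r2 = [who_grade(c) for c in session2_counts]

# Three-level grade: weighted kappa (linear weights assumed here).
kappa_grade = cohen_kappa_score(grades_r1, grades_r2, weights="linear")

# Two-level diagnosis: Grade I or II versus Grade III.
dx_r1 = [int(g == 3) for g in grades_r1]
dx_r2 = [int(g == 3) for g in grades_r2]
kappa_dx = cohen_kappa_score(dx_r1, dx_r2)

print(kappa_grade, kappa_dx)
```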

Table 1. Weighted Kappa for Three-Level Grade

Pathologist    Kappa     p-value
1              0.4295    0.0824
2              0.4688    0.0515
3                        0.0015
Mean           0.6328

Table 2. Kappa for Two-Level Diagnosis (Grade I or II vs. III)

Pathologist    Kappa     p-value
1              0.4295    0.0824
2              0.4688    0.0515
3                        0.0015
Mean           0.6328

Landis and Koch [1] guidelines for degree of kappa agreement: < 0 poor, 0–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, 0.81–1 almost perfect.
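A small hypothetical helper, assuming the Landis and Koch cut-offs quoted above, shows how a kappa value maps to its descriptive category; applied to the mean weighted kappa of 0.6328 it returns "substantial".

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch [1] descriptive category."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

print(landis_koch_label(0.6328))  # -> "substantial" (the mean weighted kappa)
```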

Discussion

Table 1 provides the weighted kappa statistics based on the three-level grading system. There was statistically significant agreement for each pathologist, with pathologist 3 exhibiting nearly perfect agreement in grade while pathologists 1 and 2 exhibited moderate agreement. However, when we examined agreement on the clinically significant diagnosis (Grade I or II versus III; see Table 2), the kappa statistics for pathologists 1 and 2 were not significantly different from zero, suggesting that there is no significant agreement in their diagnoses. Pathologist 3, on the other hand, exhibited perfect agreement in the two-level diagnosis across readings.

Conclusion

In this study, we have examined intra-reader variability in grading follicular lymphoma in digital images. Although similar studies have measured variability at the slide level [2], to our knowledge this is the first study to examine variability at the level of CB detection. A larger dataset will be considered in the near future to generalize these results.

References

1. J. R. Landis and G. G. Koch, "The measurement of observer agreement for categorical data," Biometrics, vol. 33, pp. 159–174, 1977.

2. G. E. Metter, B. N. Nathwani, J. S. Burke, et al., "Morphological subclassification of follicular lymphoma: variability of diagnoses among hematopathologists, a collaborative study between the Repository Center and Pathology Panel for Lymphoma Clinical Studies," J. Clin. Oncol., vol. 3, no. 1, pp. 25–38, 1985.

Disclosures:

No relevant conflicts of interest to declare.

