Thrombotic Microangiopathy in the Renal Allograft: Results of the TMA Banff Working Group Consensus on Pathologic Diagnostic Criteria

The Banff community summoned the TMA Banff Working Group to develop minimum diagnostic criteria (MDC) and recommendations for the diagnosis of renal transplant TMA (Tx-TMA), which currently lacks standardized criteria. Using the Delphi method for consensus generation, 23 nephropathologists (panelists) with >3 years of diagnostic experience with Tx-TMA were asked to list light, immunofluorescence, and electron microscopic, clinical, and laboratory criteria and differential diagnoses for Tx-TMA. Delphi was modified to include 2 validation rounds with histological evaluation of whole slide images of 37 transplant biopsies (28 TMA and 9 non-TMA). Starting with 338 criteria in R1, MDC were narrowed down to 24 in R8, generating 18 pathological, 2 clinical, and 4 laboratory criteria, and 8 differential diagnoses. The panelists reached a good level of agreement (70%) on 76% of the validated cases. For the first time in Banff classification, Delphi was used to reach consensus on MDC for Tx-TMA. Phase I of the study (pathology phase) will be used as a model for Phase II (nephrology phase) for consensus regarding clinical and laboratory criteria. Eventually, in Phase III (consensus of the consensus groups), the final MDC for Tx-TMA will be reported to the transplantation community.


INTRODUCTION
Transplant thrombotic microangiopathy (Tx-TMA) is caused by endothelial injury and is hallmarked by thrombotic occlusion of small vessels, often resulting in clinically unexpected allograft failure [1,2]. Immunologic, genetic, and hematologic disorders and drugs may trigger the disorder [1,3]. A transplant kidney biopsy is performed for definitive diagnosis [4].
The histopathologic diagnosis of Tx-TMA relies on the subjective interpretation of a multitude of histopathologic findings, of which thrombi are the major one; these findings vary in extent and frequency, and depend on the acute or chronic character of the disease and, finally, on the pathologist. The long list of morphologies makes the diagnosis challenging and often delays initiation of targeted therapy. The Banff TMA working group (WG) was formed in 2016 under the auspices of the Banff Foundation for Allograft Pathology, with the aim of standardizing TMA diagnostic criteria and formulating recommendations [5]. A survey circulated in January 2016 among the WG participants showed considerable heterogeneity among nephropathologists, who used a multitude of known TMA features (as mentioned above) with vague or subjective definitions. Therefore, the first aim of the WG was to provide the Banff community with a standardized set of minimum diagnostic criteria for Tx-TMA. A secondary aim, identified during the study, was to investigate specific lesions that could potentially point to specific etiologies of Tx-TMA. Diagnosis of TMA in the renal allograft is not merely a morphologic exercise; clinical and laboratory information is crucial for diagnosis. The co-chairs considered the Delphi approach a suitable method to generate consensus among an expert panel [6][7][8][9].

MATERIALS AND METHODS
A detailed description of the materials and methods used in this project, including establishing a steering committee, literature review, definition of a panelist, the role of the facilitator, and the process and sequence of events during Delphi rounds, is presented in paper 1 [10]. Herewith in paper 2, the authors describe those specific aspects of the materials and methods that are related to pathology.
In the preliminary round, R0, the facilitator asked several questions related to the diagnosis of Tx-TMA and requested the panelists to send their areas of difficulty with Tx-TMA diagnosis in free text. The questions are listed in Supplementary Table S1.

Cut-Offs
At the end of each R, after receipt of the panelists' responses and data analysis, the cut-off for that R was chosen by the facilitator. It is important to emphasize that the Delphi methodology allows the facilitator to set cut-offs for Rs arbitrarily. This allows the facilitator to set the cut-off at a level where redundancies can be eliminated while the most important information is retained for the next R. In our study, a cut-off of 80% was set for all Rs, except for R4 and R5. To make sure that no important criterion was dropped for the next R, the cut-off for these two Rs was set at 60%, as a cut-off of 80% would have eliminated well-known TMA lesions, such as the presence of double contours.
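The effect of the cut-off choice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vote counts are hypothetical, the function name `apply_cutoff` is invented here, and the rule that a criterion is kept when its share of panelist votes is at or above the cut-off is an assumption, not the procedure reported in paper 1.

```python
# Minimal sketch of per-round cut-off filtering (illustrative assumptions only).
N_PANELISTS = 23  # number of panelists reported in the study


def apply_cutoff(votes, n_panelists, cutoff_pct):
    """Return the criteria whose share of votes is >= cutoff_pct (assumed rule)."""
    kept = {}
    for criterion, n_votes in votes.items():
        pct = 100.0 * n_votes / n_panelists
        if pct >= cutoff_pct:
            kept[criterion] = pct
    return kept


# Hypothetical vote counts, NOT data from the study.
votes = {
    "intracapillary thrombi": 22,
    "double contours": 15,
    "hypothetical redundant item": 5,
}

kept_80 = apply_cutoff(votes, N_PANELISTS, 80)  # stricter cut-off
kept_60 = apply_cutoff(votes, N_PANELISTS, 60)  # relaxed cut-off, as in R4 and R5

print(sorted(kept_80))
print(sorted(kept_60))
```

With these made-up counts, a well-known lesion that attracts about 65% of votes is dropped at 80% but retained at 60%, which is the kind of loss the relaxed cut-off was meant to prevent.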

Pathological Validation of the Criteria
The original Delphi method used in other disciplines or in earlier pathology manuscripts did not contain a histology-based validation round. In this study, we designed a modified version of Delphi to adapt the methodology to the needs of our study, which was a pathology project in which the results of the rounds needed to be validated using real-life cases. Therefore, at the beginning of the study, the facilitator asked the panelists to submit transplant kidney biopsy (TxBx) cases from their institutional collection. A total of 37 TxBx cases were collected and shipped to the facilitator (MA) at the Department of Pathology. For each case, 2-3 micron-thick paraffin-embedded sections stained with hematoxylin & eosin (H&E), periodic-acid-Schiff (PAS), Masson's trichrome (TCR), and Jones silver or periodic-acid-methenamine-silver (PAMS) stains were submitted. IF and immunohistochemistry (IHC) findings, including C4d staining, as well as EM findings were provided in free text. Only some cases were supplemented by EM images. Slides were de-identified and scanned at ×400 using an Aperio scanner at the University of Toronto. Central review of the cases was performed by the steering committee before circulating the cases among the panelists.

The Cases
Histological evaluation was included in the Delphi process during rounds R6 and R7, where 66 criteria (56 pathological, clinical, and laboratory criteria and 10 differential diagnoses) were validated against 37 real-life cases. The panelists were asked to list the criteria they used to make their diagnosis on each case. The cases validated in this study were composed of TMA cases (n = 28) and non-TMA cases or look-alikes (n = 9), displayed in Supplementary Table S2. The original diagnoses of the 37 validated anonymized cases, along with the patients' demographics, reflected a random selection of real-life situations encountered by our panelists in their practice. Each case was accompanied by a short clinical history and the relevant laboratory information available at the time of biopsy. The co-chairs also received the original pathology report and diagnosis, and information regarding treatment and outcome, which were not shared with the panelists.

Percentage Agreement (%A) and Percentage Agreement Levels (%AL)
%A shows agreement amongst the panelists concerning a diagnosis or criterion. Moreover, we computed the level of agreement as the number of cases falling into a %AL. For example, a 97-100%AL was the level at which 97%-100% of the panelists agreed on the same diagnosis on X number of cases. Further, a %AL was considered: 0-40 = poor; 41-60 = fair; 61-80 = good; 81-96 = excellent; and 97-100 = total.
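As a rough illustration of these definitions, the sketch below computes %A for one case from a list of panelist diagnoses and maps it to the agreement bands given above. The function names and the example votes are hypothetical, and the handling of non-integer percentages that fall between two bands (for example, 60.9%) is an assumption made here for the sake of a runnable example; the paper does not specify it.

```python
from collections import Counter


def percent_agreement(diagnoses):
    """Return the most frequent diagnosis and the % of panelists who chose it."""
    counts = Counter(diagnoses)
    top_label, top_n = counts.most_common(1)[0]
    return top_label, 100.0 * top_n / len(diagnoses)


def agreement_level(pct):
    """Map a %A value to the bands 0-40, 41-60, 61-80, 81-96, 97-100.

    Assumption: values between two integer bands are assigned to the higher band.
    """
    if pct >= 97:
        return "total"
    if pct > 80:
        return "excellent"
    if pct > 60:
        return "good"
    if pct > 40:
        return "fair"
    return "poor"


# Hypothetical case: 18 of 23 panelists diagnose TMA, 5 do not.
example_votes = ["TMA"] * 18 + ["no TMA"] * 5
label, pct = percent_agreement(example_votes)
level = agreement_level(pct)
print(label, round(pct, 2), level)
```

In this made-up case, 18 of 23 panelists (about 78%) agree on a diagnosis of TMA, which falls in the "good" band.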

Statistics
All statistical modeling was performed using SAS, version 9.4 (SAS, Inc., Cary, NC). Details on the statistics are published in paper 1 [10]. Some figures were drawn using the open source data visualization tool RAWGraphs [11].
Of note, this study used a retrospective collection of cases to validate criteria resulting from the consensus and was not designed to measure outcome, therapy, or intervention.

Pathological Criteria
Table 1 lists the six pathological categories and their related criteria. A total of 18 pathological criteria (16 positive and 2 negative) were obtained at the end of R7.
During this process, the panelists put an emphasis on the temporal character of the lesions: for instance, intracapillary thrombi reflect acute and/or sub-acute Tx-TMA, while double contours represent chronic Tx-TMA. Of note, acute, sub-acute, and chronic TMA were considered as phenomena that can be present simultaneously.

Clinical Criteria
The 2 Clin+ criteria shown in Table 2 were pregnancy/postpartum/history of pre-eclampsia/eclampsia/HELLP syndrome and past history of TMA/HUS/aHUS/TTP.

Laboratory Criteria
Table 2 also shows the results on the laboratory criteria.
The 4 Lab+ criteria included elevated LDH, low haptoglobin levels (in the absence of history of recent transfusion), dropping hematocrit/anemia/hemolytic anemia, and thrombocytopenia. Two Lab- criteria were dropped because of insufficient votes (<20%): absence of donor ABO-incompatibility and absence of proteinuria.

Differential Diagnoses
Table 3 presents the eight differential diagnoses most used during the validation of the 37 cases. They were entertained during the two validation Rs and included thrombotic thrombocytopenic purpura (TTP)/acquired HUS/atypical HUS (aHUS); donor-related TMA, observed in the donor in the first week/first month post Tx; chronic Tx glomerulopathy; disseminated intravascular coagulation (DIC); acute or chronic non-TMA-related ABMR (NT-ABMR); anti-phospholipid syndrome; immune complex-mediated glomerulonephritis (GN), including de novo or recurrent membranoproliferative GN, IgA nephropathy (IgAN), lupus nephritis (LN), and post-infectious GN; and accelerated hypertension.

Definitions
At the end of R8, the need to generate consensus regarding the morphological definition of key lesions was recognized. In R9, eight criteria were defined. Table 4 lists the definitions of the 4 LM and 4 EM criteria on which consensus was obtained among the panelists.

Criteria Evolution During Nine Rounds
Figure 1 shows criteria evolution from R1 to R9. A detailed explanation of the evolution of the criteria is reported in the results and discussion sections of paper 1 [10].
Basically, starting with 338 criteria obtained at the end of R1, the facilitator was able to narrow them down to a final number of 24 criteria and 8 differential diagnoses at the end of the study.

Quality of the Panelists' Agreement
The panelists' diagnostic performance on the 37 cases, computed at 61-80%AL, 81-96%AL, and 97-100%AL, is shown in Table 5. The 61-80%AL column shows that up to 80% of the panelists agreed on 83.78% of cases (31/37), which represents a "good" level of agreement. The 81-96%AL column shows that up to 96% of panelists agreed on 54.05% of the cases (20/37), which is considered an "excellent" level of agreement on more than half of the cases. Total agreement, or 97-100%AL, between the panelists was obtained on 10.81% of cases (4/37). In each column, those cases marked with (-) did not reach the %AL indicated for that column. It is worth noting that, in choosing between a diagnosis of Tx-TMA vs. no TMA, the panelists' opinions were split (12 vs. 11) on six cases (16.22%). Agreement on these six cases was therefore judged as "equivocal". More detailed information about the cases and their respective %AL is provided in Table 5.
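The case percentages quoted above follow directly from the case counts. The short check below, whose variable names are illustrative only, recomputes them from the reported counts out of 37 cases.

```python
# Recompute the share of cases reaching each agreement level from the reported counts.
TOTAL_CASES = 37

# Number of cases reaching each %AL, as reported in the text.
cases_reaching = {"61-80%AL": 31, "81-96%AL": 20, "97-100%AL": 4}

shares = {
    band: round(100 * n / TOTAL_CASES, 2)  # percentage of the 37 cases, 2 decimals
    for band, n in cases_reaching.items()
}

for band, share in shares.items():
    print(f"{band}: {cases_reaching[band]}/{TOTAL_CASES} = {share}%")
```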
R8 was originally planned to produce major and minor criteria according to the panelists' ranking; however, after examination of the results, the facilitator decided that future validation studies are needed to develop the concept of major/minor criteria.

DISCUSSION
TMA in the Native and the Transplanted Kidney: Similarities and Differences
TMA in the native kidney shares many morphological features with TMA in the transplanted kidney. Both are caused by endothelial cell injury, and the presence of intravascular thrombi, especially when the lesions are diffuse, is a strong diagnostic tool for the pathologist. However, similarities between the two conditions stop at the morphological level, as a transplanted organ is involved with and targeted by many factors that a native organ is not. TMA in the native kidney: 1. is typically part of a larger picture and one of the manifestations of a systemic disease such as Hemolytic Uremic Syndrome (HUS); 2. is associated with laboratory indicators of microvascular thrombosis, such as thrombocytopenia, elevated LDH, and decreased haptoglobin; 3. is usually the only main finding in the biopsy; 4. is often the manifestation of a single disease, for example, systemic sclerosis or systemic lupus erythematosus. On the other hand, Tx-TMA often: 1. presents as localized TMA (L-TMA or renal TMA), and not as part of a systemic disease (while recurrent disease is the cause of a small proportion of Tx-TMAs, most transplant L-TMAs are de novo [27]); 2. lacks the laboratory indicators of microvascular thrombosis, such as thrombocytopenia, presence of schistocytes, and elevated LDH; 3. is difficult to diagnose, as there are many confounding factors, such as antibody-mediated rejection (C4d-positive or C4d-negative), T cell-mediated rejection, drug toxicity, and recurrence of the pre-existing disease, that blur the picture for both clinical and pathological diagnosis. Therefore, while endothelial injury is central to the pathogenesis of both native and allograft renal TMA, leading to similar lesions in the glomerulus and renal vasculature, diagnosis of Tx-TMA involves a different mindset, algorithm, and differential diagnosis, and sometimes different criteria.

Literature Review
To date and to the authors' knowledge, there is no study dealing with the standardization of diagnostic criteria for Tx-TMA (Supplementary Table S3). The paper published by Haas et al [28] addresses the diagnostic criteria for TMA; however, it covers only TMA in the native kidney and does not address TMA in the renal allograft. Most scientific literature does not provide a detailed description of Tx-TMA-associated lesions, including the pathological criteria for which our study reached a consensus. Thus, our study fills this gap and provides, for the first time, diagnostic criteria as a prerequisite for further comparative studies.

The TMA BWG Mandates: The Why and the What
As the results of the 2016 Banff TMA WG survey clearly showed, nephropathologists use many different criteria/lesions to diagnose Tx-TMA. The TMA BWG was formed with specific objectives and goals to standardize the existing biopsy lesions, retrospectively [29]. The goals of the TMA BWG, according to the Banff 2017 meeting report, were to: "1-establish uniform diagnostic criteria for Tx-TMA; 2-determine the frequency with which TMA occurs in renal allograft biopsy; and 3-determine if there are specific features of TMA in renal allografts that help resolve the differential diagnosis of Tx-TMA when the cause is not readily apparent from clinical history, DSA/C4d, etc. ..." The authors achieved the first goal in 5 years and generated consensus among Banff participants regarding establishing a list of diagnostic criteria. The second goal was accomplished by reviewing the current literature: the authors unveiled the lack of data on the incidence of the Tx-TMA lesions identified through this Delphi study. The third goal could not be achieved entirely, as further input from nephrologists will be needed to finalize the clinical and laboratory criteria. Phase II of the study, with nephrologists, is currently in progress and will address the third goal.

Novelty of the Study: Introducing Delphi to the Banff Classification
Since 1991 and for the past 30 years, the Banff Classification on Allograft Pathology group has used the NIH model of consensus generation as a tool to define transplant-related pathological lesions. This required resources for travelling and live meetings amongst expert pathologists, nephrologists, and transplant surgeons. The debates resulted in recommendations known as Banff criteria, which were proposed to the transplantation community and applied for patient management, following rigorous validation studies. Although Delphi by itself is not a new methodology, it solves many of the inconveniences of the NIH consensus format within the Banff community: an anonymous yet democratic approach to consensus generation; first-time introduction of digital pathology to Delphi for case validation; and dramatic reduction of the costs of a Banff-related process. The total cost of the study was below US$20,000.00. As no travelling was required, in the era of global warming and the COVID pandemic, this methodology suggests a new approach for consensus generation to the Banff community. In the joint paper of our working group describing the Delphi process, readers will find why they should choose one method over the other [10].
It took 5 years to complete this study and arrive at 24 criteria and 8 differential diagnoses. The time may seem long; however, compared with allograft rejection, which was introduced in Banff in 1991 and took the Banff community 20 years to reach consensus on final diagnostic criteria, this appears a speedy process. An example is the glomerulitis lesion (g lesion), which was introduced in Banff in 1993 [30]. Although the criteria were introduced at that time, their definition and application evolved continuously throughout the years; discussions continued for years regarding the threshold for the number of glomerular leukocytes, the degree of endothelial cell enlargement/capillary luminal occlusion, or even the exact application of the g score [31,32]. The consensus for these lesions took 18 years and 9 Banff conferences held in multiple locations, including Banff/Canada, Aberdeen/Scotland, La Coruna/Spain, Edmonton/Canada, and Paris/France, to arrive at final diagnostic criteria for glomerulitis. In comparison, our Delphi study started with 338 suggestions and involved 23 panelists (all nephropathologists) and 4 nephropathologists who conducted the study. The study was completed in 5 years (despite the pandemic turmoil), with a significantly smaller budget. The low cost of the Delphi method is not specific to this study and is a known advantage of Delphi.

Panelists' Performance
Panelists' performance from a statistical point of view is briefly discussed in paper 1 [10]. In the current paper, the authors would like to put an emphasis on the impact that the complexity of Tx-TMA cases has on the pathologists' performance.
The light, immunofluorescence, and electron microscopy criteria listed in Table 1 are the results of nine rounds of survey. The listed criteria do not represent any new lesions, and every pathologist dealing with Tx-TMA uses some of them in his/her practice. This list is basically a guideline on the most important lesions that need to be considered when dealing with Tx-TMA. Some aspects of Tx-TMA will also need to be tested by additional studies with prediction analysis. For example, the distinction between chronic and acute lesions of Tx-TMA seems to be important, as they are manifested by different microscopic lesions. The presence of acute TMA lesions generally means the patient has an ongoing treatable condition, while chronic TMA lesions generally mean the patient has potentially irreversible damage in the renal allograft. The usefulness of distinguishing chronic from acute TMA could therefore be the subject of such prediction analysis.
At this point the authors draw the reader's attention to an important point: the "subjects" in this Delphi study are neither the criteria nor the real-life cases that were validated. The "subjects" are "the panelists." Therefore, statistics usually expected from an NIH-type study, such as adequacy of the sample size or number of validated cases, and reporting of p-values and ICCs related to criteria, should not be expected from this Delphi study. Only %A and %AL, which reflect the subjects' or panelists' performance, can be reported. This is one of the main differences between Delphi and NIH-type consensus methods. Delphi evaluates performance at different agreement levels, not the criteria or the cases. Therefore, the final results will not be presented with p-values or ICCs but as total, excellent, good, fair, or poor agreement levels.

Supporting Clinical and Laboratory Criteria
For the pathological diagnosis of Tx-TMA, the clinical situations such as arterial hypertension, acute renal or multi-system organ failure were deemed unnecessary, as well as laboratory items such as donor specific antibodies (DSA), positive crossmatch, low complement levels or high serum levels of CNIs, since the panelists believed none of these criteria can stand alone.
Despite the fact that clinical and laboratory information is essential for renal biopsy interpretation, consensus was reached on only a few criteria. Early on during the Delphi process, our renal transplant pathology expert panelists suggested and listed both therapeutic agents (for example, Tacrolimus or mTOR inhibitors) and complement-related disorders as items that could be considered in the final list of diagnostic criteria. However, as the list was narrowed down to reflect minimum diagnostic criteria, these items were eliminated by consensus. Additionally, the majority of the 37 cases shared by the panelists and validated did not have any initial information about complement factors, as happens in real-life situations early in the course of diagnosing a case of Tx-TMA. Therefore, these items are not listed in this phase of the study. Importantly, this information is not lost and is being entertained in Phase II (as mentioned above) by the nephrologists. This is consistent with the difficulty that nephrologists and nephropathologists have in diagnosing Tx-TMA. Even though these criteria were agreed on in the pathology phase (Phase I), they will need to be approved by the nephrologists in Phase II. They are, therefore, not final.

Emergence of Areas of Controversy
After reviewing the panelists' responses on the 37 cases, the most common confounding factor for pathology diagnosis of Tx-TMA emerged: ABMR. It became a source of considerable intellectual conflict every time a case that had a clinical, laboratory (C4d or DSA results), or morphological hint of ABMR was encountered by the panelists. To explain the magnitude of the problem: one of the most challenging questions for our panelists was whether ABMR is in the differential diagnosis list of Tx-TMA or is causing Tx-TMA. Therefore, ABMR and its attributes were mentioned both as negative criteria when the panelists were trying to rule out Tx-TMA, and at the same time as criteria for diagnosis of Tx-TMA. The authors believe this area of conflict needs to be addressed by the Banff community, requiring further research and debate, and is out of the scope of this paper.

Strengths and Weaknesses of the Study
Comparisons between the Delphi method and other consensus generation tools, including the NIH-type method, have been discussed in detail in the literature [8]. For our study, the reasons why we chose the Delphi methodology, which we consider a strength, were multiple: its anonymous aspect; its capacity to generate consensus among many participants, on numerous items, and in a short period of time; and its considerable advantage in cost-effectiveness. The Delphi methodology has recently been used in surgical pathology [33,34]; however, this is the first time that the method has been used in the Banff classification group. Leading to rapid and inexpensive consensus, this process could represent a precedent in consensus generation within the Banff community. One of the advantages of Delphi is the flexibility that the facilitator has in designing the rounds. Our study went beyond a general survey of opinions related to Tx-TMA and included histological evaluation of real-life cases within consensus generation to define diagnostic lesions. Online surveys allowed us to respect our initial wish for anonymous responses.
The lack of accepted criteria that would play the role of gold standard in the diagnosis of the 37 cases was not only one of the main hurdles of this study, but also the main motivation behind initiating this work. During the two validation Rs, to circumvent this obstacle, it was decided to adhere to the original diagnosis provided by the panelist/expert who had submitted the cases.
Perhaps a further caveat of the study is the lack of correlation with treatment and outcome.
Despite the above-mentioned weaknesses, this study represents a significant step forward in tackling the pathology issues associated with Tx-TMA. A second Delphi study, with the collaboration of over 30 nephrologists, is currently ongoing.

CONCLUSION
The current work is a starting point in the process of diagnosing renal Tx-TMA. The TMA BWG looked at Tx-TMA from many different perspectives, including its patterns of appearance (systemic versus localized), temporal occurrence (acute versus chronic), the difficulties pathologists face in identifying some of its lesions by LM, the relationship between Tx-TMA and ABMR and other potentially confounding conditions, and finally, the multitude of its mimickers (differential diagnoses). The authors generated consensus on 24 criteria, providing a list of differential diagnoses and identifying areas of diagnostic difficulty. While this achievement undoubtedly conveys valuable recommendations for nephropathologists involved in the management of patients with Tx-TMA, its satisfactory implementation will require attentive validation and refinement, starting with consensus generation among nephrologists, who will fortify the clinical and laboratory criteria. Once Phase II and Phase III are completed, this study may serve as a baseline for diagnosing Tx-TMA, and Delphi may be considered a useful methodology facilitating the process of consensus generation within the transplantation community.

FIGURE 1 | Criteria evolution during eight rounds of Delphi. The X-axis represents each R and the Y-axis the cumulative number of data entries. R1 started with 338 criteria (arrow), which were narrowed down to 66 in R5 and remained 66 in R6. The criteria were further narrowed down to 35 (comprising eight differential diagnoses) in R7. R6 and R7 were two validation rounds and R9 was the control round, also called the definition round. Abbreviations: Clin-, clinical data negative; Clin+, clinical data positive; #D, differential diagnosis; EM-, electron microscopy negative; EM+, electron microscopy positive; IF-, immunofluorescence microscopy negative; IF+, immunofluorescence microscopy positive; Lab-, laboratory negative; Lab+, laboratory positive; LM-, light microscopy negative; LM+, light microscopy positive.

TABLE 1 | Pathological criteria classified in 6 categories and panelists' percentage of agreement (%A) for each criterion.

TABLE 2 | Clinical and laboratory criteria and panelists' percentage of agreement (%A) for each criterion.

TABLE 4 | Definitions for selected light and electron microscopy lesions.