Using Information Available at the Time of Donor Offer to Predict Kidney Transplant Survival Outcomes: A Systematic Review of Prediction Models

Riley, Stephanie; Zhang, Qing; Tse, Wai-Yee; Connor, Andrew; Wei, Yinghui

doi:10.3389/ti.2022.10397

SYSTEMATIC REVIEW AND META-ANALYSIS

Transpl. Int., 23 June 2022

Volume 35 - 2022 | https://doi.org/10.3389/ti.2022.10397

Stephanie Riley ¹
Qing Zhang ¹
Wai-Yee Tse ²
Andrew Connor ²
Yinghui Wei ¹*

1. Centre for Mathematical Sciences, School of Engineering, Computing and Mathematics, University of Plymouth, Plymouth, United Kingdom
2. Department of Renal Medicine, South West Transplant Centre, University Hospitals Plymouth NHS Trust, Plymouth, United Kingdom

Abstract

Statistical models that can predict graft and patient survival outcomes following kidney transplantation could be of great clinical utility. We sought to appraise existing clinical prediction models for kidney transplant survival outcomes that could guide kidney donor acceptance decision-making. We searched for clinical prediction models for survival outcomes in adult recipients with single kidney-only transplants. Models that require information anticipated to become available only after the time of transplantation were excluded as, by that time, the kidney donor acceptance decision would have already been made. The outcomes of interest were all-cause and death-censored graft failure, and death. We summarised the methodological characteristics of the prediction models, predictive performance and risk of bias. We retrieved 4,026 citations from which 23 articles describing 74 models met the inclusion criteria. Discrimination was moderate for all-cause graft failure (C-statistic: 0.570–0.652; Harrell’s C: 0.580–0.660; AUC: 0.530–0.742), death-censored graft failure (C-statistic: 0.540–0.660; Harrell’s C: 0.590–0.700; AUC: 0.450–0.810) and death (C-statistic: 0.637–0.770; Harrell’s C: 0.570–0.735). Calibration was seldom reported. Risk of bias was high in 49 of the 74 models, primarily due to methods for handling missing data. The currently available prediction models using pre-transplantation information show moderate discrimination and varied calibration. Further model development is needed to improve predictions for the purpose of clinical decision-making.

Systematic Review Registration: https://osf.io/c3ehp/l.

Introduction

End-stage kidney disease (ESKD) is the most advanced stage of chronic kidney disease. Kidney transplantation is the optimal treatment for many patients with ESKD. In the UK, approximately 3,000 kidney transplants are performed every year, but the number of patients waiting for a transplant is around 5,000. The success, in terms of efficacy and longevity, of an individual transplant will be influenced by a host of factors, some of which can be determined prior to transplantation. A balance must be struck to ensure maximal organ utilisation without compromising transplant outcomes. This is further complicated by the fact that “one size does not fit all”: the definition of a successful transplant will vary depending on the recipient and their clinical scenario. As such, every potential kidney transplant must be carefully considered in the context of the donor and recipient details.

In the UK, donor organs are offered through a national donation system, which uses an algorithm to balance patient priority against the intent to match immunological and additional parameters. The donor offers are reviewed by clinicians acting on behalf of the recipient, and a prompt decision must be made to accept or reject each offer. Whether or not to accept a transplant offer remains a challenging clinical decision. Clinical prediction models that use information commonly available to the clinician at the time of the donor kidney offer may help to inform the decision-making process.

The anticipated longevity of a kidney transplant is, of course, an important consideration for a clinician faced with the kidney donor acceptance decision. However, given that donor kidneys are a scarce resource and potential recipients must therefore sit on waiting lists, it is often appropriate to balance the anticipated longevity against the alternative of remaining on dialysis. As such, models that can predict graft survival outcomes would be of great clinical utility.

Prediction models have previously been developed for kidney transplant survival outcomes with the aim of advising clinicians at the time of the offer of a donor kidney. The number of articles related to clinical prediction models for kidney transplant survival outcomes is increasing year on year, suggesting growing recognition of their clinical interest. The Kidney Donor Risk Index (KDRI), Estimated Post-Transplant Survival score (EPTS), Maryland Aggregate Pathology Index (MAPI) and Living Kidney Donor Profile Index (LKDPI) are commonly reported risk indices. The KDRI and EPTS are part of the kidney allocation system in the US.

The aforementioned risk indices were developed in the US population. A similar index has been produced in the UK (UK KDRI), though it is not widely used in practice. In the UK kidney allocation system, NHS Blood and Transplant (NHSBT) uses its own risk indices for donors and recipients. This is to help ensure that the pool of donor kidneys is utilised to best effect. Through this system, for example, younger recipients will typically receive offers of kidneys from younger donors (in order to optimise the chances of transplant longevity) whilst a greater tolerance of less favourable immunological matches will be accepted for older recipients (in order to maximise offers for a cohort in whom immunological matching is of slightly less importance).

We identified two systematic reviews exploring existing prediction models for kidney transplantation. Kaboré et al. reviewed prediction models for graft outcomes published between 2005 and 2015, while Senanayake et al. reviewed machine learning methods to predict graft failure, delayed graft function (DGF) and acute graft rejection. Since only machine learning models were eligible, their review excluded articles that used the Cox model, which is the model most commonly used for time-to-event analyses.

Both reviews allowed the inclusion of predictors that only become available after transplantation, such as whether patients experienced DGF. To our knowledge this is the first review to focus only on models that could aid clinical decision-making at the time of the donor offer.

In this systematic review we aim to identify, appraise and summarise existing clinical prediction models for kidney transplant survival outcomes. Only prediction models that use information available at the time of the single kidney-only offer were included, allowing us to focus on models with the most clinical utility.

Methods

We prospectively developed a protocol, which is publicly available from OSF. The findings of this review are reported in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement.

Eligibility Criteria

We included studies with adult recipients (aged 18 years or older) of single, kidney-only transplants. No restrictions were placed on donor type.

No limit was set on publication date. Only full texts published in English were eligible. Conference abstracts without full text were excluded from review.

The outcomes of interest were time to graft failure and time to death, at any time point following kidney transplantation; a study could report one or both. Models that did not account for time-to-event information were excluded.

We considered prediction models that make use of information available at the time of a donor kidney offer to inform the acceptance decision. Prediction models developed using predictors that only become available after transplantation were not included, as the decision would have been made by that time.

We included studies that developed and validated models for the outcomes of interest, and validation-only studies that validated existing models developed in independent cohorts. Any measure of predictive performance, such as calibration or discrimination, that was reported alongside a model was considered a form of validation. A validation-only model is one where the included study validates an existing model rather than developing a new one.

Information Sources and Search Strategy

The electronic databases Embase, MEDLINE and Web of Science were searched from their respective dates of inception up to April 8th, 2021. The search strategy is presented in Supplementary Table S1.

All citations from the search results were exported to EndNote, where duplicates were automatically removed. Titles and abstracts of all records were independently screened against the above eligibility criteria by two reviewers (SR and QZ) and managed through Rayyan. A third reviewer (YW) also independently screened 10% of the titles and abstracts. Two reviewers (SR and QZ) then independently reviewed full-text reports to assess eligibility. Any discrepancies were resolved through discussion.

Data Extraction

Data were extracted from eligible articles according to the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist. The full list of data extracted is given in Supplementary Table S2. Data were extracted independently by two reviewers (SR and YW) and any discrepancies were resolved through discussion.

Risk of Bias

We assessed the risk of bias (RoB) in individual models using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Two reviewers (SR and YW) independently determined the RoB of each model and any disagreements were resolved by discussion.

Outcomes

All-Cause Graft Failure

All-cause graft failure, as a composite outcome, is defined as the earliest time to graft failure or death.

Death-Censored Graft Failure

Death-censored graft failure considers the time until graft failure, but patients are censored at the time of death. Graft failure and death are semi-competing events. Semi-competing events arise when a terminal event precludes a non-terminal event, but not vice versa.

Death

This outcome is the time to recipient death from any cause.
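The three outcome definitions above can be illustrated with a minimal Python sketch. The `Recipient` record, its field names and the use of years as the time unit are hypothetical choices made for this illustration; they are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Recipient:
    """Hypothetical follow-up record; all times are years since transplant."""
    graft_failure: Optional[float]  # None if no graft failure was observed
    death: Optional[float]          # None if alive at last follow-up
    last_follow_up: float           # end of observation


def all_cause_graft_failure(r: Recipient) -> Tuple[float, bool]:
    """Composite outcome: the earliest of graft failure or death."""
    observed = [t for t in (r.graft_failure, r.death) if t is not None]
    if observed:
        return (min(observed), True)
    return (r.last_follow_up, False)


def death_censored_graft_failure(r: Recipient) -> Tuple[float, bool]:
    """Graft failure is the event; death without prior failure censors."""
    if r.graft_failure is not None and (r.death is None or r.graft_failure <= r.death):
        return (r.graft_failure, True)
    censor_time = r.death if r.death is not None else r.last_follow_up
    return (censor_time, False)


def death(r: Recipient) -> Tuple[float, bool]:
    """Death from any cause; graft failure does not censor this outcome."""
    if r.death is not None:
        return (r.death, True)
    return (r.last_follow_up, False)


# A recipient who died at 4 years with a functioning graft
died_with_function = Recipient(graft_failure=None, death=4.0, last_follow_up=6.0)
print(all_cause_graft_failure(died_with_function))       # (4.0, True)
print(death_censored_graft_failure(died_with_function))  # (4.0, False)
print(death(died_with_function))                         # (4.0, True)
```

Each function returns a `(time, event_observed)` pair. The same death counts as an event for all-cause graft failure and for death, but as a censoring time for death-censored graft failure, which reflects the semi-competing structure described above.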

Analysis

Study Characteristics

We summarised the year of publication, geographical location, model type, and model being validated. We explored the discrimination measures by sample size and predictor type (donor, recipient, transplant, or combination of these). For each outcome, we summarised the type of predictors, modelling methods, and methods for handling missing data.

Measures of Model Performance

Model performance was evaluated by calibration and discrimination. Calibration assesses the agreement between observed and predicted risk and is often reported through a calibration plot. Discrimination measures a model’s ability to separate recipients who will experience the outcome event from those who will not. It is often measured using Harrell’s C statistic, area under the receiver operating characteristic curve (AUC) or time-dependent AUC, which account for the censoring of the time-to-event outcome. When a model is developed and internally validated in the same dataset, its apparent performance tends to be optimistic. This optimism can be corrected using bootstrapping, and the resulting measures are referred to as optimism-corrected. Where studies did not explicitly state that the C-statistic was adapted for censoring, we elected to report the terminology used in the original articles.
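To make the idea of a censoring-aware discrimination measure concrete, the following is a simplified Python sketch of Harrell's C. It is an illustration only, not the implementation used by any reviewed study: the function name `harrell_c` and the toy data are assumptions, pairs with tied event times are skipped, and tied predicted risks are counted as half-concordant.

```python
from typing import Sequence


def harrell_c(times: Sequence[float],
              events: Sequence[bool],
              risks: Sequence[float]) -> float:
    """Simplified Harrell's C for right-censored data.

    A pair is usable when the recipient with the shorter time had the
    event. It is concordant when that recipient also has the higher
    predicted risk. Tied risks score 0.5; tied times are skipped.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored recipient cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data: the third recipient is censored at 6 years
times = [2.0, 4.0, 6.0, 8.0]
events = [True, True, False, True]
risks = [0.9, 0.4, 0.5, 0.1]
print(harrell_c(times, events, risks))  # 0.8
```

In this toy example five pairs are usable and four are concordant, giving 0.8. A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking, which is the scale on which the ranges reported in this review should be read.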

Results

We retrieved 4,025 citations from three databases through our search and identified one record related to one of the conference abstracts we screened. After the initial screening of titles and abstracts, 75 articles were retrieved for full-text review. Of these records, 23 articles describing 74 models met the inclusion criteria (Figure 1).

FIGURE 1

Characteristics of Included Studies

Of the 74 eligible models, 28 developed and validated a clinical prediction model for our outcomes of interest. The remaining 46 models validated the performance of an existing model in an independent cohort. Articles were published between 2005 and 2020; fifteen of the twenty-three articles (65.22%) were published after 2015. Twelve articles used data of recipients from the United States, four from mainland Europe, three from Canada, two from Australia and New Zealand, and one each from the United Kingdom and Thailand. Characteristics of included studies for each model are available in Supplementary Tables S3–S6.

Of the 28 development and validation models, 27 used the Cox proportional hazards model, while one used a survival random forest. Only eight of the Cox models assessed the proportional hazards assumption.

There was considerable variability in the sample sizes used for models (Table 1; Figure 2). In general, validation-only models tended to have smaller sample sizes. Models with smaller sample sizes did not have noticeably poorer discrimination for any of the outcomes (Figure 2).

TABLE 1

Outcome and model type	Number of models	Range	Median	Mean	SD
All-cause graft failure
Development and validation	11	785–156,069	39,108	41,127	48,719
Validation only	15	416–69,994	5,042	8,641	17,141
Death-censored graft failure
Development and validation	5	259–10,086	6,662	5,586	4,811
Validation only	19	56–6,405	1,299	3,017	2,909
Patient survival
Development and validation	11	837–120,818	47,535	41,319	38,270
Validation only	11	935–5,042	4,983	3,323	2,007

Summary of sample size used in models for each outcome and model type.

Two models with other outcomes that do not fall under the above definitions had sample sizes of 20,085 and 2,734, respectively.

FIGURE 2

We considered three types of predictors: donor characteristics, recipient characteristics and transplant process. We found no clear evidence that the type of predictors was associated with better discrimination for any outcome (Supplementary Figures S1–S3). Clayton et al. validated the US and UK KDRI, while also adjusting for recipient characteristics and transplant process. The models with higher discrimination (models 9 and 12) were adjusted for other donor, recipient and transplant-related predictors. This was also observed by Molnar et al. However, this increase could simply be due to having more variables in the model.

Nine of the 28 development and validation models were available in the form of an online tool or calculator. One of the models was presented in the form of a nomogram and another as a contour plot of survival probability.

Commonly validated risk indices, as described in Table 2, included the KDRI, EPTS, UK KDRI, LKDPI, and MAPI. Other models validated included those developed by Kasiske et al., Nyberg et al. and Remuzzi et al.

TABLE 2

Model	Donor	Recipient	Transplant organ/process	Histopathology	Validation studies
EPTS	NA	Age; Diabetes status; Prior solid organ transplants; Time on dialysis	NA	NA	(22)
LKDPI	Age; eGFR; BMI; Ethnicity; History of cigarette use; Systolic blood pressure; Sex; Weight	Sex (compared to donor); Weight (relative to donor weight)	Number of HLA mismatch at HLA-B and HLA-DR; ABO compatibility	NA
MAPI	NA	NA	NA	Arteriolar hyalinosis; Glomerulosclerosis; Periglomerular fibrosis; Scar; Wall-to-lumen ratio interlobular arteries
UK KDRI	Age; Days in hospital; History of hypertension; Use of adrenaline; Weight	NA	NA	NA
US KDRI	Age; Cause of death; DCD; Diabetes status; Ethnicity; HCV status; Height; History of hypertension; Serum creatinine; Weight	NA	Cold ischaemic time; Double kidney transplant; En-bloc transplant; Number of HLA mismatch at HLA-B and HLA-DR	NA

Summary of commonly reported risk indices for predicting kidney transplant survival outcomes.

BMI, body mass index; DCD, deceased cardiac donor; eGFR, estimated glomerular filtration rate; HCV, hepatitis C virus; HLA, human leukocyte antigen.