Comparison of Clinical Prediction Rules for Ruling Out Cirrhosis in Nonalcoholic Fatty Liver Disease (NAFLD)

Danielle Brandman; Marie Boyle; Stuart McPherson; Mark L. Van Natta; Arun J. Sanyal; Kris Kowdley; Brent Neuschwander-Tetri; Naga Chalasani; Manal F. Abdelmalek; Norah A. Terrault; Art McCullough; Ricki Bettencourt; Cyrielle Caussy; David E. Kleiner; Cynthia Behling; James Tonascia; Quentin M. Anstee; Rohit Loomba


Aliment Pharmacol Ther. 2022;55(11):1441-1451. 


US Cohort (NASH CRN Cohort)

Cohort Characteristics. A total of 1483 patients met inclusion and exclusion criteria and formed the US cohort. Five hundred sixty-seven patients enrolled in the NASH Database Study from July 2004 to February 2008 and 916 enrolled in the NASH Database 2 from July 2009 to November 2015. Cohort participants had a mean age of 50 years and were predominantly female (64%) and white (83%) (Table 1). Diabetes and hypertension were present in 37% and 57% of the cohort, respectively. Mean AST and ALT were 50 U/L (SD 35) and 68 U/L (SD 51), respectively. Most patients (65%) had a liver biopsy length of 15 mm or longer. Ten per cent (N = 147) of patients had cirrhosis on liver biopsy. Compared to patients without cirrhosis, those with cirrhosis were older (55 vs 49 years; p < 0.0001), more commonly white (93% vs 82%; p < 0.0008), had a higher prevalence of diabetes and hypertension (66% vs 34%, p < 0.0001 and 67% vs 56%, p = 0.008, respectively), and had a higher BMI (36 vs 34 kg/m2, p = 0.004). AST, AST/ALT ratio and INR were significantly higher and platelet count was significantly lower (all p < 0.01) in patients with vs. without cirrhosis, whereas ALT was lower. Patients with cirrhosis had more ballooning, less steatosis and a higher proportion with definite NASH than patients without cirrhosis (all p < 0.0001).

Clinical Prediction Rule Performance. The performance characteristics of the seven clinical prediction rules in identifying cirrhosis are presented in Table 2. Using cutpoints derived from the Youden index, sensitivity ranged from 64% to 82%, with FIB-4 and Bonacini CDS having the highest sensitivity at 80% and 82%, respectively. The NAFLD fibrosis score had the highest specificity (86%). All rules had low PPVs (≤35%), while NPVs were high (≥95%). The overall diagnostic accuracy to detect cirrhosis, as measured by AUROC, was highest for FIB-4 (0.86), the Lok index (0.86) and the NAFLD fibrosis score (0.84), as displayed in Figure 1. Performance of FIB-4 was significantly better than that of APRI, AST:ALT ratio, BARD and Bonacini (all p < 0.001). Neither the prediction cutpoints nor the AUROCs and their associated p-values changed substantially when each rule was assessed in patients with biopsy length ≥15 mm or ≥25 mm (Tables S3a and S3b).
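To illustrate how a Youden index cutpoint is obtained, the sketch below scans every observed score as a candidate threshold and keeps the one that maximises J = sensitivity + specificity − 1. It is a minimal illustration rather than the study's analysis code: the function name `youden_cutpoint` and the toy scores and labels are hypothetical, and a score at or above the cutpoint is assumed to count as a positive test.

```python
def youden_cutpoint(scores, has_cirrhosis):
    """Return (cutpoint, sensitivity, specificity) maximising Youden's J.

    Assumption: a score >= cutpoint is treated as a positive test.
    If several cutpoints tie, the lowest one is kept.
    """
    positives = sum(has_cirrhosis)
    negatives = len(has_cirrhosis) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need patients both with and without cirrhosis")

    best = None
    for candidate in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, has_cirrhosis) if y and s >= candidate)
        true_neg = sum(1 for s, y in zip(scores, has_cirrhosis) if not y and s < candidate)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1  # Youden's J statistic
        if best is None or j > best[0]:
            best = (j, candidate, sensitivity, specificity)

    _, cutpoint, sensitivity, specificity = best
    return cutpoint, sensitivity, specificity


# Toy, made-up values for illustration only (not study data).
toy_scores = [0.8, 1.1, 1.3, 1.5, 1.9, 2.4, 2.8, 3.5]
toy_labels = [False, False, False, True, True, False, True, True]

cut, sens, spec = youden_cutpoint(toy_scores, toy_labels)
print(f"cutpoint={cut}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because J weights sensitivity and specificity equally, the selected cutpoint treats false negatives and false positives as equally costly.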

Figure 1. Performance of seven clinical prediction rules for diagnosis of cirrhosis in the US cohort

Recognising that the use of the Youden index cutpoints may equally misclassify patients having or not having cirrhosis, we further analysed each clinical prediction rule to determine the optimal cutoff according to 90% sensitivity ("rule out" cutoff) and 90% specificity ("rule in" cutoff). Cutoffs derived from FIB-4 and NAFLD fibrosis score to identify cirrhosis were 1.28 and −1.59, respectively, for 90% sensitivity, and 2.35 and 0.58, respectively, for 90% specificity (Table S4). Using these cutoffs, performance was modelled according to different disease prevalence (1%, 10% and 25%). At 1% cirrhosis prevalence and 90% sensitivity, the NPV was >99% for all rules. When the cirrhosis prevalence was increased to 25%, the NPV remained high (≥93% for FIB-4 and NAFLD fibrosis score). The PPV when cirrhosis prevalence was 1% and specificity 90% was very low (≤7%) for all rules. When cirrhosis prevalence increased to 25%, the PPV increased to 63% and 68%, respectively (Table S5). Using the Youden index-derived FIB-4 cutoff of 1.67, the false-positive rate was lowest and the false-negative rate was highest when diagnosing cirrhosis vs. diagnosing other stages of fibrosis (Table S6).
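The dependence of PPV and NPV on prevalence follows from Bayes' theorem and can be sketched as below. This is an illustration under stated assumptions, not the modelling code used in the study: the function name `ppv_npv` is hypothetical, and the specificity paired with the 90% sensitivity cutoff (`ASSUMED_SPECIFICITY`) is a placeholder, since the actual paired values are reported in Table S4 and are not reproduced here.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All arguments are proportions between 0 and 1. Assumes the test yields
    at least one positive and one negative result (non-zero denominators).
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Placeholder specificity at the 90% sensitivity ("rule out") cutoff.
# This value is an assumption for illustration, not a study result.
ASSUMED_SPECIFICITY = 0.50

for prevalence in (0.01, 0.10, 0.25):
    ppv, npv = ppv_npv(0.90, ASSUMED_SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With sensitivity fixed at 90%, the NPV stays high at low prevalence almost regardless of specificity, because few of the patients testing negative actually have the disease. The PPV at low prevalence is dominated by false positives among the much larger group without cirrhosis.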

UK Cohort (Newcastle Cohort)

Cohort Characteristics. In the 494 patients included in the UK cohort, the mean age was 53 years, 57% were male and all were white. Diabetes was present in 57% of the cohort. Mean AST and ALT were 49 U/L (SD 27) and 71 U/L (SD 45), respectively. Cirrhosis on biopsy was present in 59 (12%) of the patients in the UK cohort (Table 3). Similar to the US cohort, patients with cirrhosis were older (59 vs 53 years; p < 0.001), more frequently diabetic (84% vs 53%; p < 0.001), and had higher BMI (37 vs 35 kg/m2; p = 0.002). Mean AST, AST/ALT ratio, INR and platelet count were significantly higher (all p < 0.005) in patients with cirrhosis than in those without cirrhosis, whereas mean ALT was significantly lower (p = 0.03). Patients with cirrhosis had less ballooning, more steatosis and more lobular inflammation than patients without cirrhosis (all p < 0.05) but similar rates of definite NASH (41% with cirrhosis vs. 38% without cirrhosis).

Clinical Prediction Rule Performance. The clinical prediction rules were evaluated in the entire UK cohort, except for the Bonacini score and Lok index (n = 424, because 70 patients did not have INR available). Performance of the seven clinical prediction rules to diagnose cirrhosis in the UK cohort was similar to the performance in the US cohort (Table 4). Using the cutpoints derived from the Youden index analysis, APRI had the highest sensitivity to rule out cirrhosis (85%); FIB-4 still performed well, with a sensitivity of 78%. Specificity was highest for the Lok index (95%) and was 89% for the NAFLD fibrosis score. PPV was low in this cohort, though it had a wider range than in the US cohort (19%–52%). NPV was high for all prediction rules, though slightly lower than in the US cohort (≥92%). The overall diagnostic accuracy for detecting cirrhosis, as measured by AUROC, was numerically the highest for the NAFLD fibrosis score (0.89) and FIB-4 (0.87).