Combined Analysis of Gut Microbiota, Diet and PNPLA3 Polymorphism in Biopsy-proven Non-alcoholic Fatty Liver Disease

Sonja Lang; Anna Martin; Xinlian Zhang; Fedja Farowski; Hilmar Wisplinghoff; Maria J.G.T. Vehreschild; Marcin Krawczyk; Angela Nowag; Anne Kretzschmar; Claus Scholz; Philipp Kasper; Christoph Roderburg; Raphael Mohr; Frank Lammert; Frank Tacke; Bernd Schnabl; Tobias Goeser; Hans-Michael Steffen; Münevver Demir

Liver International. 2021;41(7):1576-1591. 

Results

A total of 180 NAFLD patients were enrolled, of whom 131 returned fecal samples and 107 returned the dietary record. PNPLA3 genotypes could be determined in 146 patients, and liver histology data were available for 98 patients (Figure 1A). The final study cohort comprised 57 patients with NAFLD for whom all data were available (16S gene sequencing results, dietary record, PNPLA3 genotypes and liver histology; Figure 1). Their median age was 52 years, and 44% were female. The median BMI was 30.0 kg/m², 21% had type 2 diabetes, 40% carried the heterozygous PNPLA3 genotype, and 18% were homozygous PNPLA3 risk allele carriers (Table 1).

Multiple Ordinal Regression Analyses Reveal Associations With Liver Histology Features

First, we performed principal component (PC) analyses to reduce the dimensionality of the dietary data and the 16S gene sequencing data. The first dietary PC (PC1) was predominantly represented by several amino acids, sulphur, niacin, phosphorus, uric acid and purine. PC2 was mainly represented by fat components, sugar and carbohydrates; PC3 was represented by fibre and several vitamins (Figure 2). These three individual PCs were included in the multiple regression models. For the 16S gene sequencing data, the first six PCs, which together accounted for more than 90% of the variance in the data, were all predominantly represented by the bacterial genera Faecalibacterium, Bacteroides, Blautia, Prevotella, Bifidobacterium, Roseburia, Ruminococcus, Eubacterium and Streptococcus (Figure S1). To improve interpretability of the regression models, we included these specific bacterial taxa individually in the models, together with the Shannon diversity index, which was calculated including all detected bacterial taxa. We further included the PNPLA3 variant, age, BMI, gender, type 2 diabetes, dyslipidemia and arterial hypertension as risk factors for NAFLD progression, and the EI:BMR ratio as well as proton pump inhibitor, metformin and statin use as potential confounding factors.
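As a minimal sketch of two quantities mentioned above, the Python snippet below computes a Shannon diversity index from taxon counts and finds how many leading PCs are needed to exceed a given share of the total variance. It is illustrative only: the function names, the use of the natural logarithm and the example numbers are assumptions and do not come from the study's analysis pipeline or data.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[float]) -> float:
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts.

    Assumption: natural logarithm; the source does not state the log base.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def n_components_exceeding(variances: Sequence[float], share: float = 0.90) -> int:
    """Smallest number of leading PCs whose cumulative variance share exceeds `share`."""
    total = sum(variances)
    if total <= 0:
        raise ValueError("variances must sum to a positive value")
    cumulative = 0.0
    # Components are ranked from largest to smallest variance before accumulating.
    for k, v in enumerate(sorted(variances, reverse=True), start=1):
        cumulative += v
        if cumulative / total > share:
            return k
    return len(variances)


# Made-up illustration values, not study data.
example_counts = [25, 25, 25, 25]        # four equally abundant taxa
example_variances = [60, 25, 10, 3, 2]   # per-PC variances in arbitrary units

print(shannon_index(example_counts))
print(n_components_exceeding(example_variances))
```

For equally abundant taxa the index equals the logarithm of the number of taxa, which is a quick sanity check on any implementation.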

We first visualized associations between the explanatory variables entered in the simple and multiple regression models to obtain an overview of collinearity. As expected, we observed positive associations between clinical variables such as type 2 diabetes, arterial hypertension and the body mass index (BMI). We also observed several correlations between bacteria. A higher BMI was further associated with a lower EI:BMR ratio, indicating greater energy misreporting with increasing BMI, as well as with a lower intake of fibre and fibre components, the vitamins E, A, C and K, copper, short chain fatty acids and poly-unsaturated fatty acids (Figure S2). Because collinearity of the explanatory variables was expected, we chose a forward selection process: when two variables are correlated with each other, the one with the higher importance for the respective outcome is included in the final multiple regression model.
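The forward selection logic described above can be sketched as follows. This is a generic illustration, not the authors' code: `forward_select`, `aic` and the toy AIC table are hypothetical, and in a real analysis the scoring callable would refit an ordinal regression model for each candidate variable set. The toy values are made up so that `x1` and `x2` behave like two collinear variables.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


def forward_select(
    candidates: Sequence[str],
    aic_of: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedy forward selection: add the variable that lowers AIC most, stop when none does."""
    selected: List[str] = []
    remaining = list(candidates)
    best = aic_of(frozenset())  # AIC of the model without any explanatory variable
    while remaining:
        # Score every model that extends the current one by a single variable.
        step_aic, step_var = min(
            (aic_of(frozenset(selected + [c])), c) for c in remaining
        )
        if step_aic >= best:
            break  # no candidate improves the model any further
        selected.append(step_var)
        remaining.remove(step_var)
        best = step_aic
    return selected, best


# Hypothetical AIC values for every variable subset (not study results).
TOY_AIC = {
    frozenset(): 120.0,
    frozenset({"x1"}): 100.0,
    frozenset({"x2"}): 103.0,
    frozenset({"x3"}): 110.0,
    frozenset({"x1", "x2"}): 101.5,
    frozenset({"x1", "x3"}): 95.0,
    frozenset({"x2", "x3"}): 99.0,
    frozenset({"x1", "x2", "x3"}): 96.8,
}

selected, final_aic = forward_select(["x1", "x2", "x3"], TOY_AIC.__getitem__)
print(selected, final_aic)
```

With the toy table, the procedure picks `x1`, then `x3`, and stops because adding `x2` would raise the AIC from 95.0 to 96.8. This mirrors the collinearity handling in the text: once the more informative of two correlated variables is in the model, the other no longer improves it.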

Hepatic Steatosis

We first looked at histological steatosis, measured in grades G0-G3 (Table 1). In the simple regression analysis, the PNPLA3 risk allele variant, dietary factors and clinical features had a higher importance, as indicated by lower Akaike information criterion (AIC) values, than features of the gut bacterial microbiota (Figure 3A, Table S2). In the multiple regression analysis, the presence of PNPLA3 p.148IM or PNPLA3 p.148MM was significantly associated with higher histological grades of hepatic steatosis (P = .007) and had the highest importance in relation to other variables, followed by a significant negative association with the dietary PC3 (P = .001) (Figure 3A, Table 2). Some major components of PC3 were fibre and fibre components, the vitamins E, A, C and K, copper, short chain fatty acids and poly-unsaturated fatty acids (Figure 2). A lower intake of these compounds and a higher intake of sodium were significantly associated with higher degrees of steatosis on liver histology. Similarly, PC1, which was composed of several amino acids, sulphur, niacin, phosphorus, uric acid and purine (Figure 2), was significantly (P = .027) positively associated with higher degrees of steatosis (Figure 3A, Table 2).

Figure 3.

Simple and stepwise multiple ordinal regression analyses using liver histology features as outcome parameter. The Akaike information criterion (AIC) is a measure of regression model performance and can be used to compare the importance of individual features, where a lower AIC indicates a higher importance. In the multiple forward stepwise approach, variables are selected and added to the model step by step, based on the ability of each variable to improve the model performance. The addition of variables was stopped as soon as the model performance could not be improved further by adding other variables. In the forward selection process, when two variables are correlated with each other (collinearity), the variable with the higher importance for the respective outcome is included in the final multiple regression model. Simple regression analyses correspond to Table S2. The final multiple regression models can be found in Table 2. The corresponding loadings of the dietary principal components (PCs) can be found in Figure 2. BMI, body mass index; EI:BMR ratio, ratio of energy intake (EI) to basal metabolic rate (BMR); PPI, proton pump inhibitor; PNPLA3, patatin-like phospholipase domain-containing protein 3

Liver Inflammation

The presence of type 2 diabetes, the related metformin use and lower relative abundances of Prevotella were the most important factors associated with higher degrees of hepatic inflammation in simple regression analyses (Figure 3B, Table S2). Low Prevotella abundance (P = .009) and an increased BMI (P = .041) remained significant in the final multiple regression model (Figure 3B, Table 2).

Hepatocellular Ballooning

We next investigated hepatocellular ballooning. In simple regression analyses, besides the relative Streptococcus abundance, several clinical factors (type 2 diabetes, arterial hypertension, age and dyslipidemia) were significantly positively associated with higher degrees of ballooning (Figure 3C). The most important features in the simple regression models that also remained significant in the final model were a higher relative Streptococcus abundance (P = .021), older age (P = .003), higher relative abundances of Gemmiger (P < .001) and lower relative abundances of Faecalibacterium and Blautia (Figure 3C, Table 2).

NAFLD Activity Score (NAS)

A higher NAS indicates more severe liver damage and has been associated with worse outcomes in patients with NAFLD.[25] In the simple regression analysis, type 2 diabetes, arterial hypertension and high relative abundances of Streptococcus had the strongest associations with a higher NAS (Figure 4A, Table S2). In the multiple model, however, the most important variables that remained significant in the final model were the PNPLA3 risk genotypes PNPLA3 p.148IM or PNPLA3 p.148MM (P = .016), a diet enriched in several amino acids, sulphur, niacin, phosphorus, uric acid and purine (P = .033, corresponding to PC1, Figure 2), low relative abundances of Prevotella (P = .012) and a higher BMI (P = .001) (Figure 4A, Table 2).

Figure 4.

Simple and stepwise multiple ordinal regression analyses using (A) the non-alcoholic fatty liver disease (NAFLD) activity score (NAS) (Grades 0–8) and (B) the fibrosis stages (Stages 0–4) as outcome parameter. The AIC is a measure of regression model performance and can be used to compare the importance of individual features, where a lower AIC indicates a higher importance. In the multiple forward stepwise approach, variables are selected and added to the model step by step, based on the ability of each variable to improve the model performance. The addition of variables was stopped as soon as the model performance could not be improved further by adding other variables. In the forward selection process, when two variables are correlated with each other (collinearity), the variable with the higher importance for the respective outcome is included in the final multiple regression model. Simple regression analyses correspond to Table S2. The final multiple regression models can be found in Table 2. The corresponding loadings of the dietary principal components (PCs) can be found in Figure 2. BMI, body mass index; EI:BMR ratio, ratio of energy intake (EI) to basal metabolic rate (BMR); PPI, proton pump inhibitor; PNPLA3, patatin-like phospholipase domain-containing protein 3

Liver Fibrosis

Among all histological parameters, the fibrosis stage is the strongest predictor of future liver-related complications in NAFLD patients.[26] In the simple regression analysis, clinical factors (BMI, P = .001; type 2 diabetes, P = .005; arterial hypertension, P = .005; female gender, P = .046) were overall the most important features associated with higher fibrosis stages (Figure 4B, Table S2). In the multiple stepwise regression analysis, a higher BMI remained the most important and significant feature associated with higher degrees of fibrosis (P < .001), followed by low relative abundances of Bacteroides (P = .002) and Faecalibacterium (P = .002) (Figure 4B, Table 2).

To visualize some key findings of our stepwise models, we plotted the predicted probabilities generated by the final multiple stepwise regression for the individual stages of the NAS and fibrosis (Figure 5). These data demonstrate that the probability of having a higher NAS was low in patients with high abundances of Prevotella, whereas it was high among patients with the PNPLA3 risk genotypes PNPLA3 p.148IM or PNPLA3 p.148MM and with increasing BMI (Figure 5A). Patients with NAFLD and obesity, those with low abundances of Bacteroides or Faecalibacterium, and male patients were more likely to already have significant or advanced liver fibrosis (Figure 5B).
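To make the idea of predicted probabilities concrete, here is a minimal sketch assuming a proportional-odds (cumulative logit) ordinal model, which the source does not confirm as the exact model form. The function name, the thresholds and the linear-predictor values are invented for illustration and are not estimates from Table 2.

```python
import math
from typing import List, Sequence


def category_probabilities(eta: float, thresholds: Sequence[float]) -> List[float]:
    """Per-category probabilities under P(Y <= j) = logistic(theta_j - eta).

    `eta` is the linear predictor (sum of coefficient * covariate terms);
    `thresholds` are the increasing cut-points theta_j between ordered categories.
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    # Cumulative probabilities for each cut-point; the last category closes at 1.
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


# Hypothetical cut-points separating five ordered stages (0 to 4).
toy_thresholds = [-1.0, 0.5, 1.5, 3.0]

# Two hypothetical patients: a low and a high linear predictor.
low_risk = category_probabilities(eta=-0.5, thresholds=toy_thresholds)
high_risk = category_probabilities(eta=2.0, thresholds=toy_thresholds)

for stage, (p_low, p_high) in enumerate(zip(low_risk, high_risk)):
    print(f"stage {stage}: low-risk {p_low:.3f}, high-risk {p_high:.3f}")
```

Under this assumed model form, a larger linear predictor shifts probability mass from the lowest to the highest stages, which is the pattern a plot of predicted probabilities against a covariate makes visible.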

Figure 5.

Predicted probabilities of the multiple regression models corresponding to Table 2 for individual stages of (A) the non-alcoholic fatty liver disease (NAFLD) activity score (NAS), depending on the expression of specific clinical, genetic, dietary or gut microbial features. The NAS is composed of the histological features steatosis, inflammation and ballooning and ranges from 0 to 8. No patient had a NAS of 8. (B) Predicted probabilities for individual stages of fibrosis, depending on the body mass index, gender, and the relative Bacteroides and Faecalibacterium abundance
