Transcriptional response to an alternative diet on liver, muscle, and rumen of beef cattle – Scientific Reports

Phenotypic data

In this study, we analyzed 14 traits: ADG_G, ADG_F, ADG_GF, BW1, BW2, BW3, DMI_G, DMI_F, DMI_GF, ME, HCW, CY, REA and FT_REA. Summary statistics for all phenotypes are outlined in Table 2.

Table 2 Descriptive statistics from phenotypic values for traits analysed in the current study (N = 52 cattle).

On average, the ANOVA model for the analysis of the phenotypes accounted for 30% of the variance, ranging from 5.32% for DMI_G to 62.14% for BW1 (Table 3). Also presented in Table 3 is the effect of diet on each phenotype. At a nominal P value < 0.05 there were no statistically significant differences for nine of the 14 phenotypes: ADG_G, ADG_GF, BW1, BW2, BW3, DMI_G, DMI_GF, HCW and REA. However, significant differences were found for ADG_F, DMI_F, ME, CY, and FT_REA. These include an effect of the alternative diet resulting in a 12.9% higher growth at finishing, 22.1% higher dry matter intake at finishing, 10.9% lower methane emission, 1.7% higher carcass yield, and 31.5% higher subcutaneous fat thickness at the rib-eye muscle area.

Table 3 Percentage of variation (R2), least-square means, and P-value for the test of differences between diets.

RNA-seq data analysis

We obtained 7.02 billion clean reads from 153 RNA-seq samples including 52 from liver, 50 from muscle, and 51 from rumen tissue. An average of 48.6, 51.1 and 37.2 million reads were obtained from each tissue, respectively. After quality control, 90.4%, 90.8%, and 83.5% of unique reads, on average, from liver, muscle, and rumen, respectively, were mapped to the reference genome (Table 4). The full table with alignment statistics per sample and tissue is reported in Supplementary Files S1, S2 and S3 for liver, muscle, and rumen, respectively.

Table 4 Summary of RNA-sequencing throughput and mapping rates per tissue.

Differential expression analysis

Under the expression threshold of CPM ≥ 1 in at least 50% of samples (multi-tissue normalization approach, see “Methods”), 14,776 genes (53.5%) out of 27,607 reported on the Ensembl annotation file, were expressed in liver, muscle, and rumen tissues. Table 5 shows the number of DEGs that would have been captured at various P value thresholds and the corresponding FDR. As expected, the number of DEGs and FDR decreases with increasing significance (i.e. decreasing P value thresholds). At the nominal P value of < 0.01 the FDR was ~ 30% for each of the tissues. While this FDR value could be considered high, similar FDR has been reported in the context of GWAS (see for instance Table 2 in Reverter et al.40) and this threshold gave us a sizable number of genes of ~ 500 from each tissue, the relevance of which will be revealed in the forthcoming network-based analyses.

Table 5 Number of significant differentially expressed genes and FDR at decreasing P value thresholds for each tissue.

Therefore, we retained a total of 487, 484, and 499 genes for liver, muscle, and rumen, respectively, identified as DEGs (P value < 0.01) (Fig. 2). The full list of DEGs at a P value < 0.01 for each tissue is presented in Supplementary Files S4, S5 and S6 for liver, muscle, and rumen, respectively.

Figure 2
figure 2

MA-Plots for expression (left panels) and connectivity (right panels) comparing the difference (Log Fold Change, y-axis) against the mean (Mean of Log Normalized Counts, x-axis) for liver (A,D), muscle (B,E) and rumen (C,F). Highlighted in red (blue) are the differentially over-expressed (under-expressed) or over-connected (under-connected) genes. Location of relevant genes and phenotypes is also given with top-ten up or down in black font.

From the 487 DEGs in liver, 194 genes were up-regulated (log2FoldChange ranging from + 0.36 to + 1.41) and 293 genes were down-regulated (log2FoldChange ranging from − 1.10 to − 0.39). The genes with the most altered expression were ENSBTAG00000047136 (log2FoldChange =  + 1.41; q-value = 7.63 × 10–19) and MYO7B (log2FoldChange = − 1.10; q-value = 1.18 × 10–11). The ENSBTAG00000047136 gene has been previously found within a genomic region on the bovine chromosome 14, associated with carcass-related metrics in dairy and beef sires41. MYO7B was identified within imprinted lead SNPs of QTL regions for carcass conformation in cattle42.

Regarding muscle, within the 484 DEGs, 234 were up-regulated (log2FoldChange ranging from + 0.34 to + 1.29) and 250 were down-regulated (log2FoldChange ranging from − 1.96 to − 0.34). The genes with the most altered expression were SLC16A6 (log2FoldChange =  + 1.29; q-value = 1.18 × 10–18) and SPP1 (log2FoldChange = − 1.96; q-value = 5.86 × 10–41). SLC16A6 is a common DEG reported to harbour QTLs related to carcass and meat quality traits43, whereas SPP1 is a candidate gene influencing carcass traits44.

Finally for rumen, among the 499 DEGs, 397 were up-regulated (log2FoldChange ranging from + 0.48 to + 2.06) and 102 were down-regulated (log2FoldChange ranging from − 1.18 to − 0.49). The genes with the most altered expression were ENSBTAG00000048135 (log2FoldChange =  + 2.06; q-value = 1.70 × 10–23) and ALOX15B (log2FoldChange = − 1.18; q-value = 7.93 × 10–9). ENSBTAG00000048135, an unannotated gene with unknown function in cattle, is a homologue of human IGHG gene45, known to influence the innate immune function of IgG molecules and B-cells46. The ALOX15B gene has been previously identified as a down-regulated DEG within the inflammatory response function in feedlot-crossbred cattle47.

Of the genes designated as differentially expressed within this study, 11 were commonly reported as differentially expressed across the three tissues. Within the shared genes, 3 were known-genes—BOLA, PCK1, PRSS2—whereas 8 were unannotated—ENSBTAG00000005146, ENSBTAG00000037743, ENSBTAG00000040409, ENSBTAG00000048353, ENSBTAG00000051836, ENSBTAG00000051845, ENSBTAG00000052465 and ENSBTAG00000053570 (Fig. 3). The bovine leukocyte antigen (BOLA) system is the major histocompatibility complex (MHC) of cattle. The MHC genes, mapped to the bovine chromosome 23, play key roles in immune susceptibility and resistance to pathogens48. PCK1 is a key enzyme for gluconeogenesis in bovine49 and in gene PRSS2, a substitution effect of the T allele of single nucleotide polymorphism (SNP) rs41256901 in protease was significantly associated with feed conversion ratio (FCR) and residual feed intake (RFI), and suggestively associated with DMI in beef cattle50.

Figure 3
figure 3

Venn diagram of differentially expressed genes (left) and key transcription factors and cofactors (right) identified across liver, muscle, and rumen tissue samples for the contrast between TRAD and ALT diets.

Key regulators

Regulatory impact factors, RIF1 and RIF2, were used to identify candidate TFs and COFs modulating the expression of DEGs. Based on those metrics, we identified 176, 185, and 214 TFs and (or) COFs for liver, muscle, and rumen tissues (Fig. 3), respectively (P value ≤ 0.05). Shared TFs and COFs were identified between liver and muscle (n = 13), muscle and rumen (n = 23), rumen and liver (n = 21) and liver, muscle, and rumen (n = 1, cofactor BTK). BTK is a key regulator of the B-cell receptor (BCR) signaling pathway51. This pathway plays a crucial role in regulating B-cell survival, proliferation, and maturation, and also regulates several downstream signaling pathways, including the MAPK and AKT pathways52. The full list of key regulators for each tissue is presented in Supplementary Files S7, S8 and S9 for liver, muscle, and rumen, respectively.

Functional enrichment analysis

Based on the resultant list of DEGs, TFs and COFs identified across all tissues, and used as input for the construction of the gene co-expression networks, with the genes expressed in at least one tissue (14,776 genes) as a background list, a total of 464 gene ontology (GO) terms in the biological process (BP) category were significantly enriched (P-value < 0.05) (Supplementary File S10). In broad terms and across tissues, we observed enriched BPs such as immune function (FDR = 2.65 × 10–20), energy production (FDR = 1.36 × 10–3) and lipid metabolism (FDR = 2.56 × 10–3) (Fig. 4). Meanwhile the enriched tissue-specific BPs were indeed related to the tissues where they were preferentially expressed, that is regulation of primary metabolic process (FDR = 7.38 × 10–8) in liver; skeletal muscle tissue development (FDR = 5.02 × 10–3) in muscle and, regulation of inflammatory response (FDR = 3.28 × 10–3()) in rumen.

Figure 4
figure 4

Pie charts representing the three key biological processes significantly enriched based on the list of differentially expressed genes and key regulators identified across all tissues, input for construction of the gene co-expression networks.

Furthermore, we performed functional overrepresentation analysis including only shared or unique genes identified across tissues. First, based on 235 shared genes among at least two out of the three tissues, a total of 205 BP GO terms were significantly enriched (Supplementary File S11). Similarly, based on tissue-specific genes, 60, 103 and 613 BP GO terms were significantly enriched in liver, muscle, and rumen, respectively (Supplementary File S12). As expected, the enriched BPs identified using shared and unique gene lists were comprised within the enriched BPs identified when analyzing all tissues together. Additionally, we constructed a hierarchical cluster heatmap pointing out the location of genes that became relevant after the differential expression and connectivity subsequent analyses (Fig. 5).

Figure 5
figure 5

Heatmap of the hierarchical cluster analysis of differentially expressed genes, transcription factors and cofactors and the 14 phenotypes across the three tissues (liver, muscle and rumen) and two diets (Alt. = alternative diet; Trad. = traditional diet). Low and high expression are represented by green and red, respectively.

Gene co-expression networks and differential connectivity

Based on the PCIT algorithm, to infer the diet-specific networks involving all tissues we prioritized unique informative genes considering the following criteria: (1) DEG between TRAD and ALT diets; and (2) key regulators (TFs or COFs) based on RIF1 or RIF2 (Supplementary File S13). Thus, 1774 genes (DEGs and key regulators) + 14 phenotypes were used to construct the TRAD and ALT gene co-expression networks with significantly correlated pairs of nodes selected when at least one of the nodes was a phenotype. Based on that, the TRAD network contained 835 nodes and 1813 connections (implying a percentage of total possible connection or clustering coefficient of 0.52%), whereas the ALT network involved 525 nodes and 998 connections (clustering coefficient of 0.72%) (Fig. 6, Supplementary Files S14 and S15). Most TRAD network connections (N = 733, ~ 40%) belonged to muscle compared to 33% and 27% for liver (N = 595) and rumen (N = 485), while half of the ALT network connections (N = 499, 50%) belonged to liver compared to 31% and 19% for muscle (N = 309) and rumen (N = 190).

Figure 6
figure 6figure 6

Gene co-expression networks with the unique significant connections for TRAD (A) and ALT (B) diets. Upper panels correspond to the visualization of the entire networks while the lower panels correspond to sub-networks, focusing on the phenotypes with the lowest and largest average changes in connectivity within diets and tissues (ME and CY, highlighted in dotted and solid black boxes, respectively). For the visualization schema, colours represent tissue of maximum expression: liver (green), muscle (red) and rumen (blue), with phenotypes represented in light purple with black borders; shapes represent genes or phenotypes (rectangle), transcription factors (triangle) or cofactors (inverted triangle); and, the colour of the edges represent type of correlation: positive (dark grey) or negative (pink).

To explore differentially connected phenotypes across both networks, and under the assumption that gain or loss of connectivity in different biological situations would indicate changes in regulatory mechanisms, two sub-networks, one for each diet, were contrasted by focusing on the phenotypes with the largest change in average number of connections within diets and tissues. Specifically, methane emission had the lowest change in number of connections within the networks, while carcass yield had the highest change in connectivity between the two networks (Fig. 6, Table 6). In particular, among all connections, we highlight SREBF2, master regulator of sterol and fatty acid synthesis, which showed significant connections to ME and CY in the ALT network, however was only connected to CY (not to ME) and in liver (not in muscle) in the TRAD network.

Table 6 Number of connections by phenotype in each network, sorted by highest to lowest (last column).

A closer examination of the number of connections for each phenotype in the six networks (Table 6) revealed that for CY the tissue with the largest change of connectivity between the ALT and the TRAD networks was muscle. Similarly, for ME the tissue with the largest change of connectivity between the ALT and the TRAD networks was rumen. This finding is of striking relevance because CY is highly influenced by muscularity while ME finds its origin in the rumen fermentation.

We also identified which significant connections were shared across the competing diets, creating a network with the connections present in both networks (Fig. 7, Supplementary File S16). The significant effect of diet on ME was further validated by the fact that no gene connections with ME were simultaneously observed in both the TRAD and ALT networks. In other words, the gene connections to ME were diet-specific. The attributes table we used to assist in the visualization and interpretation of the gene co-expression networks is presented in full in Supplementary File S17.

Figure 7
figure 7

Gene co-expression network with the shared significant connections present in both diets. The line type represents the type of connection: solid (same signal in both networks), dash (negative in ALT and positive in TRAD) and dots (positive in ALT and negative in TRAD).

We determined which genes were differentially connected (DCGs) across diets, representing genes whose correlated expression pattern differed between both conditions. Note that all DCGs were also simultaneously DEGs, and genes with the highest change in the number of connections are likely to be key regulators53. We identified 162 DCGs between the TRAD and ALT networks (Supplementary File S18). The top 10 most DCGs were SEMA7A, BASP1, NAAA, HNMT, TSPAN32, XAF1, RASGEF1A, CDCA5, ENSBTAG00000055140 and AQP9. Interestingly, all displayed maximum expression in rumen. Supplementary File S19 displays the CA-Plots comparing the number of connections of a gene against its average expression value, considering each tissue and diet comparison.

Lastly, we focused on the 17 TFs and COFs contained among the list of 162 genes that were differentially expressed and differentially connected (DEDC) (Table 7), highlighted as well in the hierarchical cluster heatmap (Fig. 5). The top 5 most differentially connected TFs or COFs across all tissues were BASP1, ELF5, STAT4, PRRX1 and RXRG; all of them also displaying maximum expression in rumen. Within each tissue, the top ranked most differentially connected TF or COF were MEOX1 (from 25 connections in the TRAD network to 100 in ALT), PTTG1 (from 150 connections in the TRAD network to 18 in ALT) and BASP1 (from 57 connections in the TRAD network to 366 in ALT) in liver, muscle, and rumen, respectively.

Table 7 DEDC TFs and COFs by diet and across tissues sorted by differential connectivity (Diff. Connec.)