Close this search box.

Charting cellular differentiation trajectories with Ricci flow – Nature Communications

Network entropy calculation

The computation of network entropy was as previously described9,10,11 employing the SCENT package in R and the symmetric PIN compiled from multiple sources in 2016 available at We denote the adjacency matrix of the PIN by (A={({a}_{ij})}_{i,j=1}^{n}).

For each gene expression sample, genes were matched to proteins in the PIN, when multiple genes were mapped to a single protein, expression levels were averaged over and only the largest connected component of the PIN was considered post-matching. For each matched sample ({{{{{{{bf{x}}}}}}}}={({x}_{i})}_{i=1}^{n} , > ,0) a weighted network (W({{{{{{{bf{x}}}}}}}})={({a}_{ij}{x}_{i}{x}_{j})}_{i,jin V}), and row-stochastic matrix, (P({{{{{{{bf{x}}}}}}}}),=,{({p}_{ij}({{{{{{{bf{x}}}}}}}}))}_{i,jin V}), where:

$${p}_{ij}({{{{{{{bf{x}}}}}}}})=frac{{a}_{ij}{x}_{j}}{{sum }_{kin V}{a}_{ik}{x}_{k}}$$


were constructed.

We define the local entropy of node i as

$${S}_{i}({{{{{{{bf{x}}}}}}}})=-mathop{sum}limits_{kin V}{p}_{ik}({{{{{{{bf{x}}}}}}}})log {p}_{ik}({{{{{{{bf{x}}}}}}}})$$


the entropy rate associated with P(x) is then given by

$${S}_{R}({{{{{{{bf{x}}}}}}}})=mathop{sum}limits_{iin V}{pi }_{i}({{{{{{{bf{x}}}}}}}}){S}_{i}({{{{{{{bf{x}}}}}}}}).$$


Where (pi ({{{{{{{bf{x}}}}}}}})={({pi }_{i}({{{{{{{bf{x}}}}}}}}))}_{iin V}) is the stationary distribution of P(x) satisfying

$$pi ({{{{{{{bf{x}}}}}}}})=P({{{{{{{bf{x}}}}}}}})pi ({{{{{{{bf{x}}}}}}}}).$$


As G is undirected and a single connected component, by the Perron-Frobenius theorem the stationary distribution π has an analytical solution given by:

$${pi }_{i}({{{{{{{bf{x}}}}}}}})=frac{{sum }_{kin V}{a}_{ij}{x}_{i}{x}_{j}}{{sum }_{k,jin V}{a}_{kj}{x}_{k}{x}_{j}}.$$


When presented in figures network entropy was calculated as the above entropy rate SR(x) normalised by the maximal entropy rate possible from the topology of the matched PIN, following our prior convention, to allow comparison across different networks10,11.

Construction of the Ricci flow equation

Formally for a smooth manifold Y a Ricci flow defines for an open interval ((a,b)in {{mathbb{R}}}^{+}) a Riemannian metric dt such that:

$$frac{partial {d}_{t}}{partial t}=-2Ric({d}_{t})$$


the constant − 2 is largely conventional and can be replaced with any k < 0, to ensure existence of a unique solution in finite time. Normalised Ricci flows are typically employed for convergence studies when certain properties, e.g., volume, are required to be finite

$$frac{partial {d}_{t}}{partial t}=-!2Ric({d}_{t})+overline{Ric}$$


where (overline{Ric}) is a normaliser.

In 2 dimensions normalised Ricci flow is well-studied theoretically47 and takes a special form:

$$frac{partial {d}_{t}}{partial t}=(Ric({d}_{t})-overline{Ric}){d}_{t}$$


For normalised discrete Ricci Flow we employ the following expression described in the main text and applied previously21:

$${d}_{t+Delta t}(i,j)={d}_{t}(i,j)+Delta t(Ric{({{{{{{{{bf{x}}}}}}}}}^{{{{{{{{bf{t}}}}}}}}})}_{(i,j)}-{overline{Ric}}_{(i,j)}){d}_{t}(i,j)$$


for Δt > 0, where dt(i, j) is a distance between connected nodes i, jV at time t, (Ric{({{{{{{{{bf{x}}}}}}}}}^{{{{{{{{bf{t}}}}}}}}})}_{(i,j)}) is the Ricci curvature on edge (i, j) E at time t and ({overline{Ric}}_{(i,j)}) is an edge-wise normaliser to which we want to converge.

We next must choose expressions for dt(i, j) and (Ric{({{{{{{{{bf{x}}}}}}}}}^{{{{{{{{bf{t}}}}}}}}})}_{(i,j)}) which satisfy our required and desired properties outlined in the Results.

We select (Ric{({{{{{{{{bf{x}}}}}}}}}^{{{{{{{{bf{t}}}}}}}}})}_{(i,j)}) to be a Forman-Ricci curvature ({R}_{F}^{t}(i,j)), as this discrete form of Ricci curvature is fast to compute compared to other versions such as Ollivier-Ricci curvature, and we must compute ~ 150,000 edge-wise curvatures per iteration of our Ricci flow. We choose the edge weights of this curvature to be ({omega }_{ij}: !!={omega }_{ij}^{t}=frac{{a}_{ij}}{{x}_{i}^{t}{x}_{j}^{t}}) and node weights ({W}_{i}=frac{1}{{{{{{{mathrm{deg}}}}}}}(i)}). ({R}_{F}^{t}(i,j)) thus obeys:

$${R}_{F}^{t}(i,j)= {{{{mathrm{deg}}}}}{(i)}^{-1}+{{{{mathrm{deg}}}}}{(j)}^{-1}- {({x}_{i}^{t}{x}_{j}^{t})}^{-1/2} left[{{{{mathrm{deg}}}}}{(i)}^{-1}mathop{sum}limits_{kne j}{({a}_{ik}{x}_{i}^{t}{x}_{k}^{t})}^{1/2}+{{{{mathrm{deg}}}}}{(j)}^{-1}mathop{sum}limits_{kne i}{a}_{ik}{({a}_{kj}{x}_{k}^{t}{x}_{j}^{t})}^{1/2}right].$$


We also choose ({d}_{t}(i,j)={omega }_{ij}^{t}). We note that, as for other discrete Ricci flow studies30,33, dt(i, j) is not a metric, as it fails the triangle inequality, however, it is small, implying “close proximity” of connected vertices i, jV if the corresponding transcript levels of genes i and j are high at time t. In addition at each iteration of (5), this choice of dt(i, j) allows computation of ({({omega }_{ij}^{t+Delta t})}_{(i,j)in E}), which can be input into (4), allowing computation of ({({R}_{F}^{t+Delta t}(i,j))}_{(i,j)in E}) and thus the next iteration of (5). This iterated dtt can simply be inverted to give W(xt+Δt) which allows direct comparison of the Ricci flow generated transcriptomic distribution with real biological data. Our choice of dt thus satisfies all our desired properties and is a reasonable distance measure.

Wi is chosen to be independent of xt as the Ricci flow iteration only provides enough equations to calculate updates of edge weights, thus if Wi depends on t we cannot compute ({R}_{F}^{t}(i,j)) over each iteration of (5). We select ({W}_{i}=frac{1}{{{{{{{mathrm{deg}}}}}}}(i)}) to normalise the sums in (4), which is important when comparing total Forman-Ricci curvature and network entropy (see below).

We further note that:

$$frac{partial {R}_{F}(i,j)}{partial {omega }_{ij}}=-frac{1}{2}{({omega }_{ij})}^{-1/2}left[{W}_{i}mathop{sum}limits_{kne j}{({omega }_{ik})}^{-1/2}+{W}_{j}mathop{sum}limits_{kne i}{({omega }_{kj})}^{-1/2}right], < ,0$$


implying that as ωij decreases, based on our definition of the distance d(i, j) = ωij, i and j become “closer”, and the Forman-Ricci curvature increases, and vice versa (Fig. 1C). This behaviour is as expected from a curvature. Moreover, considering our Ricci flow construction in (5), if ({R}_{F}^{t}(i,j) , > ,overline{{R}_{F}(i,j)}) then ({d}_{t+Delta t}(i,j)={omega }_{ij}^{t+Delta t}) will increase, leading to a reduction in ({R}_{F}^{t}(i,j)) via (17), driving convergence to (overline{{R}_{F}(i,j)}).

Thus our choice of Ricci flow construction is computationally efficient, facilitates convergence of the flow towards the normaliser and satisfies all our required and desired properties outlined in the results.

Investigating the correlation between network entropy and total Forman-Ricci curvature on a simple network

We consider the simple k-star network displayed in Fig. 2A, consisting of k + 1 vertices, of which k have a single edge connecting them to a central vertex i. We assign each vertex l ≠ j a weight xl = 1 and assign vertex j a weight xj = ϵ > 0.

Our Forman-Ricci curvature is defined on an edge as follows:

$${R}_{F}(i,j)= {{{{mathrm{deg}}}}}{(i)}^{-1}+{{{{mathrm{deg}}}}}{(j)}^{-1}-{({x}_{i}{x}_{j})}^{-1/2} left[{{{{mathrm{deg}}}}}{(i)}^{-1}mathop{sum}limits_{kne j}{({a}_{ik}{x}_{i}{x}_{k})}^{1/2}+{{{{mathrm{deg}}}}}{(j)}^{-1}mathop{sum}limits_{kne i}{a}_{ik}{({a}_{kj}{x}_{k}{x}_{j})}^{1/2}right]$$



$${R}_{F}(i,j)={{{{{{mathrm{deg}}}}}}}{(i)}^{-1}left[1-mathop{sum}limits_{kin N(i)setminus j}sqrt{frac{{x}_{k}}{{x}_{j}}}right]+{{{{{{mathrm{deg}}}}}}}{(j)}^{-1}left[1-mathop{sum}limits_{kin N(j)setminus i}sqrt{frac{{x}_{k}}{{x}_{i}}}right]$$


Which we denote as:

$${R}_{F}(i,j)={r}_{F}(i| j)+{r}_{F}(j| i)$$


for notational ease, where:

$${r}_{F}(i| j)={{{{{{mathrm{deg}}}}}}}{(i)}^{-1}left[1-mathop{sum}limits_{kin N(i)setminus j}sqrt{frac{{x}_{k}}{{x}_{j}}}right]$$


We note that via (1):

$$frac{{x}_{k}}{{x}_{i}}=frac{{x}_{k}/{sum }_{lin N(j)}{x}_{l}}{{x}_{i}/{sum }_{lin N(j)}{x}_{l}}=frac{{p}_{jk}}{{p}_{ji}}$$


which gives us the alternative expression, which can be helpful when considering stochastic matrices

$${r}_{F}(i| j)={{{{{{mathrm{deg}}}}}}}{(i)}^{-1}left[1-mathop{sum}limits_{kin N(i)setminus j}sqrt{frac{{p}_{ik}}{{p}_{ij}}}right].$$


Employing the results above it is a simple deduction that for our toy network:

$$left{begin{array}{ll}{p}_{ij}=frac{epsilon }{k+epsilon -1}quad & {p}_{il}=frac{1}{k+epsilon -1}quad &lne j {p}_{li}=1hfillquad &lne i {p}_{lj}=0hfillquad &lne iend{array}right.,.$$


The stationary distribution of the network is also easily calculated from (11) as:

$$left{begin{array}{ll}{pi }_{i}=frac{1}{2}hfillquad & {pi }_{l}=frac{1}{2(k+epsilon -1)}quad &lne j,i {pi }_{j}=frac{epsilon }{2(k+epsilon -1)}quad &end{array}right.,.$$


It is also clear that the local entropies will satisfy:

$$left{begin{array}{l}{S}_{i}=-frac{epsilon }{k+epsilon -1}log left(frac{epsilon }{k+epsilon -1}right)-frac{k-1}{k+epsilon -1}log left(frac{1}{k+epsilon -1}right)quad {S}_{l}=0quad ,lne i hfillend{array}right..$$


The network entropy of this network is thus simply:

$${S}_{R}=-frac{1}{2}left[frac{epsilon }{k+epsilon -1}log left(frac{epsilon }{k+epsilon -1}right)+frac{k-1}{k+epsilon -1}log left(frac{1}{k+epsilon -1}right)right]$$


Which is a convex function of ϵ maximal at ϵ = 1 (Fig. 2B).

We now consider the total Forman-Ricci curvature, defined by:

$${R}_{F}=mathop{sum}limits_{lin V}{pi }_{l}{R}_{F}(l)$$



$${R}_{F}(l)=frac{1}{{{{{{{mathrm{deg}}}}}}}(l)}mathop{sum}limits_{rin V}{a}_{lr}{R}_{F}(l,r).$$


In our example, the following can be deduced from equation (23):

$$left{begin{array}{ll}{r}_{F}(l| i)=1hfillquad &lne i {r}_{F}(i| l)=frac{1}{k}(3-k-sqrt{epsilon })hfillquad &lne j {r}_{F}(i| j)=frac{1}{k}left(1-(k-1)sqrt{frac{1}{epsilon }}right)quad &end{array}right.,.$$


Which allows the calculation of

$$left{begin{array}{l}{R}_{F}(i)=frac{1}{k}left[k+frac{1}{k}left(1-(k-1)sqrt{frac{1}{epsilon }}right)+frac{k-1}{k}(3-k-sqrt{epsilon })right]quad {R}_{F}(j)=1+frac{1}{k}left(1-(k-1)sqrt{frac{1}{epsilon }}right)hfillquad {R}_{F}(l)=1+frac{k-1}{k}(3-k-sqrt{epsilon })qquad lne j.hfillquad end{array}right.,.$$



$${R}_{F}= frac{1}{2k}left[k+frac{1}{k}left(1-(k-1)sqrt{frac{1}{epsilon }}right)+frac{k-1}{k}(3-k-sqrt{epsilon })right] +frac{epsilon }{2(k+epsilon -1)}left[1+frac{1}{k}left(1-(k-1)sqrt{frac{1}{epsilon }}right)right] +frac{k-1}{2(k+epsilon -1)}left[1+frac{k-1}{k}(3-k-sqrt{epsilon })right].$$


Network entropy and total Forman-Ricci curvature comparison

Network entropy was calculated on each gene expression sample as described above. Forman-Ricci curvature was computed over an edge (i, j) using the following expression:

$${R}_{F}(i,j)= {{{{mathrm{deg}}}}}{(i)}^{-1}+{{{{mathrm{deg}}}}}{(j)}^{-1}-{({x}_{i}{x}_{j})}^{-1/2} left[{{{{mathrm{deg}}}}}{(i)}^{-1}mathop{sum}limits_{kne j}{({a}_{ik}{x}_{i}{x}_{k})}^{1/2}+{{{{mathrm{deg}}}}}{(j)}^{-1}mathop{sum}limits_{kne i}{a}_{ik}{({a}_{kj}{x}_{k}{x}_{j})}^{1/2}right]$$


Nodal average Forman-Ricci curvature was computed as previously described22,25 via:

$$Ri{c}_{i}({{{{{{{bf{x}}}}}}}})=frac{1}{{{{{{{mathrm{deg}}}}}}}(i)}mathop{sum}limits_{jin V}{a}_{ij}{R}_{F}(i,j)$$


and network average, or total Forman-Ricci curvature was computed via:

$$Ric({{{{{{{bf{x}}}}}}}})=mathop{sum }limits_{iin V}{pi }_{i}({{{{{{{bf{x}}}}}}}})Ri{c}_{i}({{{{{{{bf{x}}}}}}}}),$$


where ({({pi }_{i}({{{{{{{bf{x}}}}}}}}))}_{i=1}^{n}) is the stationary distribution of P(x).

The choice of node weights for our Forman-Ricci curvature ({W}_{i}=frac{1}{{{{{{{mathrm{deg}}}}}}}(i)}) is important here as it ensures that the upper bounds of each of the two sums comprising edge-wise Forman-Ricci curvature defined in (4) are not dependent on node degree, and so nodal average Forman-Ricci curvature is also independent of degree. This is required as the local entropy of a node i (defined in (8)) takes values on [0, deg(i)] and thus has a degree dependence. We define total Forman-Ricci curvature here, to mirror network entropy, as a weighted sum of nodal average curvatures, using the stationary distribution ({({pi }_{i}({{{{{{{bf{x}}}}}}}}))}_{i=1}^{n}) as the weights. Our choice of node weights ({W}_{i}=frac{1}{{{{{{{mathrm{deg}}}}}}}(i)}) thus prevents total Forman-Ricci curvature and network entropy from correlating purely because of a shared degree dependence. We note that while our choice of Wi prevents degree dependence of edge wise and nodal Forman-Ricci curvature, the use of the stationary distribution in calculation of total Forman-Ricci curvature introduces the relative biological importance of hub nodes9.

Associations between network entropy and total Forman-Ricci curvature were assessed using Pearson correlation with significance at the 5% level.

Computing linear and Ricci flow trajectories between time-ordered gene expression samples

Trajectories for time course gene expression data were derived via two approaches, a null Euclidean straight line trajectory and by employing our discrete normalised Ricci flow. For both approaches the first gene expression time point (x0) was used as a starting state and the final time point (xT) was the end state. Intermediate time points were not used in the derivation of the trajectory only for its validation.

For normalised discrete Ricci flow we employ the following expression described above:

$${d}_{t+Delta t}(i,j)={d}_{t}(i,j)+Delta t(Ric{({{{{{{{{bf{x}}}}}}}}}^{{{{{{{{bf{t}}}}}}}}})}_{(i,j)}-{overline{Ric}}_{(i,j)}){d}_{t}(i,j).$$


This flow will deform the weight on an edge of the PIN at a rate proportional to the difference between the edge curvature at a starting state and a final state determined by the normaliser.

We set the normaliser of our Ricci flow as the Forman-Ricci curvature calculated at the final time point T: ({overline{Ric}}_{(i,j)}={R}_{F}^{T}(i,j)). The time increment Δt was selected empirically. If Δt is too large then negative values of the incremented distance dtt are possible, which are not acceptable by definition, however, if Δt is very small convergence of the Ricci flow to the normaliser will require a great number of iterations and will not be computationally practical. We therefore considered a range of values for Δt {10−3, …, 10−1}. For each gene expression time course, we implemented one time step of the Ricci flow from the first time point x0 using each Δt value and selected the optimal Δt as the largest which does not admit negative values of d0+Δt. For both time courses considered this value was Δt = 0.06.

We note that the maximal value of Δt which does not admit negative values of d0+Δt can also be derived theoretically and depends on the differences between the edge-wise Forman-Ricci curvatures at t = 0 and those of the normaliser via:

$$Delta {t}^{*}=mathop{min }limits_{(i,j)in {E}^{*}}left(frac{1}{(overline{Ri{c}_{(i,j)}}-Ric{({x}^{0})}_{(i,j)})}right),$$


where, ({E}^{*}={(i,j)in E:overline{Ri{c}_{(i,j)}}-Ric{({x}^{0})}_{(i,j)} , > , 0}). For both time courses considered Δt* [0.06, 0.065] and Ricci flow was thus implemented using close to the maximal value of Δt possible. Smaller values of Δt can be used to obtain a more fine-grain approximation of the network rewiring trajectory, at the cost of increased computation time and the need for more iterations before convergence.

For both gene expression time courses, we found that after 150 iterations the normalised Ricci flow converged very close to the normaliser, with little change in dtt with subsequent iterations, we thus selected 150 as the optimal number of iterations in the flow. We note that by construction the final transcriptomic time point will always be closest to the end of the trajectory. As the number of iterations is selected as sufficiently large to ensure convergence, rather than the minimum number of iterations required for convergence, the end of the trajectory represents signalling in a steady state, as opposed to the precise moment gene expression matches the final time point.

To derive the Euclidean linear trajectory null model, from the starting gene expression time point to the final, we constructed a straight line from ({W}^{0}={({a}_{ij}{x}_{i}^{0}{x}_{j}^{0})}_{i,jin V}) to ({W}^{T}={({a}_{ij}{x}_{i}^{T}{x}_{j}^{T})}_{i,jin V}) in ({{mathbb{R}}}^{ntimes n}). We selected 150 equally spaced points along this line via the following expression



Comparing inferred trajectories to true time course gene expression data

For both normalised discrete Ricci flow and the Euclidean linear trajectory null model we derived a trajectory described by 150 discrete points from the starting gene expression state to the final, as above. Each of these discrete data points can be transformed into a prediction of the weighted network: ({W}_{p}({{{{{{{{bf{x}}}}}}}}}^{{{{{{{{bf{r}}}}}}}}})={a}_{ij}{x}_{i}^{r}{x}_{j}^{r}) for r {1, …, 150}. In the case of the Euclidean trajectory, the inferred point is exactly this weighted network, while for the normalised Ricci flow ({W}_{p}({{{{{{{bf{{x}}}}}}}^{r}}})={(1/{d}_{r}(i,j))}_{i,jin V}).

For each true intermediate time point in the gene expression time course {1, …, T − 1} we computed the Euclidean distance between each of the 150 predictions of Wp(xr) in each inferred trajectory and the true data points {W(x1), …, W(xT1)}.

The value of r which minimised the distance between Wp(xr) and W(xt) was considered the point along the trajectory which most closely corresponded to the true gene expression trajectory at time t.

The association between the trajectory points corresponding to the measured time points and the true intermediate time points themselves (excluding starting and ending time points) was assessed via Pearson correlation, with significance at the 5% level.

Entropy and Ricci curvature on networks and metric-measure spaces

A connection between Ricci curvature and relative entropy has been explored in the setting of metric-measure spaces by several investigators24,40,48. Formally let (M, d, m) be a metric-measure space, where (M, d) is a metric space and m is a measure on the Borel σ-algebra of M, the authors typically aim to define a notion by which (M, d, m) has a Ricci curvature bounded below by (Kin {mathbb{R}}) and explore the consequences. To do so they consider the metric space P2(M) = (P(M), W2), associated with the metric space (M, d), where P(M) is the space of Borel probability measures on M and W2 is the Wasserstein-2 distance. W2 is a distance measure commonly used in optimal transport, to provide intuition if m1, m2P(M) then ({W}_{2}{({m}_{1},{m}_{2})}^{2}) is the smallest cost of transporting the total mass from the measure m1 to the measure m2, where the cost of transporting a unit mass between points a1 and a2M is (d{({a}_{1},{a}_{2})}^{2}). Employing results on displacement convexity along geodesics in P(M), a connection between an entropy functional defined on P(M) and the Ricci curvature of (M, d, m) can be proposed.

Formally, using the notation of Strum 200640, we define a relative entropy functional with respect to m on P(M) via:

$$,{{mbox{Ent}}},(nu | m)={int}_{M}frac{dnu }{dm}log left(frac{dnu }{dm}right)dm$$


It has been proposed (based on results for Riemannian manifolds48) that (M, d, m) has Ricci curvature bounded below by (Kin {mathbb{R}}) if and only if, for any ν0, ν1P(M), where Ent(ν0m), Ent(ν1m) < , there exists a geodesic γ: [0, 1] → P(M), where γ(0) = ν0 and γ(1) = ν1 such that:

$$,{{mbox{Ent}}}(gamma (t)| m), le , (1-t){{mbox{Ent}}}(gamma (0)| m)+t{{mbox{Ent}}},(gamma (1)| m) -frac{K}{2}t(1-t){W}_{2}{(gamma (0),gamma (1))}^{2}.$$


Sandhu et al.19, use this statement to infer a positive correlation between an entropy defined as the negative of Ent( m) and the Ricci curvature of (M, d, m).

In our setting of networks, there is not an unambiguous way to map to a metric-measure space. The definition of the space (M, d, m) could have many choices in terms of network topology as well as vertex and edge weights. Moreover, the definition of Forman-Ricci curvature applied to networks is again non-unique, depending on edge and vertex weights and the validity of this curvature depends upon an interpretation of the network as a cell complex approximation to a Riemannian manifold. The definition of network entropy as an entropy rate is also not equivalent to the definition of Ent( m), and again the choice of m for the network setting is non-unique. Collectively this highlights a distinction between the network setting and metric-measure spaces, and results in one setting cannot be expected to be valid in the other, in particular correlation between entropy and curvature.

Statistics and reproducibility

The association between network entropy and total Forman-Ricci curvature in transcriptomic data sets was evaluated using Pearson’s correlation coefficient. The comparison between network entropy and total Forman-Ricci curvature in cancerous and healthy single cells was evaluated using two-tailed Wilcoxon tests. The association between closest pass Ricci flow iteration/straight line trajectory iteration and true differentiation time was evaluated using Pearson’s correlation coefficient. No statistical method was used to predetermine the sample size. No data were excluded from the analyses. The experiments were not randomised. The Investigators were not blinded to allocation during experiments and outcome assessment.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.