Close this search box.

Integrative metabolomics-genomics analysis identifies key networks in a stem cell-based model of schizophrenia – Molecular Psychiatry

Cell culture

Eight human iPSC lines were employed in this study (Supplementary Table S1). The cells were cultured using Corning® Matrigel® hESC-Qualified Matrix (Corning, Cat. No. 354277) coated plates with the use of StemMACS™ iPS Brew XF Medium (Miltenyi Biotec, Cat. No. 130-104-368) or Essential 8™ medium (ThermoFisher Scientific, Cat. No. A157001), in antibiotic-free conditions, and maintained at 37 °C, 5% CO2. iPSCs were passaged every 3–5 days using either Accutase™ or 0.5 mM phosphate-buffered saline (PBS)/EDTA. Briefly, when passaging the cells with Accutase™, cells were firstly washed with DMEM, 1 ml of Accutase™ was added per 6-well and the cells were incubated at 37 °C for 3–4 min, to ensure proper cell detachment. After the incubation an equal volume of DMEM was added to the well and the cells were collected and centrifuged at 1200 rpm for 3 min at 4 °C. For splitting with PBS/EDTA (ThermoFisher Scientific, Cat. No. 15575020), cells were briefly washed with DMEM, 1 ml PBS/EDTA was added per 6-well and the cells were incubated until they started to roughly dissociate. The EDTA was aspirated and the cells or the cell pellet (when splitting using Accutase), were resuspended in fresh medium supplemented with 10 μM ROCK inhibitor, Y-27632, (Miltenyi Biotec, Cat. No. 130-106-538). The next day the medium was changed back to iPSC medium without ROCK inhibitor. All cell lines were thoroughly characterized for their pluripotency (Supplementary Fig. 1A, B) and were tested frequently for mycoplasma contamination.

Cortical neuronal differentiation

The generation of cortical progenitors and neurons was performed as described before [27, 28] with minor modifications. Briefly, iPSCs from five 6-wells were collected with Accutase™ and seeded onto an ES-Matrigel coated 12-well. When 100% confluency was reached, StemMACS™ iPS Brew XF Medium was replaced by neural induction medium (NIM; DMEM/F12 Glutamax, Neurobasal, 100 mM L-Glutamine, 0.5 × N-2, 0.5 × B-27 + Vitamin A, 50 μM Non-Essential Amino Acids, 50 μM 2-mercaptoethanol, 2.5 μg/ml insulin, 1 μM dorsomorphine, 10 μM SB431542). The medium was changed every day until the appearance of a tightly packed neuroepithelial sheet (NES). NES was passaged with 0.5 mM EDTA in a ratio of 1:2 or 1:3 to Corning® Matrigel® Growth Factor Reduced (GFR) Basement Membrane Matrix (GFR-Matrigel; Corning, Cat. No. 354230) coated plates. The next day, the medium was switched to neural maintenance medium (NMM; DMEM/F12 Glutamax, Neurobasal, 100mM L-Glutamine, 0.5 × N-2, 0.5 × B-27 + Vitamin A, 50 μM Non-Essential Amino Acids, 50 μM 2-mercaptoethanol, 2.5 μg/ml insulin) and was changed every other day. Upon the appearance of rosettes, 20 ng/ml FGF2 (Peprotech, Cat. No. 100-18C) were added to the medium for four days. On the fourth day of FGF2 treatment, the cells were split again with 0.5 mM EDTA in a ratio of 1:2 to 1:3 onto GFR-Matrigel coated plates. The medium was switched back to NMM and the cortical progenitors were maintained for about 5–10 days until neurons accumulated outside of the rosettes. At this point, cells were passaged with Accutase™, and 50,000 cells/cm² were seeded on poly-L-ornithin/Laminin coated plates for further neuronal differentiation. Alternatively, 2–4 million cells/ml were frozen with neural freezing medium. Neurons differentiated further with half medium changes every two to three days. Samples were harvested at day (d) 0, 7, 16, 27, 50, and 100.

For the DFMO treatment, adherent cell cultures were treated daily with 10 µM DFMO (difluoromethylornithine hydrochloride hydrate; Merck, Cat. No. D193) starting from the first day of differentiation until the collection of cellular pellet and supernatant for mass spectrometry analysis or fixation for subsequent immunocytochemistry (ICC).


Cells were fixed in 4% paraformaldehyde (PFA; Sigma, Cat. No. 158127-500G) in PBS solution for 20 min at room temperature (RT). The non-specific binding was blocked with incubation with blocking buffer (3% bovine serum albumin (BSA), 0.2% Triton ×100 in PBS) for 1 h at RT. The primary antibody (Ab) was diluted in the blocking buffer in the recommended concentration and the Ab solution was applied overnight at 4 °C. The following primary Abs were used in the following concentrations: AFP 1:400 (Dako, Cat. No. A000829-2), GAD65/67 1:100 (Abcam, Cat. No. AB183999), GFAP 1:400 (Sigma, Cat. No. G3893-.2 ML), Ki67-VioR667 1:200 (Miltenyi, Cat. No.130-120-422), MAP2 1:1,000 (SynapticSystems, Cat. No.188006), NEUN 1:500 (Sigma, Cat. No. ABN78), OCT3/4 1:200 (Szabo-Scandic, Cat. No. GTX101497-100), PAX6 1:500 (Invitrogen, Cat. No. 42-6600), S100b 1:750 (Abcam, Cat. No. ab52642), SMA 1:500 (Abcam, Cat. No. ab7817), SOX1 1:200 (R&D Systems, Cat. No. AF3369), SOX2 1:500 (R&D Systems, Cat. No. MAB2018), TAU 1:200 (Cell Signaling Technology, Cat. No. 4019), TUBB3 1:1,000 (BioLegend, Cat. No. 801202 and Abcam, Cat.No. ab52623), vGLUT 1:100 (SynapticSystems, Cat. No. 135311). The secondary Ab was diluted 1:500 in 1.5% BSA, 0,2% Triton ×100 in PBS, and the solution was applied for 2 h at RT. The secondary Abs used in this study were: donkey anti-rabbit Alexa FluorTM 488 (ThermoFisher Scientific, Cat. No. A-21206), donkey anti-rabbit Alexa FluorTM 546 (ThermoFisher Scientific, Cat. No. A-10040), donkey anti-mouse Alexa FluorTM 594 (ThermoFisher Scientific, Cat. No. A-21203), donkey anti-mouse Alexa FluorTM 647 (ThermoFisher Scientific, Cat. No. A-31571), donkey anti-goat Alexa FluorTM 594 (ThermoFisher Scientific, Cat. No. A-11058), goat anti-chicken Alexa FluorTM 594 (ThermoFisher Scientific, Cat. No. A32759). Finally, the nuclei were counterstained using 4’,6-diamidino-2-phenylindole (DAPI; ThermoFisher Scientific, Cat. No. D21490) in PBS in 1:5000 dilution for 5 min at RT. The coverslips were mounted using Aqua-Poly/Mount mounting medium (PolySciences, Cat. No. 18606-20).

Microscopy, image acquisition and image analysis

Fluorescent pictures were acquired with the Zeiss Axio Observer Z1 inverted fluorescent microscope and the Leica DMi8 inverted microscope. The image acquisition was performed under the same exposure and laser intensity settings for each set of analyses. For each sample, ten random fields of view were acquired, with a minimum of 20 z-stacks collected per field to ensure proper signal coverage. Further image processing was carried out using the ImageJ software. For quantitative fluorescence intensity analysis, maximum intensity projection was applied and the mean fluorescence intensity values were calculated after background noise subtraction. These values were then normalized to the DAPI+ nuclear area to account for variations in cell density in the different fields of view.

Reverse transcription quantitative PCR

Total RNA was extracted from cells using TRI Reagent® (Merck, Cat. No. T9424), according to the manufacturer’s instructions. Genomic DNA was removed through treatment with DNase I (Sigma-Aldrich, Cat. No. AMPD1). Subsequently, 1 µg of purified RNA was reverse transcribed into cDNA using the RevertAid RT Reverse Transcription Kit (ThermoFisher Scientific, Cat. No. K1691), following the manufacturer’s guidelines. The expression levels of specific target genes at the mRNA level were quantified via reverse transcription quantitative PCR (RT-qPCR) using the 5× HOT FIREPol EvaGreen qPCR Mix Plus (no ROX) (Solis BioDyne, Cat. No. 08-25-00001-10). Samples were analyzed in technical triplicates to ensure data reliability. Non-template controls (NTCs) were included for each primer pair in every assay to monitor for reagent contamination and primer-dimer formation. To confirm the absence of genomic DNA contamination, random RNA samples were evaluated through gel electrophoresis. The RT-qPCR assays were conducted on the CFX Connect Real-Time PCR Detection System (Bio-Rad). Gene expression levels were normalized to the housekeeping gene ACTB. Relative expression changes were calculated employing the ΔΔCt method [29]. The list of the primers used for RT-qPCR assays is shown in Table 1.

Table 1 Primers used for qPCR.

Bulk RNA sample collection, quality control, library preparation, and bulk RNA sequencing

Total RNA was isolated from cells at six time points during the cortical differentiation and was prepared for paired-end mRNA sequencing. RNA extraction was performed using the TRI Reagent® (Merck, Cat. No. T9424) according to the manufacturer’s guidelines. Genomic DNA digest was performed with the use of the TURBO DNA-free™ Kit (ThermoFisher Scientific, Cat. No. AM2238). For the library preparation, the Illumina TruSeq RNA Library Prep Kit v2 was used (Illumina, Cat. No. RS-122-2001, RS-122-2002). Quality, as well as concentration of RNA were assessed employing the Agilent RNA 6000 Pico kit (Agilent, cat. no. 5067-1513), Nanodrop, the NEBNext® Library Quant Kit for Illumina® (New England Biolabs, Cat. No. E7630S) and the Qubit RNA Integrity and Quality (IQ) Assay Kit (ThermoFisher Scientific, Cat. No. Q33222). All the kits were used according to the manufacturer’s guidelines. Paired-end sequencing was performed with the NextSeq 500/550 v2 Kit (150 cycles) (Illumina).

Transcriptomic data pre-processing, heatmap generation, and differential gene expression analysis

Low-quality ends and adapter sequences were trimmed using the wrapper Trim Galore!. Reads were mapped to the human reference genome (GRCh38) using the open-source software STAR [30]. The raw counts were generated with the Hypergeometric Optimization of Motif EnRichment (HOMER) suite [31]. All the subsequent analysis was performed using R [32]. Differential gene expression analysis was performed using the DESeq2 package [33]. Raw counts were normalized using the median of ratios (variance stabilization transformation; vst) [34]. Heatmaps were generated with the ClustVis [35] tool, using the z-score of the vst transcriptomic data for every gene. Gene ontology (GO) enrichment analysis was performed using the ShinyGO 0.76 online tool [36].

A likelihood ratio test (LRT) was used to identify the differentially expressed genes (DEGs) of SCZ and control (CTRL) across the multiple time points of neuronal differentiation [32]. The LRT compared the full model containing the covariates ‘sex’, ‘batch’, ‘time point’, and ‘disease’ with a model reducing the covariates ‘sex’, ‘batch’, and ‘time point’. Statistical values were corrected for FDR using the Benjamini-Hochberg method.

Weighted gene correlation network analysis (WGCNA) and module-traits relationships

Weighted Gene Correlation Network Analysis (WGCNA) allows the generation of modules that include genes that are co-expressed in the same manner. The vst counts were used to build a co-expression network using the WGCNA [37] package in R [32]. The data were corrected for sex and batch effects using the ComBat function that is implemented in the sva package [38]. The topological overlap measure was calculated using the adjacency matrix. The DynamicTree Cut algorithm, implemented in the WGCNA package, was used to identify the different modules. The gray module contains all the genes that were not assigned to any of the other modules. The module eigengene were calculated. Pearson’s correlation was used to compare modules to each other and to the traits SCZ and the differentiation time points in the adjacency matrix. The top 25% of genes with the highest module membership (MM) were identified as hub genes.

Gene ontology annotation

Functional enrichment analysis was performed with an input gene ID list using the tool g:GOSt from the g:Profiler [39] R package. Statistical significance was computed and the g:SCS-threshold was corrected at p < 0.05.

Targeted metabolomics, sample collection, and data processing

The cells were washed with 1 ml sterile 1x PBS for 60 s. After the wash, the cells were scraped using 1 ml PBS and the suspension was collected and centrifuged at 4000 rpm for 5 min, at RT. The cell pellets were kept constantly on dry ice and stored at −80 °C until further processing. The cell supernatant was collected after a 24-h incubation, centrifuged at 4000 rpm for 10 min, immediately placed on dry ice and stored at −80 °C. Samples were analyzed using the biocrates MxP® Quant 500 (biocrates life sciences AG, Cat. No. 21094.12). Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was employed to analyze small molecules, including analyte classes such as amino acids, biogenic amines, carboxylic acids, and amino acid-related molecules [40]. Lipid species were measured using flow injection analysis tandem mass spectrometry (FIA–MS/MS). Small molecules were quantified with external 7-point calibrations and internal standards and lipids were quantified by internal standards [41]. The raw data were processed by applying a modified 80% rule to reduce the false positive measurements [42]. The actual missing values, i.e., the values over the level of detection (LOD) for one time point but not for another time point, were uniformly at random imputed with a non-zero value between LOD/2 and LOD. Missing values within one class (i.e., time points and metabolites) were imputed using the arithmetic mean of the class. Batch effects were corrected by centering the data within the groups (i.e., time points) and batches. The performance of the normalization was assessed by plotting the row standard deviations versus the row means and the principal component analysis (PCA). In addition, variancePartition analysis was performed to evaluate the contribution of each individual component of the study design (i.e., time point, batch, and condition), to the measured variation of each metabolite [43].

For metabolite extraction, cell pellets were resuspended in 500 µL ice-cold methanol. Metabolites from supernatants (50 µL) were extracted using 450 µL 8:1 methanol:water. Fully 13C, 15N labeled amino acid standard (Cambridge Isotope Laboratories, Cat. No. MSK-CAA-1) and 6D-gamma hydroxybutyrate (Sigma-Aldrich, Cat. No. 615587) were spiked into samples at the first step of the extraction. After simultaneous proteo-metabolome liquid-liquid extraction [44], protein content was determined from extracted cellular interphases using a Pierce Micro BCA Protein Assay Kit (Thermo Fisher Scientific, Cat. No. 23235). Dried metabolite samples from cell pellets were dissolved in 20 µL 0.1% formic acid (FA) or 50 µL 0.1% FA for the analysis from the supernatant samples. The sample (1 µL) was injected on an Atlantis Premier BEH C18 AX column (1.7 µm, 2.1 × 150 mm, Waters, 186009361) equilibrated at 40 °C using an Acquity Premier UPLC system (Waters). A gradient was run at a flowrate of 0.4 mL/min with mobile phase A (0.1% FA in water) and mobile phase B (0.1% FA in acetonitrile) as follows: 1 min at 1% B, to 40% B in 1 min, 40% B to 99% B in 0.5 min, hold at 99% B for 1.1 min, 99% B to 1% B in 0.1 min followed by 1.8 min of re-equilibration at 1% B. GABA and Glutamate (Glu) were detected using a Xevo-TQ XS Mass spectrometer (Waters) equipped with an electrospray ionization source running in positive mode. The transitions 104–>69 (endogenous GABA), 110–>73 (labeled GABA), 148–>102 (endogenous Glu) and 154–>107 (labeled Glu) were used for quantification. The raw files were processed using MS Quan in waters connect (Waters, V1.7.0.7). The data was further analyzed in R and normalized to the protein content.

Short time-series expression miner (STEM) analysis of metabolomic and transcriptomic data

To analyze time-related cluster dynamics, the non-parametric clustering algorithm of Short Time-series Expression Miner (STEM) was used [45]. STEM is an online tool that assigns genes or metabolites to significant temporal expression profiles. The Maximum Number of Model Profiles and the Maximum Unit Change in Model Profiles between time points were set to 50 and 2, respectively. Data were normalized to d0. Integrated into the STEM tool is a GO enrichment analysis. All annotations (Biological Process (BP), Molecular Function (MF), and Cellular Component (CC)) were selected and applied. Statistical significance was computed and FDR-corrected at p < 0.05.

Network analysis

The network establishment was based on the gene expression and metabolite level changes across the five successive time point comparisons, along the cortical differentiation. The connectivity information for the initial network was acquired from the publicly available recon3D stoichiometric model data set (available at, retrieved in September 2020) [46]. Ultimately, 51 metabolites and 1135 genes were matched with their corresponding IDs.

Briefly, the construction of the network was performed based on the following steps. Initially, all the reactions associated with any of the target genes were extracted. The metabolites associated with these reactions were identified and the educt-product stoichiometry was applied for every metabolite involved in the network. Subsequently, the reaction data were filtered to extract and proceed only with the genes and metabolites measured in our dataset. The network was further enriched with protein-protein interaction information, derived using the signor database (available at, retrieved in September 2020) [47]. Finally, the network vertices were constructed after examining the unique metabolites and genes, existing in the edge dataset and were further enriched with vertex attributes, such as the vertex type (i.e., gene/metabolite). Log2 fold changes (log2FC) were converted to a color gradient scale, ranging from blue (indicating a downregulation compared to the previous time point) to red (indicating upregulation).

Extraction of subnetworks from the parental network, was based on assigning membership to the pathways, as defined by the KEGG pathway database, and selecting the subnetwork that included the highest number of differentially expressed genes and metabolites, with the closest degree distribution of the vertices. Pie charts with five equal fractions were used in order to visualize the fold changes occurring across a single metabolite or gene, corresponding to the transitions between two succeeding time points. Additionally, ellipses were used for visualizing the metabolites, while the genes were visualized with circles.

Metabolites that were needed as substantial interconnections between measured metabolites, but were not measured in our dataset, were visualized as small dots. The position for every node was provided as coordinates on a 2D plane. Network visualizations were performed using the R igraph package [48].

Latest Intelligence