Ethical statement
The work described here complies with all ethical regulations relevant to the committees approving the two respective permits (Södersjukhuset hospital in Stockholm, Sweden for #2018/1003-31 and Candiolo Cancer Institute FPO – IRCCS, Turin, Italy for 001-IRCC-00IIS-10). The patient-derived tumor samples used in this study were collected and used in conformity with the permits. All donors analyzed were of European ethnicity. None of the donors received compensation, and written informed consent was obtained from all donors involved in both cohorts. During clinical evaluation, the seven donors available in the prostate cancer cohort reported male sex and ages ranging from 43 to 65 years. One sample was excluded due to low sample quality, and one of the six remaining samples was used here. For the breast cancer samples, both donors included here reported female sex and were 58 and 65 years of age at the time of sample collection.
Experimental methods
Samples
Cell lines
We obtained all the cell lines used in the study from ATCC: Colo320DM (cat. no. CCL-220), PC3 (cat. no. CRL-1435), HeLa (cat. no. CCL-2), HEK293T (cat. no. CRL-1573), K562 (cat. no. CCL-243). We authenticated all cell lines by STR genotyping. We cultured Colo320DM and PC3 cells in RPMI-1640 medium (Gibco, cat. no. C11875500BT) supplemented with 15% fetal bovine serum (Gibco, cat. no. 10091148) and 1% penicillin–streptomycin (Gibco, cat. no. 15140122); HeLa and HEK293T cells in DMEM medium (Gibco, cat. no. C11995500BT) supplemented with 10% fetal bovine serum; and K562 cells in IMDM medium (Gibco, cat. no. C12440500BT) supplemented with 15% fetal bovine serum. Colo320DM and K562 cells were cultured at 37 °C with 10% CO2 while PC3, HeLa, and HEK293T cells were cultured at 37 °C with 5% CO2.
Patient-derived tumor samples
For scCircle-seq on prostate cancer, we extracted nuclei from one frozen tissue block excised from one of six prostatectomy samples that we previously collected at the Södersjukhuset hospital in Stockholm, Sweden for single-cell spatially resolved profiling of DNA copy number alterations (ethical permit #2018/1003-31). For scCircle-seq on breast cancer, we extracted nuclei from two frozen breast cancer specimens (one classified as Luminal B-like [herein labeled LumB] and the other as triple-negative [herein labeled TNBC]) previously collected and stored at the Pathology Unit of the Candiolo Cancer Institute FPO – IRCCS, Turin, Italy (ethical permit “Profiling”, 001-IRCC-00IIS-10). All the patient-derived samples described in this study are unique biological samples that cannot be distributed to other researchers.
scCircle-seq
A step-by-step scCircle-seq protocol is available in Protocol Exchange at https://protocolexchange.researchsquare.com/article/pex-2385/v1.
scCircle-seq on cell lines
We mouth pipetted single cells into PCR tubes containing 0.25 µL of Dynabeads MyOne Silane beads (Invitrogen, cat. no. 37002D) diluted in 6.75 µL of nucleus isolation buffer containing 10 mM Tris-HCl pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20 (Invitrogen, cat. no. 003005), 0.3% IGEPAL CA-630 (Sigma-Aldrich, cat. no. I8896), 0.1% bovine serum albumin (Sigma-Aldrich, cat. no. A2934), and 2 mM dithiothreitol. After incubation on ice for 30 min, we gently vortexed the tubes for 1 min, followed by centrifugation at 500 × g for 5 min at 4 °C. Afterwards, we transferred 5.4 µL of supernatant containing cytoplasmic RNA into a new 0.2 mL tube for scRNA-seq using the Smart-seq2 library preparation approach (see below), leaving the bead pellet containing the nuclei undisturbed. After nucleus isolation, we added 0.4 µL of NEBNext FFPE DNA Repair Mix (New England Biolabs, cat. no. M6630S) containing 0.25 ng/mL linear spike-in DNA, 0.25 ng/mL circular spike-in DNA, 5X NEBNext FFPE DNA Repair Buffer, and 0.1225X NEBNext FFPE DNA Repair Mix into each tube containing a bead pellet, gently vortexed the samples, and incubated them at 20 °C for 1 h. After nick repair, we added 1.54 µL of nuclear lysis mix containing 40 mM Tris-HCl, 40 mM NaCl, 0.2% Triton X-100 (Sigma-Aldrich, cat. no. T9284), 30 mM dithiothreitol, 2 mM EDTA, and 1.6 µg/µL Qiagen Protease (Qiagen, cat. no. 19157) into each tube, gently vortexed the samples, and incubated them at 50 °C for 30 min followed by holding at 4 °C. Next, we added 0.05 µL of Protease Inhibitor cocktail (Sigma-Aldrich, cat. no. 8340) and 0.45 µL of water to each tube and incubated the samples at 37 °C for 1 h. After nuclear lysis, we performed linear DNA digestion by adding 1.2 µL of digestion mix containing 8.3 mM ATP, 4.16X Plasmid-Safe Reaction Buffer (Lucigen, cat. no. E3101K), 0.83–2.49 U/µL Plasmid-Safe ATP-Dependent DNase (Lucigen, cat. no. E3101K) depending on the ploidy of the cells (for diploid cells: 0.83 U/µL, for tetraploid: 2.49 U/µL), and 2.08 mM dithiothreitol into each tube, and incubated the samples at 37 °C for 20 h, followed by 70 °C for 10 min and holding at 4 °C. Next, we added 5 µL of amplification mix containing 2X phi29 buffer (New England Biolabs, cat. no. M0269), 2 mM dNTPs (Thermo Fisher Scientific, cat. no. R0192), 100 µM Exo-Resistant Random Primer (Thermo Fisher Scientific, cat. no. SO181), 0.002 U/µL Pyrophosphatase inorganic (Thermo Fisher Scientific, cat. no. EF0221), and 1.6 U/µL Phi29 DNA Polymerase (New England Biolabs, cat. no. M0269) into each tube, and incubated the samples at 30 °C for 2 h followed by 65 °C for 10 min and holding at 4 °C. We purified the amplified circular DNA on DNA Clean & Concentrator-5 columns (Zymo Research, cat. no. D4014), after which we used 10 ng of purified circular DNA as input for the Nextera XT DNA Library Preparation Kit (Illumina, cat. no. FC-131-1024).
scCircle-seq on nuclei from tumor biopsies
We first isolated nuclei from the prostate cancer (PRAD), luminal B (LumB) and triple-negative breast cancer (TNBC) samples described above, using an ad hoc modified version of the Nuclei extraction from frozen tissue for single-nuclei sequencing protocol from Mission Bio (https://support.missionbio.com/hc/en-us/articles/360042902014-Nuclei-Extraction-From-Frozen-Tissue-User-Guide). Briefly, we first prepared a tissue lysis solution (TLS) containing 0.03 mg/mL Trypsin-EDTA (0.25%), phenol red (Thermo Fisher Scientific, cat. no. 25200072), 0.1 mg/mL Collagenase type 7 (Worthington, cat. no. CLS-7 LS005332) and 0.1 mg/mL Dispase II (Gibco, cat. no. 17105-041) in a spermine solution (pH 7.6) containing 3.4 mM sodium citrate tribasic dihydrate (Sigma-Aldrich, cat. no. C8532), 1.5 mM spermine tetrahydrochloride (Sigma-Aldrich, cat. no. S1141), 0.5 mM tris(hydroxymethyl)aminomethane (Sigma-Aldrich, cat. no. 252859), and 0.1% v/v IGEPAL CA-630 (Sigma-Aldrich, cat. no. I8896) in molecular biology grade water. We then added 200 μL of ice-cold TLS onto each tissue block kept in a pre-chilled Petri dish on dry ice and incubated the samples until the TLS had frozen (~3 min). After initial mincing on dry ice using a pair of pre-chilled sterile scalpels, we transferred the tissue to room temperature and continued mincing until the tissue was dissociated into small fragments that could flow through a 1 mL pipette tip. We then added 1.8 mL of TLS and transferred the whole volume of TLS solution with the tissue fragments into a 5 mL low-binding tube (Sigma-Aldrich, cat. no. EP0030108310-200EA). We incubated the samples for 15 min at room temperature on a device rotating at 20 rpm, after which we added 2 mL per sample of a stop solution containing 25 mg of Trypsin inhibitor from chicken egg white, Type II-O (Sigma-Aldrich, cat. no. T9253), and 5 mg Ribonuclease A from bovine pancreas, Type I-A (Sigma-Aldrich, cat. no. R4875) dissolved in 49.8 mL of spermine solution.
We gently inverted each tube 15 times, after which we filtered the tissue suspension through a 50 μm CellTrics cell strainer (Sysmex, cat. no. 04-004-2327) and centrifuged the flowthrough at 300 × g for 5 min at room temperature. We discarded the supernatant, resuspended the pellet containing the nuclei in 400 μL of nuclei fixation solution (66% Methanol, 33% Acetic acid) and incubated the samples on ice for 15 min. Following centrifugation at 300 × g for 5 min at room temperature, we discarded the supernatant, resuspended the nuclei pellet in 1 mL of 1X PBS with 5 mM EDTA, and filtered the nuclei suspension again through a 10 μm CellTrics cell strainer (Sysmex, cat. no. 04-004-2324). We stored the nuclei suspension at 4 °C until sorting.
For single-nucleus sorting, we first added propidium iodide (PI, Thermo Fisher Scientific, cat. no. P3566) to the fixed nuclei suspensions to reach 1 mg/mL final concentration. We sorted PI+ nuclei into low DNA binding 96-well plates (Eppendorf, cat. no. 0030129504) pre-filled with 1.4 mL/well of nucleus isolation buffer, using the MoFlo Astrios EQs (Beckman Coulter) FACS system, excluding the doublets. Immediately after sorting, we quickly centrifuged the nuclei at 300 × g for 3 min at +4 °C and then stored the plates at –20 °C before proceeding to scCircle-seq using the same procedure as described above for cell lines.
For preparing libraries, we diluted the RCA products 20 times with nuclease-free water and transferred 1 µL of each RCA product (corresponding to one nucleus) into a new 96-well plate as input for tagmentation using the Nextera XT DNA Library Preparation Kit (Illumina, cat. no. FC-131-1024).
Sequencing
For scCircle-seq on cultured cells, we pooled all the libraries and sequenced them in paired-end mode on a HiSeq X Ten (Illumina) machine. For scCircle-seq on prostate cancer nuclei, we sequenced all the cells in paired-end mode on a NextSeq 2000 (Illumina) machine using the NextSeq 1000/2000 P2 Reagents (300 Cycles) v3 (Illumina, cat. no. 20046813). For scCircle-seq on breast cancer nuclei, we pooled the LumB and TNBC samples and sequenced them in paired-end mode on a NovaSeq 6000 (Illumina) machine using the NovaSeq 6000 SP Reagent Kit v1.5 (300 cycles) (Illumina, cat. no. 20028400).
scCircle-seq in daughter cells
To study how eccDNAs are passed to daughter cells during mitosis, we cultured Colo320DM cells in a 10 cm Petri dish at low density (1000 cells per plate) to ensure that each cell is separated from its neighbors. After most cells underwent one mitosis (~10 h), we gently discarded the medium, added 4 mL trypsin (Gibco, cat. no. 25200056) onto the cells and incubated for 1 min at room temperature. Using a mouth pipette, we isolated several pairs of daughter cells and placed each daughter cell into a separate tube for scCircle-seq.
Bulk Circle-Seq
We performed bulk Circle-Seq according to the previously described protocol7, 10, with the following modifications. In brief, we harvested 1 million cells during the exponential growth phase and extracted high molecular weight genomic DNA with the MagAttract HMW DNA Kit (Qiagen, cat. no. 67563). Next, we digested 1 µg of DNA in 100 µL of digestion mix containing 20 U of Plasmid-Safe ATP-Dependent DNase (Lucigen, cat. no. E3101K), 25 mM ATP (Lucigen, cat. no. E3101K), and 1X Plasmid-Safe Reaction Buffer (Lucigen, cat. no. E3101K) for 6 days at 37 °C. Every 24 h we replenished the enzymes in the digestion mix by adding 2 µL of fresh Plasmid-Safe ATP-Dependent DNase, 4 µL of ATP, and 0.6 µL of Plasmid-Safe 10X Reaction Buffer. After 6 days, we purified circular DNA with 1X AMPure XP beads (Beckman Coulter, cat. no. A63881) following the manufacturer’s instructions. Lastly, we used 20 ng of purified circular DNA as input for the Nextera XT DNA Library Preparation Kit (Illumina, cat. no. FC-131-1024) and sequenced the libraries on a HiSeq X Ten (Illumina) machine, aiming to generate around 10 million reads per sample.
Smart-seq2
We performed Smart-seq2 following the previously published protocol17. Briefly, we mixed 5.4 µL of supernatant containing cytoplasmic RNA obtained from the nucleus isolation step in scCircle-seq with 1.27 µL of oligo-dT mix containing 1 µM oligo-dT30VN (5′-AAGCAGTGGTATCAACGCAGAGTACT30VN-3′) and 1 mM (each) dNTPs and incubated the mix for 5 min at 72 °C followed by 5 min on ice. Next, we added 7.2 µL of a reverse transcription mix containing 10 U/µL SuperScript II reverse transcriptase (Invitrogen, cat. no. 18064071), 1 U/µL SUPERase (Invitrogen, cat. no. AM2696), 1X SuperScript II first-strand buffer (Invitrogen, cat. no. 18064071), 1 mM GTP, 5 mM dithiothreitol, 1 M betaine (Sigma-Aldrich, cat. no. B0300), 1 mM MgCl2, and 1 µM template-switching oligo (5′-AAGCAGTGGTATCAACGCAGAGTACATrGrG+G-3′) to each sample and incubated the samples for 90 min at 42 °C, followed by 11 cycles of: 50 °C for 2 min, 42 °C for 2 min, 70 °C for 5 min. Lastly, we added 14.52 µL of amplification mix containing 1X KAPA HiFi HotStart ReadyMix (Roche, cat. no. KK2602) and 0.1 µM ISPCR oligo (5′-AAGCAGTGGTATCAACGCAGAGT-3′) to each sample and performed PCR with the following settings: 98 °C for 3 min; 21 cycles of: 98 °C for 20 s, 65 °C for 30 s, 72 °C for 4 min; 72 °C for 15 min; 4 °C on hold. We purchased all the primers from IDT as standard desalted primers.
Induction of eccDNAs with methotrexate
To induce the production of eccDNAs, we treated cells with the chemotherapeutic agent methotrexate, as previously described39. Briefly, we grew HeLa cells in 6-well plates containing medium supplemented with 100 nM methotrexate (Sigma-Aldrich, cat. no. 1414003), replacing the medium with fresh medium every 2 days. We mouth pipetted single cells for scCircle-seq on days 7 and 17.
DNA fluorescence in situ hybridization (FISH)
To demonstrate the extra-chromosomal nature of eccDNAs, we performed DNA FISH on metaphase spreads of Colo320DM, K562, and PC3 cells, targeting some of the high-frequency high uniformity (HFHU) eccDNAs identified by scCircle-seq in these cells (see Supplementary Fig. 4). To prepare metaphase spreads, we grew Colo320DM, K562 and PC3 cells for 24 h in their culture medium supplemented with colcemid (KaryoMax, Thermo Fisher Scientific, cat. no. 15210040) at a concentration of 2 µg/mL, 0.2 µg/mL and 0.1 µg/mL, respectively. Afterwards, we collected the cells, permeabilized them with 0.075 M KCl hypotonic solution for 15 min at 37 °C and fixed them with Carnoy’s fixative (methanol:acetic acid 3:1, v/v) for 10 min at room temperature. To obtain metaphase spreads, we gently dropped the fixed cells onto cold, pre-humidified coverslips from a height of ~1 m, and left the coverslips to air-dry. We designed and produced oligonucleotide DNA FISH probes using our previously described iFISH pipeline22. The sequences of all the oligos composing the probes used are available in Supplementary Data 2. To perform DNA FISH, we followed the step-by-step protocol for oligo-based DNA FISH that we previously described22. We imaged the samples using a ×100 1.45 NA objective mounted on a custom-built Eclipse Ti-E inverted microscope system (Nikon) controlled by the NIS Elements software (Nikon) and equipped with an iXON Ultra 888 EMCCD camera (Andor Technology), selecting 6–10 fields of view (FOVs) per sample containing metaphase spreads. In these FOVs, eccDNAs appear as individual fluorescence spots clearly separated from metaphase chromosomes and interphase nuclei, and not overlapping with the DNA signal.
Single-cell DNA-seq by Acoustic Cell Tagmentation (ACT)
To study the relationship between eccDNA production and DNA copy number alterations, we adapted the protocol for Acoustic Cell Tagmentation (ACT)44 to a non-acoustic nanodispensing device (I.DOT, Dispendix GmbH). Briefly, we FACS-sorted single nuclei into 384-well plates prefilled with 5 μL of Vapor-Lock (Qiagen, cat. no. 981611) per well. We lysed each nucleus in 150 nL of lysis buffer containing 20 mM Tris pH 8, 20 mM NaCl, 25 mM DTT, 0.15% Triton X-100, 1 mM EDTA, and 25 µg/mL Qiagen Protease (Qiagen, cat. no. 19157). After dispensing, we centrifuged the plate at 3000 × g for 3 min, vortexed it at 1000 rpm for 1 min, and then again centrifuged it at 3220 × g for 3 min. We repeated this after every dispensing step with the I.DOT. For lysis, we incubated the plate at 50 °C for 1 h followed by heat inactivation at 70 °C for 15 min. To neutralize EDTA in the lysis buffer, we dispensed 50 nL of 4 mM MgCl2 into each well and then vortexed and centrifuged the plate. For tagmentation, we dispensed 600 nL of tagmentation reaction mix containing Tagmentation DNA buffer (TD) and Amplicon Tagment Mix (ATM) at 2:1 v/v ratio (Nextera kit, Illumina, cat. no. FC-131-1096) into each well and performed tagmentation at 55 °C for 5 min followed by hold at 4 °C in a PCR thermocycler. To stop the reaction, we dispensed 200 nL of neutralization (NT) buffer into each well and incubated the plate for 5 min at room temperature. Lastly, we performed single-nucleus indexing by dispensing 1.35 μL of PCR master mix containing 1.3 μL of 2X Q5 Master Mix (New England Biolabs, cat. no. M0492L) and 50 nL of 100 mM MgCl2, and then 100 nL each of P5 and P7 Nextera index primers (Illumina, cat. no. 20027213, 20027214, 20042666, 20042667) into each well. PCR settings were as follows: 72 °C for 3 min; 98 °C for 20 s; 16 cycles of: 98 °C for 10 s, 62 °C for 1 min, 72 °C for 2 min; 72 °C for 5 min; hold at 4 °C.
Subsequently, we pooled the contents of all the wells of a 384-well plate together and purified the resulting library using AMPure XP beads (Beckman Coulter, cat. no. A63881) at 0.8 v/v ratio. We sequenced all the libraries in paired-end mode on a NovaSeq 6000 machine using the NovaSeq 6000 SP Reagent Kit v1.5 (300 cycles) (Illumina, cat. no. 20028400). See Supplementary Data 3 for a summary of sequencing statistics.
Computational methods
scCircle-seq data processing
We trimmed the sequencing reads by removing Nextera adapter sequences and overlapping R1-R2 read pairs. We then mapped the filtered reads to the human reference genome (GRCh38.p13) using the Burrows–Wheeler Aligner MEM (v0.7.17-r1188)46 with the -p flag. We removed duplicates with the MarkDuplicates module in Picard Tools (v2.25.5-2)47. We calculated the mapping rate using the flagstat option in SAMtools (v1.13-5)48. To calculate the enrichment of circular over linear DNA, we divided the number of reads mapped to the circular spike-in DNA by the total number of reads mapped to all spike-in DNA (circular and linear references). If the mapping rate of a sample was less than 90% and the enrichment of circular spike-in DNA was less than 80%, we discarded the sample from downstream analyses. The success rate was above 90% for all cell lines tested in this work.
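The spike-in enrichment and the sample-level filter described above can be illustrated with a minimal sketch. The function names and the use of plain read counts as input are assumptions made for illustration and are not part of the published pipeline. The discard rule follows the sentence above literally (both conditions must hold); a stricter reading would replace `and` with `or`.

```python
def circular_enrichment(circular_reads: int, linear_reads: int) -> float:
    """Fraction of spike-in reads that map to the circular spike-in reference."""
    total = circular_reads + linear_reads
    if total == 0:
        raise ValueError("no reads mapped to either spike-in reference")
    return circular_reads / total


def is_discarded(mapping_rate: float, enrichment: float,
                 min_mapping: float = 0.90, min_enrichment: float = 0.80) -> bool:
    """Literal reading of the rule: discard when both metrics fall below threshold."""
    return mapping_rate < min_mapping and enrichment < min_enrichment


# Hypothetical counts for one cell: 90 circular and 10 linear spike-in reads.
enrichment = circular_enrichment(90, 10)
print(enrichment, is_discarded(0.95, enrichment))
```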
CPR identification and classification
To identify circle-producing regions (CPRs), we used the same approach previously described for Circle-Seq13. Briefly, we first identified genomic regions enriched in raw reads using the findPeaks option in Homer49. We then merged the regions in the raw circle BED file to obtain the merged circle BED file. Next, we refined the borders of the CPRs using the closest option in BEDtools (v2.30.0)50 with the coverage calculated from the BAM file to get the final circle BED file. We extracted circle-supporting reads (i.e., discordant reads and split reads) from the called CPRs and filtered them with a threshold of mapQ > 20, while we removed R2R1 reads. Next, we identified chimeric junctions by extracting both ends of split reads and retained chimeric junctions with at least 2 recurrent reads within 500 base-pairs (bp) from each end of a CPR for visualization and downstream analysis. In parallel, we extracted circle-supporting reads overlapping the edge of CPRs. We calculated the circle read enrichment by dividing the number of reads mapped inside CPRs by the total number of reads.
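As an illustration of the junction filter, the following sketch keeps chimeric junctions supported by at least 2 split reads whose two ends lie within 500 bp of the two CPR borders. The data layout (one `(left, right)` coordinate pair per split read, on the same chromosome as the CPR) and the function name are hypothetical, and pairing the left end with the CPR start and the right end with the CPR end is one possible interpretation of the text.

```python
from collections import Counter


def recurrent_junctions(junctions, cpr_start, cpr_end, min_reads=2, window=500):
    """Return {junction: read_count} for junctions passing both filters.

    junctions: iterable of (left, right) coordinates, one entry per split read.
    """
    counts = Counter(junctions)  # identical coordinate pairs count as recurrent reads
    kept = {}
    for (left, right), n in counts.items():
        near_start = abs(left - cpr_start) <= window
        near_end = abs(right - cpr_end) <= window
        if n >= min_reads and near_start and near_end:
            kept[(left, right)] = n
    return kept


# Hypothetical split reads around a CPR spanning 900-5100.
reads = [(1000, 5000), (1000, 5000), (1200, 5000), (3000, 5000), (3000, 5000)]
print(recurrent_junctions(reads, cpr_start=900, cpr_end=5100))
```

In this toy input, only the junction seen twice near both borders survives: the singleton is dropped for lack of recurrence, and the internal junction is dropped for lying more than 500 bp from the CPR start.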
To classify the identified CPRs, we first merged the single-cell BAM files into a pseudo-bulk BAM file for each cell type. For every CPR called in the pseudo-bulk sample, \(j\), we calculated the raw frequency of occurrence for this CPR as:
$$f_{j\text{-raw}} = \frac{N_{\text{pos}}}{N_{\text{tot}}}$$
(1)
where \(N_{\text{pos}}\) is the number of cells containing circular DNAs that overlap at least 10% of the corresponding CPR, and \(N_{\text{tot}}\) is the total number of cells profiled by scCircle-seq for the same cell line. For each single cell, \(i\), we calculated the Jaccard index \(J_{ij}\) between the CPR \(j\) in the pseudo-bulk sample and the corresponding overlapping circles in the single cell. We then calculated the mean Jaccard index \(J_{j\text{-raw}}\) by averaging the \(J_{ij}\) over all the cells. Next, we normalized \(f_{j\text{-raw}}\) and \(J_{j\text{-raw}}\) to the frequency of occurrence of mitochondrial DNA, \(f_{\text{mt}}\), and the Jaccard index of mitochondrial DNA, \(J_{\text{mt}}\), as follows:
$$f_{j\text{-norm}} = \frac{f_{j\text{-raw}}}{f_{\text{mt}}}$$
(2)
$$J_{j\text{-norm}} = \frac{J_{j\text{-raw}}}{J_{\text{mt}}}$$
(3)
and set values larger than 1 to 1. For each CPR, \(j\), we calculated a uniformity score, \(U_j\), as:
$$U_j = f_j \cdot J_j$$
(4)
We classified CPRs as high-frequency high uniformity (HFHU) when \(f > 0.65\) and \(U > 0.3\).
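The classification in Eqs. (1) to (4) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: intervals are `(start, end)` tuples on a single chromosome, cells without any overlapping circle contribute a Jaccard index of 0 to the mean, the Jaccard index is computed against the union of the overlapping circles, and the f and J in Eq. (4) are taken to be the normalized, capped values from Eqs. (2) and (3). All function names are hypothetical.

```python
def overlap_len(a, b):
    """Length of the overlap between two (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def inter_union(cpr, circles):
    """Intersection and union (in bp) of a CPR with the circles overlapping it."""
    hits = sorted(c for c in circles if overlap_len(cpr, c) > 0)
    merged = []  # overlapping circles merged into disjoint intervals
    for s, e in hits:
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    inter = sum(overlap_len(cpr, m) for m in merged)
    union = (cpr[1] - cpr[0]) + sum(e - s for s, e in merged) - inter
    return inter, union


def raw_scores(cpr, cells, min_frac=0.10):
    """Eq. (1) and the mean Jaccard index for one CPR; cells is a list of circle lists."""
    n_pos, jaccards = 0, []
    for circles in cells:
        inter, union = inter_union(cpr, circles)
        jaccards.append(inter / union)
        if inter >= min_frac * (cpr[1] - cpr[0]):
            n_pos += 1
    return n_pos / len(cells), sum(jaccards) / len(jaccards)


def classify(f_raw, j_raw, f_mt, j_mt, f_cut=0.65, u_cut=0.3):
    """Eqs. (2)-(4): normalize to mitochondrial DNA, cap at 1, score, and call HFHU."""
    f = min(1.0, f_raw / f_mt)
    j = min(1.0, j_raw / j_mt)
    u = f * j
    return f, u, (f > f_cut and u > u_cut)


# Four hypothetical cells; mitochondrial reference values set to 1 for simplicity.
cells = [[(0, 1000)], [(0, 500)], [(2000, 3000)], [(0, 1000)]]
f_raw, j_raw = raw_scores((0, 1000), cells)
print(f_raw, j_raw, classify(f_raw, j_raw, f_mt=1.0, j_mt=1.0))
```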
Intersection of CPRs with enhancers
For each cell line, we downloaded the files with genomic regions containing enhancers from EnhancerAtlas 2.031. We then intersected the CPRs called in single cells with the list of enhancer regions using BEDtools (v2.26.0)50 to identify enhancer-containing circles in each cell. Next, we computed the enhancer fraction as the normalized read count from the enhancer-containing circles. Lastly, we performed motif enrichment analysis on the CPRs overlapping with enhancers using the findMotifsGenome.pl tool with a background file with comparable GC-content and genomic size.
Topic modeling and dimensionality reduction
First, we used SAMtools48 to filter out reads in the single-cell BAM files that mapped outside the CPRs called in the corresponding single-cell sample. Then, we counted the number of retained reads in 2 kilobase (kb) genomic bins along the genome and merged the counts into a single matrix. After normalizing based on the total number of reads in each sample (single cell) and filtering out bins without any read counts, we used the matrix as input for cisTopic36 with default parameters for model training. We selected the best model based on the log likelihood, the second derivative of the likelihood curve, and the perplexity. Next, we subjected the topics obtained from cisTopic to dimensionality reduction and visualization using Uniform Manifold Approximation and Projection (UMAP)42. We annotated selected topics using the getSignatureRegions command in cisTopic with minOverlap set to 0.4. Lastly, we calculated the enrichment of genomic features across all single cells using the AUCell_buildRankings and signatureCellEnrichment commands in cisTopic. For differential topic analysis, we extracted the topic-cells matrix and used it as input for DESeq251. Then we selected the topic pairs with adjusted P value lower than 0.5 and fold-change greater than 2.
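The construction of the bin-by-cell matrix described above can be sketched as follows. The input layout (a dictionary mapping each cell to the `(chromosome, position)` of its CPR-filtered reads) and the function names are assumptions made for illustration; in practice the positions would be read from the filtered BAM files. Bins that receive no reads in any cell never enter the matrix, which corresponds to the bin-filtering step.

```python
from collections import Counter

BIN_SIZE = 2000  # 2 kb bins, as in the text


def bin_counts(read_positions, bin_size=BIN_SIZE):
    """Count reads per (chromosome, bin index) for one cell."""
    return Counter((chrom, pos // bin_size) for chrom, pos in read_positions)


def build_matrix(cells, bin_size=BIN_SIZE):
    """Return (bins, cell_ids, matrix); rows are bins, columns are cells.

    Each entry is the read count in the bin divided by the cell's total reads.
    """
    per_cell = {cid: bin_counts(reads, bin_size) for cid, reads in cells.items()}
    bins = sorted(set().union(*per_cell.values()))  # only bins with >= 1 read
    cell_ids = sorted(per_cell)
    totals = {cid: sum(per_cell[cid].values()) for cid in cell_ids}
    matrix = [
        [per_cell[cid][b] / totals[cid] if totals[cid] else 0.0 for cid in cell_ids]
        for b in bins
    ]
    return bins, cell_ids, matrix


# Two hypothetical cells with a handful of read start positions.
cells = {
    "c1": [("chr1", 100), ("chr1", 1999), ("chr1", 2000), ("chr2", 50)],
    "c2": [("chr1", 2500)],
}
bins, cell_ids, matrix = build_matrix(cells)
print(bins, cell_ids, matrix)
```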
SMART-seq2 data analysis
We first mapped the reads to the human reference transcriptome (GRCh38) using HISAT2 (v2.1.0)52 and quantified and merged the single-cell RNA counts with RSEM53. We then used the merged matrix as input for Seurat (v4.0)54 for all subsequent analyses.
ChIP-seq data analysis
We downloaded ChIP-seq data for various histone modifications in the cell lines used in this study from the Encyclopedia of DNA Elements (ENCODE) (www.encodeproject.org) and the National Institutes of Health (NIH) Sequence Read Archive (SRA) portal (https://www.ncbi.nlm.nih.gov/sra). To calculate the enrichment of eccDNAs over specific genomic features we used the computeMatrix tool in deepTools (v3.5.0)55 with the scale-regions flag and visualized them with the plotProfile tool in deepTools.
ACT data pre-processing and copy number calling
We demultiplexed raw sequence reads to fastq files using the BaseSpace Sequence Hub cloud service of Illumina. Following this, we aligned the reads to the Hg38 reference genome using bwa-mem (version 0.7.17-r1188)46. Next, we deduplicated reads using gatk MarkDuplicates (version 4.2.5.0)56. To call absolute copy numbers in single cells we used ASCAT.sc (https://github.com/VanLoo-lab/ASCAT.sc). Briefly, we binned the genome in 240 kilobase (kb) bins and counted the number of reads in each bin, discarding cells with fewer than 300,000 reads. We then normalized binned read counts for GC-content using LOESS smoothing. We segmented GC-corrected read counts using the multipcf function from the Copynumber package (version 1.29.0.9)57 with a penalty of 6. Finally, we inferred integer copy numbers using a grid search between different purity and ploidy values (purity being set to 1 due to single-cell data) and selecting the best goodness-of-fit. A small proportion of copy number profiles had extremely high and inconsistent absolute copy numbers and were filtered out by calculating the average copy number for all cells and removing cells with an average copy number >2.8.
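To make the final two steps concrete, here is a deliberately simplified sketch of a ploidy grid search with purity fixed to 1, followed by the mean copy number filter. It is not the ASCAT.sc implementation: the goodness-of-fit used here (size-weighted squared distance of scaled segment ratios to the nearest integer), the tie-breaking toward the lowest ploidy, and all function names are assumptions chosen for illustration. Segment ratios are segment read depths divided by the genome-wide mean depth.

```python
def fit_ploidy(segment_ratios, segment_sizes, ploidies):
    """Grid search: pick the ploidy whose scaled ratios lie closest to integers.

    Returns (ploidy, error, integer_copy_numbers). With purity = 1, the expected
    copy number of a segment is simply ratio * ploidy.
    """
    total = sum(segment_sizes)
    best = None
    for p in ploidies:
        scaled = [r * p for r in segment_ratios]
        err = sum(w * (c - round(c)) ** 2 for c, w in zip(scaled, segment_sizes)) / total
        if best is None or err < best[1]:  # strict '<' keeps the lowest ploidy on ties
            best = (p, err, [round(c) for c in scaled])
    return best


def mean_copy_number(copy_numbers, segment_sizes):
    """Size-weighted average copy number of one cell."""
    weighted = sum(c * w for c, w in zip(copy_numbers, segment_sizes))
    return weighted / sum(segment_sizes)


# Hypothetical cell with three segments (sizes in bins).
ratios, sizes = [1.0, 1.5, 0.5], [10, 5, 5]
ploidy, err, cn = fit_ploidy(ratios, sizes, [1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
keep = mean_copy_number(cn, sizes) <= 2.8  # cells above 2.8 are removed
print(ploidy, cn, keep)
```

Note that any integer multiple of a perfectly fitting ploidy fits equally well under this criterion, which is why the sketch resolves ties in favor of the lowest candidate.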
Statistics & reproducibility
No statistical method was used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment. We excluded cells yielding low-quality sequencing data from the analyses, defining low-quality cells as those in which fewer than 100 circle-producing regions (CPRs) were found and sequence data mappability was below 70%.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
- Source: https://www.nature.com/articles/s41467-024-45972-y