Search
Close this search box.

Bidirectional epigenetic editing reveals hierarchies in gene regulation – Nature Biotechnology

Cell culture of cell lines

Lenti-X HEK293T (Clontech) cells were cultured in DMEM (Gibco) with L-glutamine and sodium pyruvate supplemented with 10% FBS (Gibco) and 1% penicillin–streptomycin (Gibco) and passaged using TrypLE Express (Gibco). K562 (American Type Culture Collection (ATCC), CCL-238) was cultured in RPMI 1640 (Gibco) with L-glutamine supplemented with 10% FBS and 1% penicillin–streptomycin. Jurkat clone E6-1 cells (ATCC, TIB-152) were cultured in RPMI 1640 with L-glutamine (Gibco) supplemented with 10% FBS, 10 mM HEPES (Gibco), 1 mM sodium pyruvate (Gibco) and 1% penicillin–streptomycin. Cells were routinely tested for mycoplasma using a MycoAlert PLUS Detection Kit (Lonza) and found to be negative.

Isolation and culture of primary human T cells

Human T cells were sourced from peripheral blood mononuclear cell (PBMC)-enriched leukapheresis products (Leukopaks, STEMCELL Technologies) from healthy donors, after institutional review board (IRB)-approved informed written consent (STEMCELL Technologies). T cell populations (bulk or CD4+ memory cells) were isolated from Leukopaks using EasySep magnetic selection following the manufacturer’s recommended protocol (STEMCELL Technologies, 100-0695, 19157). T cells were cultured in X-VIVO 15 (Lonza) supplemented with 5% FBS and 100 IU ml−1 recombinant human IL-2 (AmerisourceBergen).

CRISPRai construct generation

The CRISPRai construct was cloned in the following format: TRE3G-VPR-dSaCas9-P2A-dSpCas9-BFP-KRAB-EF1a-Bleo-T2A-rtTA. The vector containing the TRE3G and Tet-On system was PiggyBac; the zeocin resistance gene and the Tet-On 3G transactivator were driven by the EF1a promoter (gifted by the Stanley Qi laboratory)98. The Super PiggyBac transposase plasmid was obtained from System Biosciences. VPR was obtained from pSLQ2349 (gifted by the Stanley Qi laboratory); dSaCas9 was obtained from pSLQ2840 (Addgene, 84246); and dSpCas9-BFP-KRAB was obtained from pHR-SFFV-dCas9-BFP-KRAB (Addgene, 46911). The ZNF10 (KOX1) KRAB domain7 was used. Constructs were cloned using Gibson Assembly (NEBuilder HiFi DNA Assembly) and confirmed by Sanger sequencing (Elim Biopharmaceuticals). Primers and oligos were obtained from Elim Biopharmaceuticals and Integrated DNA Technologies (IDT). Selected constructs are available on Addgene (https://www.addgene.org/Howard_Chang/).

CRISPR gRNA cloning

Primers and oligos for bulk validation experiments were obtained from Elim Biopharmaceuticals and IDT. Plasmids were confirmed by Sanger sequencing (Elim Biopharmaceuticals). Individual single gRNAs were cloned using Gibson Assembly (NEBuilder HiFi DNA Assembly). For validation and Perturb-seq experiments, gRNAs were constructed from pSLQ2853-3 pHR: U6-Sasgv2CXCR4-1 CMV-EGFP (Addgene, 84254) and pSLQ1852-2 pHR: U6-SpsgCD95-1 CMV-EGFP (Addgene, 84151). For dSaCas9 gRNAs, GFP was replaced with mScarlet (pmScarlet_Giantin_C1; Addgene, 85048).

For Perturb-seq single gRNAs, gRNAs pools were constructed from two gRNA backbones, with the dSpCas9 or dSaCas9 gRNA scaffold. Pools were cloned in arrayed format by ordering top and bottom approximately 31–33-bp gRNA oligos from IDT with appropriate overhangs. Top and bottom oligos were combined at 100 mM in annealing buffer (potassium acetate, 30 mM HEPES-KOH pH 7.4 and 2 mM magnesium acetate in water, adapted from Jonathan Weismann laboratory protocols, https://weissman.wi.mit.edu/resources/), annealed on a thermocycler at 95 °C for 4 min, cooled slowly for 3 h, pooled, phosphorylated using T4 PNK (NEB) at 37 °C for 30 min with 65 °C for 20-min PNK inactivation, ligated into the previously digested and dephosphorylated (Fast AP, Thermo Fisher Scientific) lentiviral gRNA backbone using T4 ligase (NEB) and transformed by heat shock into Stbl3 competent cells (Thermo Fisher Scientific).

For Perturb-seq double gRNAs, gRNA pools were constructed in a two-step cloning process (Extended Data Fig. 2a). Oligo pools (IDT) containing approximately 200-bp oligos were cloned with the format: (amplification primer)-(digest site)-(gRNA1)-(scaffold1)-(hu6 landing pad)-(digest site)-(amplification primer). For step 1, oligo pools were PCR amplified in multiple reactions with low cycle number (NEB Ultra II Master Mix), digested and size selected via gel purification (E-Gel EX, Thermo Fisher Scientific), ligated into predigested gRNA backbones with T4 ligase overnight at 16 °C for 16 h and inactivated at 65 °C for 10 min and transformed into Stbl3 competent cells and grown at 30 °C. For step 2, plasmid products were digested, dephosphorylated and gel size selected, and the previously digested hu6 PCR fragment (from pMJ117; Addgene, 85997) with appropriate overhangs was inserted via T4 ligation. Original vector backbone and intermediate backbone product were designed for digestion with Esp3I (BsmBI, NEB), and inserts were designed for digestion with BsaI (NEB).

For the enhancer gRNA enrichment screen, double gRNA pools were constructed in a one-step cloning process (Extended Data Fig. 2b). Primer pools were obtained from IDT and contained gRNA sequences and primer sequences for dSpCas9 gRNA scaffold and the hu6 promoter. Primers were used to generate a PCR product in the format of [mu6 fragment-gRNA1-Sp gRNA scaffold-hu6-gRNA2-Sa gRNA scaffold fragment], flanked by BsmBI digestion sites. The PCR product and backbone were digested separately and ligated with T4 ligase following recommended protocols.

gRNAs are listed in Supplementary Tables 4 and 7 for Jurkat flow cytometry and ATAC-seq experiments (Validation Experiment A) and primary T cell flow cytometry experiments (Validation Experiment B). Two gRNAs per RE were used. For primary CAR T cell CRISPRi screens, the same gRNA pool was used as for the IL2 validation screen, which included eight gRNAs per enhancer.

Stable cell line generation

Stable cell lines were generated by electroporation via the Neon Transfection System (Thermo Fisher Scientific). Cells were electroporated using recommended parameters, recovered in fresh media for 3 d, selected with zeocin (Thermo Fisher Scientific) for 10 d and then analyzed by flow cytometry for BFP to confirm dCas9 cassette expression near 100% of cells.

qPCR

Brilliant II SYBR Green qPCR Master Mix (Agilent Technologies) was used. Primers (Elim Biopharmaceuticals) were validated before use by examining the melt curve. Analysis was performed using the ΔΔCt method, relative to the housekeeping gene ACTB and NTC gRNA controls. For ATAC–qPCR, Jurkat ATAC-seq libraries were used as input to qPCR, and optimal primers were designed in RE peaks using the ATAC Primer Tool83; one biological replicate of ATAC-seq was used as input to ATAC–qPCR due to sample volume constraints.

Lentivirus production

For cell line experiments, Lenti-X HEK293T cells were seeded on plates overnight to achieve 95% confluency at time of transfection and transfected with packaging plasmids psPAX2 (1.5 µg; Addgene, 12260) and pMD2.G (4.5 µg; Addgene, 12259) and viral expression vector (6 µg) per 10-cm plate using Opti-MEM (Gibco) and Lipofectamine 3000 transfection reagents (Thermo Fisher Scientific). Viral supernatant was collected at 48 h and concentrated using Lenti-X Concentrator (Clontech) following the manufacturer’s instructions, resuspended in cell culture media at 10× the original culture volume and stored at −80 °C.

For primary T cell experiments, similar steps were followed with the following modifications. Cells were seeded in Opti-MEM I Reduced Serum Medium with L-glutamine (Gibco) supplemented with 5% FBS, 1 mM sodium pyruvate (Gibco) and 1× non-essential amino acids (Gibco) (cOpti-MEM) in T25 flasks in 5 ml. Cells were transfected with psPAX2 (3.1 µg; Addgene, 12260), pMD2.G (1.5 µg; Addgene, 12259), expression vector (4.2 µg), Lipofectamine 3000 (20.1 µl) and P3000 (18.5 µl; Thermo Fisher Scientific) in 3.7 ml. At 6 h, media were replaced with cOpti-MEM supplemented with ViralBoost at 1:500 dilution (ALSTEM). Lentiviral supernatant was harvested 24 h and 48 h after transfection, centrifuged at 500g for 5 min at 4 °C to remove debris, concentrated with Lenti-X Concentrator and resuspended in Opti-MEM at 100× the original culture volume.

Flow cytometry and fluorescence-activated cell sorting

All antibodies were used at 1:20–1:200 dilutions. All cells were stained in flow cytometry staining buffer (eBioscience). FlowJo (version 10.6.1) software was used for all analysis. Cells were analyzed by flow cytometry (Attune NxT, Thermo Fisher Scientific, or LSR II, BD Biosciences) or sorted based on stained markers and gRNA expression (GFP or mScarlet) (FACSAria II, BD Biosciences). Fluorescence-activated cell sorting (FACS) was performed at the Stanford Shared FACS Facility.

For Jurkat intracellular cytokine staining, cells were stained with Zombie NIR viability dye at 1:1,000 dilution in PBS at 10 million cells per 100 µl for 15 min at 4 °C, washed, fixed using Cyto-Fast Fix/Perm Buffer Set (BioLegend) for 25 min at 22 °C, washed and stored in Cyto-Last Buffer (BioLegend) at 4 °C in the dark for 1–3 d. Before sorting, fixed cells were permeabilized and stained with IL2-BV711 (BioLegend, clone MQ1-17H12, cat. no. 500346, lot no. B354636) and IFNG-APC (BioLegend, clone B27, cat. no. 506510, lot no. B329616) antibodies for 45 min at 22 °C, washed with fix/perm buffer and resuspended in staining buffer. For Perturb-seq, cells were similarly stained with Zombie NIR fixable viability dye. For Jurkat validation CRISPRi experiments, cells were stained with CD3E-BV785 (BioLegend, clone OKT3, cat. no. 317329, lot no. B311209) or CD47-BV605 (BioLegend, clone CC2C6, cat. no. 323119, lot no. B300088) antibodies.

For primary T cell flow cytometry experiments, cells were stained with Ghost Dye Red 780 (Tonbo Biosciences), CD4-BV510 (BioLegend, clone OKT4, cat. no. 317444) and CD8-PerCP/Cyanine5.5 (BioLegend, clone SK1, cat. no. 344710), fixed and permeabilized with BD Cytofix/Cytoperm (BD Biosciences), stained for intracellular IL2 with IL2-APC (BioLegend, clone MQ1-17H12, cat. no. 500310) as described for Jurkat T cells and analyzed by flow cytometry (Attune NxT, Thermo Fisher Scientific). Plots shown are for live gated cells from a culture of CD3+ T cells (from which CD4+ and CD8+ are gated) or pre-isolated memory CD4+ cells. For CD4+ and CD8+ cell analysis, data were normalized to NTC cells on a per-donor basis. For memory CD4+ cell analysis, data were normalized to NTC and unstimulated cells on a per-donor basis. Perturbation strength was calculated by additionally normalizing by normalized TSS percent IL2+ values on a per-donor basis. Jurkat validation flow cytometry data were analyzed similarly.

For primary T cell pooled gRNA screens, cells were stained with Ghost Dye Red 780 (Thermo Fisher Scientific), fixed and permeabilized with Cyto-Fast Fix/Perm Buffer Set (BioLegend) and stained for intracellular IL2 (BioLegend, clone MQ1-17H12, cat. no. 500346, lot no. B354636). CD4+ memory primary T cell phenotype was verified using the following cell surface markers: CD3-PE (BioLegend, clone UCHT1, cat. no. 300441); CD4-BV511 (BioLegend, clone OKT4, cat. no. 317444); CD8-PerCP/Cyanine5.5 (BioLegend, clone SK1, cat. no. 344710); CD45RA-BV711 (BioLegend, clone HI100, cat. no. 304138); CD45RO-FITC (BioLegend, clone UCHL1, cat. no. 304204); CD62L-PE/Cy7 (BioLegend, clone DREG-56, cat. no. 304822); and CCR7-BV421 (BioLegend, clone G043H7, cat. no. 353208).

Pooled K562 and Jurkat screening

Cells were infected with lentivirus gRNA pools in polybrene (8 µg ml−1) at a multiplcity of infection (MOI) of 0.1 (K562) or 0.2 (Jurkat), as confirmed by flow cytometry for GFP or mScarlet expression on days 2 and 3 after infection. Dox (1 µg ml−1) was added at the time of infection or 6 d before the screen endpoint and refreshed every 24 h. For the K562 screen, cells were expanded for 6 d after infection and frozen in aliquots on day 6 in CryotStor CS10 (STEMCELL Technologies). Before sorting, cells were thawed and allowed to recover in culture in dox+ media for 18 h and then sorted for live, gRNA+ cells.

For Jurkat screens, 0.5 µg ml−1 puromycin (Thermo Fisher Scientific) was added on day 3 after infection, puromycin selected for 4 d and confirmed by flow cytometry to have near 100% gRNA expression. On day 7, dox induction was started and continued for 6 d. On day 13, cells were activated at approximately 2–4 million cells per millilter for 8 h using CD3 antibody (BioLegend, clone OKT3, cat. no. 317347, lot no. B338622) coated tissue culture plates and media containing dox (1 µg ml−1), CD28 antibody (3 µg ml−1; BioLegend, clone CD28.2, cat. no. 302943, lot no. B335272), PMA (1×), ionomycin (1×) and Brefeldin A (1×) (PMA/iono/BrefA were used from Cell Activation Cocktail, BioLegend). gRNA+ cell number (accounting for MOI) was maintained at 1,000× the number of gRNAs included in the gRNA pool throughout the screen. For the initial screen, the above steps were modified to begin dox induction at the time infection; puromycin selection was performed from day 3 to day 7; and the screen was stopped on day 7.

Single-cell library preparation

Single and double perturbations were performed in separate single-cell captures. Sorted cells were prepared using the Chromium Next GEM Single Cell 5′ Kit v2, Chromium Next GEM Chip K Single Cell Kit and Library Construction Kit (10x Genomics), following the Chromium Next GEM Single Cell 5′ Reagent Kits v2 (Dual Index) with Feature Barcoding user guide (CG000330 Rev A).

GEX libraries were constructed as recommended. For gRNA detection, oligos complementary to each of the gRNA scaffolds (Sa and Sp) were spiked into the RT reaction at 11.43 pmol each.

Sa: AAGCAGTGGTATCAACGCAGAGTACacaagttgacgagataaacacgg

Sp: AAGCAGTGGTATCAACGCAGAGTACcgactcggtgccactttttc

For step 2.2, cDNA primers were used (green; 10x Genomics, PN 2000089) instead of feature cDNA primers (purple; 10x Genomics, PN 2000277). For step 2.3, GEX is in the pellet (2.3 A), and gRNAs are in the supernatant (2.3B); both portions were retained; and library construction was performed separately. gRNA library construction was performed using a custom PCR protocol, and Sa and Sp gRNA libraries were constructed separately. PCR1: outer nested PCR, F CTACACGACGCTCTTCCGATCT, R_sa acaagttgacgagataaacacgg, R_sp CGACTCGGTGCCACTTTTTC (98 °C for 3 min; 20 cycles at 98 °C for 20 s, 66 °C (Sa)/68 °C (Sp) for 30 s and 72 °C for 20 s; and 72 °C for 5 min). PCR2: inner nested PCR and adapter common region addition, F same primer as PCR1,

R_sa GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTgataaacacggcattttgccttg,

R_sp GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcaagttgataacggactagcctt

(same cycling conditions as PCR1, with annealing temperatures 66 °C (Sa)/65 °C (Sp)). PCR3: sample index PCR, P5 and P7 Dual Index TT Set A (98 °C for 3 min; 15 cycles at 98 °C for 20 s, 54 °C for 30 s and 72 °C for 1 min; and 72 °C for 5 min). After each PCR, products were run on E-Gel EX 2% agarose and size selected.

Design parameters for single-cell screens

CRISPRai was specifically designed to be highly scalable, and there is no inherent limitation on the number of perturbations CRISPRai can perform. Similar to direct capture Perturb-seq, CRISPRai screens have a tradeoff between the number of targets in the pool and the number of single cells that the user wants to assay at once. CRISPRai is highly scalable because we leverage the simultaneous direct capture of two gRNAs, which enables pooled cloning and virus production. Using current commercially available technologies, CRISPRai can be scaled to thousands of perturbations, as recently demonstrated by Replogle et al.91 for genome-scale direct capture Perturb-seq, which can be further expanded using emerging technologies. To maximize CRISPRai Perturb-seq data quality, we suggest the following: (1) analyze at least 40–50 cells per gRNA genotype; (2) anticipate approximately 50% efficacy of dual gRNA detection (for example, plan for sequencing 80–100 cells per gRNA to yield 40–50 cells with high-confidence gRNA detection) to accommodate the lower gRNA detection rate of two compared to one gRNA per cell; (3) include two or more gRNAs per gene to enable gRNA correlation analysis; and (4) incorporate single perturbation (CRISPRi and CRISPRa) controls to enable genetic interaction analysis and to identify which gene pairs are amenable to bidirectional control.

CRISPR gRNA enrichment library preparation

Genomic DNA (gDNA) was extracted from sorted cells for different cytokine populations. Initial screen: IL2+IFNG (IL2), IFNG+IL2 (IFNG), IL2+IFNG+(DP), IL2IFNG (NEG) and unsorted (UN) cells. Validation screen: IL2+ (IL2), IL2 (NEG) and UN cells. Cells were washed with PBS and resuspended in 1× lysis buffer (10 mM Tris pH 8, 5 mM EDTA, 0.5% SDS, 1× (0.4 mg ml−1) Proteinase K) (Thermo Fisher Scientific) in water at 10 million cells per 800 µl, incubated at 55 °C for 2 h and then 65 °C for 16–20 h overnight. Samples were then cooled to room temperature for 10 min, and Triton X-100 (Sigma-Aldrich) was added to a final concentration of 0.5%. The number of cells per population used for gDNA extraction was 0.2–15 million and 10–20 million for the initial and validation screens, respectively. For samples with more than 2 million sorted cells, gDNA was then purified using the Quick-DNA Miniprep Kit (Zymo Research), following the ‘Cell Suspensions and Proteinase K Digested Samples’ recommended protocol. For samples with fewer than 2 million cells, a precipitate-based method was used for gDNA extraction. After addition of Triton X-100 and sodium acetate to 10%, 2.5× volumes of 100% EtOH was added; samples were placed at −20 °C for 1 h followed by centrifugation at 20,000g for 15 min at 4 °C; the supernatant was removed; 75% EtOH was added; centrifugation was performed again; and pellets were dried overnight at room temperature and resuspended in elution buffer.

Library preparation from gDNA was performed by three PCR steps. PCR1: multiple reactions per sample were set up with 2 µg or less of gDNA with outer nested primers complementary to the gRNA cassette (98 °C for 3 min; 14 cycles of 98 °C for 20 s, 58 °C for 20 s and 72 °C for 40 s; and 72 °C for 2 min) and concentrated with DNA Clean & Concentrator (Zymo Research). PCR2: inner nested primers (98 °C for 30 s; six cycles of 98 °C for 15 s, 60 °C for 15 s and 72 °C for 45 s; and 72 °C for 2 min) and size selected using SPRI beads 0.75× cleanup. PCR3: Tru-seq-based indexing primers (98 °C for 30 s; six cycles of 98 °C for 15 s, 63 °C for 15 s and 72 °C for 45 s; and 72 °C for 2 min) and size selected using SPRI beads 0.75× cleanup. After each PCR, products were checked on E-Gel EX 2% agarose.

Primer sequences:

PCR1

mU6_outer_fw: cagcacaaaaggaaactcaccctaactgtaaag

sasgRNA_PCR_3Rev: tctcgccaacaagttgacgagataaaca

PCR2

p7_saRNA_stagger2_rev: GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTccttgttatagtagattctgtttccagagtactaTAAC

p7_saRNA_stagger1_rev: GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcttgttatagtagattctgtttccagagtactaTAAC

p7_saRNA_stagger0_rev: GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtgttatagtagattctgtttccagagtactaTAAC

p5_mU6_0nt_stagger: ACACTCTTTCCCTACACGACGCTCTTCCGATCTtcccttggagaaccaccttgt

p5_mU6_1nt_stagger: ACACTCTTTCCCTACACGACGCTCTTCCGATCTCtcccttggagaaccaccttgt

p5_mU6_2nt_stagger: ACACTCTTTCCCTACACGACGCTCTTCCGATCTGCtcccttggagaaccaccttgt

ATAC-seq in Jurkat T cells

Jurkat CRISPRai T cells were transduced with individually cloned gRNAs (two gRNAs per RE) and processed under the same conditions as the Jurkat enhancer pooled screens. On the day of collection, cells were harvested for bulk ATAC-seq library preparation according to published protocols99. ATAC-seq reads were aligned to reference genome hg19 with Bowtie 2 (ref. 100) (version 2.3.4.1) using the parameter –very-sensitive. Data were filtered to remove mitochondrial reads, retain proper pairs (-f 0×2) and remove ambiguously mapped reads (MAPQ > 10, -q 10). BAM files were sorted and indexed with SAMtools (version 1.8). BedGraph coverage files were generated using bamCoverage from deepTools (version 3.3.1_py36)101 with parameters –numberOfProcessors 10–binSize 50–normalizeUsing CPM–region chr4. For quantification, data were further normalized by the total signal for chr4 per sample using a pseudocount of 1 × 10−4 and scaled to 1 × 106.

Primary T cell CRISPRi experiments and pooled screen

The CRISPRi plasmids used for primary T cell experiments were SFFV-ZIM3KRAB-dCas9-2A-mCherry or SFFV-ZIM3KRAB-dCas9-BlastR. To generate these plasmids, we replaced dCas9-VP64 on Lenti-SFFV-dCas9-VP64-2A-mCherry (Addgene, 180263) with ZIM3KRAB-dCas9 from Addgene, 154472, using Gibson assembly. The ZIM3 KRAB domain was used. Next, mCherry was replaced with BlastR (Addgene, 52962) using Gibson assembly. The Lenti 1928z CAR construct was a gift from Dan Goodman. The high-affinity HA-GD2-28z CAR sequence was a gift from the Crystal Mackall laboratory82 and was cloned into the Lenti-1928z plasmid, replacing the 1928z CAR with the HA-GD2-28z CAR using Gibson assembly.

For all primary T cell experiments, cells were activated on day 0 using anti-human CD3/CD28 CTS Dynabeads (Thermo Fisher Scientific) at a 1:1 cell:bead ratio at 1 million cells per milliliter. Cells were transduced with each lentivirus sequentially after Dynabead activation: dCas9-KRAB at 18 h, CAR constructs at 26 h (when added) and gRNAs at 40 h. On day 9, cells were reactivated with ImmunoCult Human CD3/CD28/CD2 T Cell Activator (STEMCELL Technologies) with 6.25 μl ml−1 culture medium at 2 million cells per milliliter. One hour after reactivation, GolgiPlug Protein Transport Inhibitor (BD Biosciences) was added at a 1:1,000 dilution, and, after 7 h, cells were stained for cell surface proteins, fixed and permeabilized and stained for intracellular cytokines.

For arrayed primary T cell flow cytometry experiments, the above steps were followed with the following modifications. Fresh Leukopak cells were pre-enriched for CD3+ T cells using an EasySep Human T Cell Isolation Kit (STEMCELL Technologies) before experiments.

For pooled primary T cell screens, the above steps were followed with the following modifications. Fresh Leukopak cells were pre-enriched for CD4+ memory T cells using an EasySep Human Memory CD4+ T Cell Enrichment Kit (STEMCELL Technologies) before experiments. CD4+ memory T cell phenotype was verified by flow cytometry immediately after isolation using cell surface markers CD3, CD4, CD8, CD45RA, CD45RO, CD62L and CCR7. Cells were treated with 10 µg ml−1 blasticidin for 6 d starting on day 3 after activation. Cells were collected on day 9 and stained for live/dead and intracellular IL2. IL2 and IL2+ populations were sorted by FACS; gDNA was isolated; and gRNA enrichment libraries were constructed as described for Jurkat T cell screens.

Sequencing

Library quality was checked by Bioanalyzer (Agilent Technologies) and quantified by KAPA Library Quantification Kit (Roche). Sequencing was performed on a NovaSeq 6000 (Illumina, Novogene) or a NextSeq 550 (Illumina). For single-cell Perturb-seq libraries, libraries were sequenced at approximately 6,000 reads per cell for gRNA and approximately 30,000–50,000 reads per cell for GEX. For the Jurkat enhancer screens, gRNA enrichment libraries were sequenced at approximately 1.5 million reads per sample for the initial screen and approximately 7.5 million reads per sample for the validation screen (~1,200 reads per gRNA after filtering), and gRNA1 and gRNA2 in the double gRNA cassette were sequenced in R1 and R2 paired reads, respectively, and paired in silico. For primary T cell enhancer screens, gRNA enrichment libraries were sequenced at 2.5–3 million reads per sample (on average ~50,000 reads per gRNA and minimum ~2,400 reads per gRNA). For bulk Jurkat ATAC-seq, libraries were sequenced at more than 29 million reads per sample.

Single-cell gRNA and transcriptome analysis

scRNA-seq reads were aligned to the GRCh38 reference genome and quantified using cellranger count (10x Genomics, version 5.0.0). CRISPR gRNA expression was quantified using cellranger count (10x Genomics, version 5.0.0) by specifying gRNA sequences and corresponding genes in features.csv. Downstream data analysis was done in R (version 3.6.1) using Seurat (version 3.2.3).

Data from five total captures were combined to one Seurat object. Cells were filtered: number of genes > 200, number of genes < 5,500–8,300, transcriptome unique molecular identifiers (UMIs) < 27,000–75,000, percent mitochondrial reads < 10%, detected gRNAs > 20 (background signal distribution) and gRNA UMIs > 50, with exact parameters differing for each capture for ranges listed above. gRNA labels for each cell were assigned based on cellranger feature calls. Only cells with one or two cellranger-detected gRNAs were retained for single gRNA and double gRNA captures, respectively. gRNA groups with fewer than 250 cells with target gene detected or with low cell numbers (n < 20) were removed. gRNA pools contained two gRNAs per gene for single perturbations and one gRNA per gene for double perturbations. For each gene, the gRNA with higher magnitude log2FC in the single perturbations was used for the double perturbation gRNA. If the two gRNAs for a given gene were not concordant in target gene log2FC expression for single perturbations, only the gRNA with greater magnitude change was retained for analysis. For these reasons, the following gRNAs were removed from the dataset: CEBPA.a1, CEBPA.a2, MAP2K3.a1, MYC.a1, MYC.a2, MYC.e1.a1, MYC.e1.a2, SPI1.a2, RIPK2.a2, ATF5.i1, CEBPB.i1 and FOSL1.i2. Gene expression was log normalized with a scaling factor of 1 × 104. gRNA expression was normalized using relative counts with a scaling factor of 100. To quantify the number of gRNAs expressed per cell above a certain expression level threshold, we applied a threshold of 20% and 10% of total gRNA expression reads for cells expected to have single and double perturbations, respectively, and applied these thresholds to cells after filtering out cells without any gRNA expression and after filtering for quality control metrics described previously. We estimated gRNA detection false-negative rate (FNR, defined as true double but detected to be single) and false-positive rate (FPR, defined as true single but detected to be double) to be 48% and 29%, respectively, using non-filtered data. It should be noted that high FNRs are expected for single-cell data due to dropout. FPRs and FNRs can be corrected for by grouping single and double perturbation cells via cell hashing antibodies88 or separate captures that impart separate sample barcodes.

For Fig. 1, only gRNA groups with more than 40 cells were included. Differential expression for CRISPR target genes was performed FindMarkers() using normalized counts and a logistic regression model with batch as a latent variable. Batch was defined as the day on which 10x captures were performed, either day 1 or day 2. For Fig. 2, the top 2,000 most variable genes were found using variance stabilization transformation (vst). All genes were centered and scaled, and batch and percent mitochondrial reads were regressed out using ScaleData(). Principal component analysis (PCA) was performed on the top 2,000 most variable genes, followed by nearest neighbor graph construction, cluster determination using the original Louvain algorithm and UMAP dimensionality reduction using the top principal components (PCs). All further analyses were performed with regression on batch as the only latent variable except for UMAP reduction of 24,661 cells, which was regressed on batch and percent mitochondrial reads.

Next, the subset of cells with SPI1 and GATA1 gRNAs was retained, and variable gene selection was repeated. Perturbation-driven cells were identified as clusters that were not composed of equal representations from all gRNA groups. Non-perturbation-driven cells were removed, and variable feature selection, PCA, neighbor graph construction, clustering and UMAP reduction were performed again. All DE testing was performed on either all genes or genes in the indicated TF target gene sets using normalized counts and logistic regression with batch as a latent variable. For module score analysis, ENCODE TF target gene sets for SPI1 and GATA1 were downloaded from Harmonizome70,71,72, and genes were identified as being unique to either set or shared. Erythroid (n = 419, human bone marrow CD34 negative-selected and GYPA positive-selected erythroblasts, single-cell RNA-seq68) and myeloid (n = 394, human peripheral blood LIN(CD3, CD19, CD56)CD14+/lo monocytes, single-cell RNA-seq69) gene sets were obtained from the literature. Module scores were calculated using AddModuleScore() using normalized, scaled and batch-regressed counts. GO term enrichment was performed with clusterProfiler (version 3.14.0) enrichGO().

For all DE gene analysis, statistical significance was determined by genes passing P_adj < 0.05 and abs(log2FC) > 0.5. For analysis of regulatory modes for downstream target genes of SPI1 and GATA1, DE genes were filtered for statistical significance. Regulatory modes for downstream target genes were defined using the following thresholds (difference = log2FC observed − log2FC expected from additive model for double perturbation): synergy difference > 0.1 (greater magnitude than expected or opposite sign than expected) and buffer difference > 0.7 (lower magnitude than expected), and the remaining genes were classified as additive. For TF ChIP-seq analysis, the top 50 genes with the most additive phenotype were selected. The random gene subset was generated by randomly selecting 300 genes detected in the Perturb-seq experiment that were not contained in the SPI1 and GATA1 DE gene sets. For SPI1 and GATA1 TF ChIP-seq analysis, bigWig files containing ‘fold change over control’ were downloaded from ENCODE70,71: ENCFF080RWW, ENCFF838RXA and ENCFF334KVR (GATA1) and ENCFF172UZW, ENCFF454PTX and ENCFF216QNX (SPI1). log2FC was calculated with a pseudocount of 0.01. log2FC ChIP-seq signal of GATA1 or SPI1 within 1 kb of the promoter (2-kb window) or within ABC model enhancers for a given gene was calculated by taking the average signal across the RE. A gene was classified as being bound by a TF if one or more REs (including promoter or enhancers) had log2FC ChIP-seq signal > 5 (normalized to input).

All functions referenced above are from Seurat unless noted otherwise. Statistical testing was performed using stat_compare_means() from ggpubr or FindMarkers() and FindAllMarkers() from Seurat. All plots were generated in R using Seurat, ggplot2 (version 3.3.2), ggpubr (version 0.2.4) and pheatmap (version 1.0.12).

CRISPR gRNA enrichment analysis

For Jurkat gRNA enrichment analysis, CRISPR gRNA enrichment reads were counted; dual gRNAs were paired in silico from paired-end reads; and a raw read counts per gRNA matrix was created using Python 3 (version 3.7.4). Downstream data analysis was done in R (version 3.6.1). gRNA pairs were filtered for pairs with the sum of raw read counts across all sorted populations > 300 reads. Reads were normalized per sample by dividing by the total reads per sample and scaling by 1 × 106 and log2 transformed with a pseudocount of 1. Fold change was calculated between each population versus the cytokine-negative population. z-scores were computed by centering and scaling relative to the mean and standard deviation of all NTC gRNAs. z-scores were used for the majority of further analyses. z-scores were calculated independently for the initial and validation IL2 locus screens. To ensure that the initial CRISPRai screens were benchmarked to positive control gRNAs with strong effects, the TSS gRNAs with the strongest effects were retained, and the following gRNAs were removed from the analysis: TSS.a.2 and TSS.i.2 (IL2) and TSS.a.1 and TSS.i.2 (IFNG). For the IL2 validation screen, the following gRNAs were removed from the analysis because they exhibited strong outlier behavior: NTC.a.1.val, TSS.a.3.val, TSS.i.1.val, or because it was not detected: E6.a.8.val.

Expected double gRNA enrichment was calculated by summing the log2FC gRNA enrichment z-scores of the corresponding single perturbations: log2FC z-score(single1) + log2FC z-score(single2). Residuals were calculated from the line of best fit between expected and observed double log2FC z-score. Pearson correlations were calculated using cor(). Perturbation strength was calculated through a second normalization step relative to TSS log2FC (E log2FC − TSS log2FC). log2FC difference was calculated through an alternate second normalization step relative to the average NTC log2FC (E log2FC − average NTC log2FC).

The genome-wide off-target analysis in Supplementary Table 7 was performed using the publicly available web tool from IDT (https://www.idtdna.com/site/order/designtool/index/CRISPR_SEQUENCE). For each targeting gRNA, the 20-bp 5′ of the PAM was uploaded in FASTA format to the web tool, generating a list of potential off-target sites genome wide, with associated metadata such as number of mismatches, genomic location and whether the off-target location overlaps a gene. We then intersected these results with the CRISPRi/a screening data of Schmidt et al.50 to evaluate whether any off-target sites overlapped genes known to impact IL2 or IFNG expression. For each identified potential off-target gene, we queried the Schmidt et al. screen hits to see if targeting that gene impacted expression of the relevant cytokine when targeted with the relevant guide type (that is, CRISPRi or CRISPRa). If the gene was a hit in any of the relevant conditions in Schmidt et al., we included the condition with the most significance (lowest false discovery rate (FDR)) into Supplementary Table 7. This analysis revealed that 14 of the 19,999 (0.07%) potential off-target sites analyzed overlapped with a gene that was a hit in a relevant condition of Schmidt et al., and 13 of our 204 targeting gRNAs (6.3%) contained at least one off-target site overlapping one of these genes. Thus, off-target overlap with coding genes is unlikely to play a major role in the observed efficacy of our gRNAs.

For histone ChIP-seq analysis, bigWig files containing ‘fold change over control’ were downloaded from ENCODE70,71. File accessions used were as follows: for activated T cells, ENCFF233LPC, ENCFF370YXG, ENCFF356ZKI, ENCFF704NYS, ENCFF741XLV, ENCFF158HYB, ENCFF232FZK, ENCFF206YVE, ENCFF336KWY, ENCFF164WIU, ENCFF060VND, ENCFF398QTX, ENCFF940OQY, ENCFF903VVJ, ENCFF356TWG, ENCFF248VJB, ENCFF690AHR, ENCFF243FBP, ENCFF624BMC and ENCFF352EYP and, for resting T cells, ENCFF906URN, ENCFF787PDH, ENCFF787LLC, ENCFF820GJE, ENCFF984ZEE, ENCFF829WQD, ENCFF055UPO, ENCFF459VQV, ENCFF041OBG, ENCFF543OQM, ENCFF863YFO, ENCFF896VDJ, ENCFF560YNU, ENCFF309ISK, ENCFF953MIX and ENCFF478JER. Regions overlapping each enhancer were used to estimate enhancer-specific histone signatures using GRanges and IRanges. For TF motif enrichment analysis, position frequency matrices (PFMs) were downloaded from JASPAR85:

JASPAR2022_CORE_vertebrates_non-redundant_pfms_jaspar.txt. TF motif score calculation in each enhancer was performed using matchMotifs() from ChromVar97 and motifmatchr102 using parameters genome = hg38, out = scores, bg = subject and p.cutoff = 5 × 10−5 and filtered for the top-scoring motifs.

For genome tracks, the following datasets were used. ABC model predictions used for tracks and all other ABC model analyses: AllPredictions.AvgHiC.ABC0.015.minus150.ForABCPaperV3.txt.gz45. For Jurkat cell type predicted enhancers, the ABC model uses Jurkat ATAC-seq and Jurkat H3K27ac ChIP-seq and mixed cell type Hi-C4. The following file accessions were downloaded from ENCODE70,71: H3K27ac-activated T cell ChIP-seq ENCFF370YXG; H3K27ac resting T cell ChIP-seq ENCFF787LLC; H3K4me3-activated T cell ChIP-seq ENCFF940OQY; H3K4me3 resting T cell ChIP-seq ENCFF863YFO; H3K4me1-activated T cell ChIP-seq ENCFF755MCS; H3K4me1 resting T cell ChIP-seq ENCFF041OBG; activated T cells DNase-seq ENCFF997BFO; and CTCF-activated T cell ChIP-seq ENCFF523IEI. H3K27ac resting Jurkat ChIP-seq (Gene Expression Omnibus (GEO) GSM1697882)41; BRD4 activated T cell ChIP-seq GSM5573170_Stim_BRD4.bw (GEO GSM5573170)103; JUNB and cFOS activated CD4 T cell ChIP-seq (GEO GSE116695; Sequence Read Archive (SRA) SRR7475866 and SRR7475865)86; RUNX1 resting Jurkat ChIP-seq (GEO GSM1697879)41; and resting Jurkat ATAC-seq (GEO GSM4130892)104. FASTQ files downloaded from the SRA were converted to bigWig files using Galaxy tools (https://galaxyproject.org/, version 22.05) and recommended pipelines105. Activated Jurkat ATAC-seq shown in tracks was generated for this manuscript using cells receiving NTC gRNAs.

For SRE score analysis, enhancer coordinates and SRE scores were downloaded from the Multiplexed CRISPRi EnhancerNet website (http://enhancer.stanford.edu/, not versioned)30 for the IL2 gene in Jurkat T cells. For the subset of enhancers shared between our screen and the SRE dataset, SRE score was plotted for all enhancer pairs. The following enhancers were shared between the CRISPRai screen and the SRE dataset: E4, E5, E7, E8 and E9.

For genetic interaction analysis, we took the following approach. (1) Calculate expected double perturbation log2FC z-scores by summing the values of the single perturbations. (2) Fit a linear model to the relationship between expected and observed log2FC z-scores for double perturbations. (3) Calculate the residual between the linear model and observed double perturbation log2FC z-score. (4) Determine significance by using two methods as described below. For method 1, we determined which RE pairs are outside 1 s.d. from the mean of residuals and required that this ‘hit’ be shared by all three replicates, which yielded three significant enhancer pairs. For method 2, we checked the normality of the residual z-scores using a Shapiro–Wilk normality test, which gave P = 0.17, P = 0.65 and P = 0.40 for Rep1, Rep2 and Rep3, respectively, indicating that these follow a normal distribution; assuming normality, we calculated P values for each residual z-score using pnorm() and took a cutoff of P < 0.05 as significant (without multiple hypothesis correction), which yielded six, five and three significant enhancer pairs for Rep1, Rep2 and Rep3, respectively. To take a stringent approach, we took only RE pairs that were called significant by both methods to be true significant hits, which yielded three RE pairs, as all pairs passing method 1 criteria also passed method 2 criteria.

Primary T cell gRNA enrichment data were analyzed as described above for Jurkat gRNA enrichment data. As sequencing depth was high for all gRNAs, no pseudocount was added.

All plots were generated in R using ggplot2 (version 3.3.2), ggpubr (version 0.2.4) and pheatmap (version 1.0.12). Genome tracks were generated using rtracklayer (version 1.46.0) and Gviz (version 1.30.0). In all box plots, statistical analysis was performed using stat_compare_means() from ggpubr. Statistical significance was performed using a two-sided Wilcoxon test using wilcox.test() unless otherwise noted. P values were corrected for multiple hypothesis testing using the Benjamini–Hochberg procedure where indicated.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.