Yang, F. et al. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat. Mach. Intell. 4, 852–866 (2022).
Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616–624 (2023).
Cui, H., Wang, C., Maan, H. & Wang, B. scGPT: towards building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).
Vaswani, A. et al. Attention is all you need. In Proc. 31st International Conference on Neural Information Processing Systems Vol. 30 (eds Guyon, I. et al.) 6000–6010 (Curran Associates, 2017).
OpenAI. New and improved embedding model. https://openai.com/blog/new-and-improved-embedding-model (2023).
OpenAI. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).
Chen, Q. et al. A comprehensive benchmark study on biomedical text generation and mining with ChatGPT. Preprint at bioRxiv https://doi.org/10.1101/2023.04.19.537463 (2023).
Biswas, S. S. Role of ChatGPT in public health. Ann. Biomed. Eng. 51, 868–869 (2023).
Ayers, J. W. et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern. Med. 183, 589–596 (2023).
Strong, E. et al. Chatbot vs medical student performance on free-response clinical reasoning examinations. JAMA Intern. Med. 183, 1028–1030 (2023).
Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint at https://arxiv.org/abs/2108.07258 (2021).
Connell, W., Khan, U. & Keiser, M. J. A single-cell gene expression language model. Preprint at https://arxiv.org/abs/2210.14330 (2022).
Chen, J. et al. Transformer for one stop interpretable cell type annotation. Nat. Commun. 14, 223 (2023).
Hao, M. et al. Large scale foundation model on single-cell transcriptomics. Nat. Methods 21, 1481–1491 (2024).
Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
Lotfollahi, M., Wolf, F. A. & Theis, F. J. scGen predicts single-cell perturbation responses. Nat. Methods 16, 715–721 (2019).
Clough, E. & Barrett, T. The Gene Expression Omnibus database. Methods Mol. Biol. 1418, 93–110 (2016).
Cellxgene Data Portal. https://cellxgene.cziscience.com/docs/08__Cite%20cellxgene%20in%20your%20publications (2023).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Vol. 1 (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).
Du, J. et al. Gene2vec: distributed representation of genes based on co-expression. BMC Genom. 20, 82 (2019).
Duong, D., Ahmad, W. U., Eskin, E., Chang, K.-W. & Li, J. J. Word and sentence embedding tools to measure semantic similarity of Gene Ontology terms by their definitions. J. Comput. Biol. 26, 38–52 (2019).
Chen, Q. et al. BioConceptVec: creating and evaluating literature-based biomedical concept embeddings on a large scale. PLoS Comput. Biol. 16, 1007617 (2020).
Hou, W. & Ji, Z. Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis. Nat. Methods 21, 1462–1465 (2024).
Wysocki, O. et al. Transformers and the representation of biomedical background knowledge. Comput. Linguist. 49, 73–115 (2023).
Ye, R., Zhang, C., Wang, R., Xu, S. & Zhang, Y. Natural language is all a graph needs. In Findings of the Association for Computational Linguistics: EACL 2024 (eds Graham, Y. & Purver, M.) 1955–1973 (Association for Computational Linguistics, 2024).
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 47, 23–28 (2019).
Levine, D. et al. Cell2Sentence: teaching large language models the language of biology. In Proc. 41st International Conference on Machine Learning (ICML 2024) (PMLR, 2024).
Brown, G. R. et al. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res. 43, 36–42 (2015).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Bruford, E. A. et al. Guidelines for human gene nomenclature. Nat. Genet. 52, 754–758 (2020).
Microsoft Research AI4Science & Microsoft Azure Quantum. The impact of large language models on scientific discovery: a preliminary study using GPT-4. Preprint at https://arxiv.org/abs/2311.07361 (2023).
Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://arxiv.org/abs/2302.13971 (2023).
Chaffin, M. et al. Single-nucleus profiling of human dilated and hypertrophic cardiomyopathy. Nature 608, 174–180 (2022).
He, B. et al. CloudPred: predicting patient phenotypes from single-cell RNA-seq. In Proc. Pacific Symposium on Biocomputing 2022 337–348 (2021).
Marian, A. J. & Braunwald, E. Hypertrophic cardiomyopathy: genetics, pathogenesis, clinical manifestations, diagnosis, and therapy. Circ. Res. 121, 749–770 (2017).
Son, M., Kim, S. J. & Diamond, B. SLE-associated risk factors affect DC function. Immunol. Rev. 269, 100–117 (2016).
Li, Y. et al. Single-cell transcriptome analysis reveals dynamic cell populations and differential gene expression patterns in control and aneurysmal human aortic tissue. Circulation 142, 1374–1388 (2020).
Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, 2016239118 (2021).
Visscher, P. M., Brown, M. A., McCarthy, M. I. & Yang, J. Five years of GWAS discovery. Am. J. Hum. Genet. 90, 7–24 (2012).
Lubiana, T. et al. Ten quick tips for harnessing the power of ChatGPT in computational biology. PLoS Comput. Biol. 19, 1011319 (2023).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 1–5 (2018).
Pliner, H. A., Shendure, J. & Trapnell, C. Supervised classification enables rapid annotation of cell atlases. Nat. Methods 16, 983–986 (2019).
Pasquini, G., Arias, J. E. R., Schäfer, P. & Busskamp, V. Automated methods for cell type annotation on scRNA-seq data. Comput. Struct. Biotechnol. J. 19, 961–969 (2021).
Traag, V. A., Waltman, L. & Van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).
Welcome to MyGene.py’s documentation!—MyGene.py v3.1.0 documentation. https://docs.mygene.info/projects/mygene-py/en/latest/ (2023).
Seal, R. L. et al. Genenames.org: the HGNC resources in 2023. Nucleic Acids Res. 51, 1003–1009 (2023).
Yasunaga, M., Leskovec, J. & Liang, P. LinkBERT: pretraining language models with document links. In Proc. 60th Annual Meeting of the Association for Computational Linguistics Vol. 1 (eds Muresan, S. et al.) 8003–8016 (Association for Computational Linguistics, 2022).
Luck, K. et al. A reference map of the human binary protein interactome. Nature 580, 402–408 (2020).
Rolland, T. et al. A proteome-scale map of the human interactome network. Cell 159, 1212–1226 (2014).
Greene, C. S. et al. Understanding multicellular function and disease with human tissue-specific networks. Nat. Genet. 47, 569–576 (2015).
UniProt Consortium. UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 51, 523–531 (2023).
Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19, 41–50 (2022).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 10008 (2008).
Alsaigh, T., Evans, D., Frankel, D. & Torkamani, A. Decoding the transcriptome of calcified atherosclerotic plaque at single-cell resolution. Commun. Biol. 5, 1084 (2022).
Chou, C.-H. et al. Synovial cell cross-talk with cartilage plays a major role in the pathogenesis of osteoarthritis. Sci. Rep. 10, 10868 (2020).
Cheng, S. et al. A pan-cancer single-cell transcriptional atlas of tumor infiltrating myeloid cells. Cell 184, 792–809.e23 (2021).
Schirmer, L. et al. Neuronal vulnerability and multilineage diversity in multiple sclerosis. Nature 573, 75–82 (2019).
Subramaniam, M. Implementing and Applying Multiplexed Single Cell RNA-sequencing to Reveal Context-specific Effects in Systemic Lupus Erythematosus. PhD thesis, UCSF (2019).
- Source: https://www.nature.com/articles/s41551-024-01284-6