6565 Fannin Street

Xuegong Zhang
Department of Automation, Tsinghua University
zhangxg@tsinghua.edu.cn 

Abstract
Large language models (LLMs) pretrained on massive data have become foundation models for a wide range of tasks in natural language understanding and beyond. Developing foundation models that decipher the “language of life” at the molecular, cellular and system levels to advance biological and medical research is promising yet challenging in many respects. Toward this goal, we have developed foundation models for single-cell transcriptomics using two approaches, which produced two large models, scFoundation and scMulan. After pretraining on tens of millions of human scRNA-seq profiles from a variety of healthy and diseased samples covering almost all known cell types and states, the models have shown evidence of capturing complex contextual relations among gene expression levels as well as features in the metadata. Experiments showed that the pretrained models can achieve state-of-the-art performance, either in a zero-shot manner or with light fine-tuning, on a diverse array of single-cell analysis tasks. This success opens the prospect of building cross-domain foundation models of life as “digital twins” or “AI patients” to empower biomedical research and future healthcare applications.

Brief Bio
Xuegong Zhang received his BS degree in Industry Automation in 1989 and his Ph.D. degree in Pattern Recognition and Machine Intelligence in 1994, both from Tsinghua University, after which he joined the faculty of Tsinghua University. He visited the Harvard School of Public Health in 2001-2002, and is now a Professor of Pattern Recognition and Bioinformatics in the Department of Automation, Tsinghua University, and an Adjunct Professor of the School of Life Sciences and the School of Medicine. He is an ISCB Fellow and a CAAI Fellow. He is also the chairman of the Committee of Bioinformatics and Artificial Life, Chinese Association of Artificial Intelligence, and the chairman of the Committee of Intelligent Health and Bioinformatics, Chinese Association of Automation. His major research interests include machine learning, bioinformatics, the human cell atlas, intelligent precision medicine, and foundation models of life.


Sponsored by The Ting Tsung and Wei Fong Chao BRAIN Center, Houston Methodist Cancer Center