Lecture or Panel
Title: Modern Nonlinear Embedding Methods Unpacked: Empowering Biological Discoveries with Statistical Insights
Abstract: Learning and representing low-dimensional structures from noisy, high-dimensional data is a cornerstone of modern data science. Stochastic neighbor embedding algorithms, a family of nonlinear dimensionality reduction and data visualization methods, with t-SNE and UMAP as two leading examples, have become very popular in recent years. Yet despite their wide application, these methods remain subject to debate over their limited theoretical understanding, ambiguous interpretations, and sensitivity to tuning parameters. In this talk, I will present our recent efforts to decipher and improve these nonlinear embedding approaches. Our key results include a rigorous theoretical framework that uncovers the intrinsic mechanisms, large-sample limits, and fundamental principles underlying these algorithms; a set of theory-informed practical guidelines for their principled use in trustworthy biological discovery; and a collection of new algorithms that address current limitations and improve performance in areas such as bias reduction and stability. Throughout the talk, I will highlight how these advances not only deepen our theoretical understanding but also open new avenues for scientific discovery.
Rong Ma is an Assistant Professor of Biostatistics at the Harvard T.H. Chan School of Public Health. His current research focuses on (i) statistical inference for large random matrices and high-dimensional models, (ii) the theoretical and computational underpinnings of modern nonlinear embedding techniques and manifold learning algorithms, and (iii) the development of principled and interpretable machine learning methods for the biomedical sciences, especially single-cell integrative genomics and multiomics.
If you would like to join via Zoom, please register here.
All seminars are held at 12:05 PM, both onsite and via Zoom. View all seminar information here.