This is a classic paper from the journal Science. It introduces the use of geodesic distance in manifold dimensionality reduction.
Scientists working with large volumes of high-dimensional data, such as global
climate patterns, stellar spectra, or human gene distributions, regularly confront
the problem of dimensionality reduction: finding meaningful low-dimensional
structures hidden in their high-dimensional observations. The human
brain confronts the same problem in everyday perception, extracting from its
high-dimensional sensory inputs (30,000 auditory nerve fibers or 10^6 optic
nerve fibers) a manageably small number of perceptually relevant features.
Here we describe an approach to solving dimensionality reduction problems
that uses easily measured local metric information to learn the underlying
global geometry of a data set. Unlike classical techniques such as principal
component analysis (PCA) and multidimensional scaling (MDS), our approach
is capable of discovering the nonlinear degrees of freedom that underlie complex
natural observations, such as human handwriting or images of a face under
different viewing conditions. In contrast to previous algorithms for nonlinear
dimensionality reduction, ours efficiently computes a globally optimal solution,
and, for an important class of data manifolds, is guaranteed to converge
asymptotically to the true structure.
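
The idea of building global geometry from local metric information can be illustrated with a minimal sketch. The abstract does not spell out the steps, so the sketch assumes the usual three-stage reading of this approach: connect each point to its nearest neighbors, approximate geodesic distances by shortest paths through that graph, then apply classical MDS to the resulting distances. The function name `isomap_sketch` and the parameters `n_neighbors` and `n_components` are hypothetical, chosen for illustration only, and the dense Floyd-Warshall loop is a simple stand-in that only suits small data sets.

```python
import numpy as np

def isomap_sketch(X, n_neighbors=6, n_components=2):
    """Illustrative geodesic-distance embedding (assumed steps, not the paper's code)."""
    n = X.shape[0]

    # Local metric information: pairwise Euclidean distances in input space.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Neighborhood graph: keep only edges to each point's nearest neighbors.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    G[rows, cols] = D[rows, cols]
    G[cols, rows] = D[rows, cols]  # make the graph undirected

    # Global geometry: shortest-path (approximate geodesic) distances, Floyd-Warshall.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    if not np.isfinite(G).all():
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")

    # Classical MDS on the geodesic distances: double-center, then eigendecompose.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_components]
    w_top = np.clip(w[order], 0.0, None)  # guard against tiny negative eigenvalues
    Y = V[:, order] * np.sqrt(w_top)
    return Y, G

# Usage: points on a half circle, a one-dimensional manifold embedded in 2-D.
t = np.linspace(0, np.pi, 50)
X = np.column_stack([np.cos(t), np.sin(t)])
Y, G = isomap_sketch(X, n_neighbors=2, n_components=1)
print(G[0, -1])  # graph distance between the two ends of the arc
```

In this toy example the two endpoints are 2 apart in a straight line, while the graph distance follows the arc and comes out close to pi, which is the kind of nonlinear structure that PCA and MDS on raw Euclidean distances would miss. For real data, a maintained implementation such as `sklearn.manifold.Isomap` is likely a better starting point than this sketch.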