Title: | Multiple Maps t-SNE |
---|---|
Description: | An implementation of multiple maps t-distributed stochastic neighbor embedding (t-SNE). Multiple maps t-SNE is a method for projecting high-dimensional data into several low-dimensional maps such that non-metric space properties are better preserved than they would be by a single map. Multiple maps t-SNE with only one map is equivalent to standard t-SNE. When projecting onto more than one map, multiple maps t-SNE estimates a set of latent weights that allow each point to contribute to one or more maps depending on similarity relationships in the original data. This implementation is a port of the original 'Matlab' library by Laurens van der Maaten. See Van der Maaten and Hinton (2012) <doi:10.1007/s10994-011-5273-4>. This material is based upon work supported by the United States Air Force and Defense Advanced Research Projects Agency (DARPA) under Contract No. FA8750-17-C-0020. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force and Defense Advanced Research Projects Agency. Distribution Statement A: Approved for Public Release; Distribution Unlimited. |
Authors: | Benjamin J. Radford |
Maintainer: | Benjamin J. Radford <[email protected]> |
License: | FreeBSD | file LICENSE |
Version: | 0.1.0 |
Built: | 2025-02-25 03:59:29 UTC |
Source: | https://github.com/cran/mmtsne |
`hbeta` returns the perplexity and probability values for a row of data `D`.
hbeta(D, beta = 1)
D | A distance vector. |
beta | A constant scalar. |
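As a rough illustration of what a perplexity computation over one row of distances can look like, the base R sketch below follows the Gaussian-kernel entropy calculation used in the reference t-SNE code. The function name `hbeta_sketch` and the names of the returned list elements are hypothetical; the package's own `hbeta` may organize its output differently.

```r
# A minimal sketch, not the package's implementation.
# D is a vector of (squared) distances from one point to its neighbours;
# beta is the precision of the Gaussian kernel.
hbeta_sketch <- function(D, beta = 1) {
  P <- exp(-D * beta)                          # unnormalized affinities
  sum_P <- sum(P)
  H <- log(sum_P) + beta * sum(D * P) / sum_P  # Shannon entropy in nats
  list(H = H,                                  # entropy
       perplexity = exp(H),                    # perplexity = exp(entropy)
       P = P / sum_P)                          # normalized probabilities
}

# Four equidistant neighbours: every neighbour is equally likely
out <- hbeta_sketch(c(1, 1, 1, 1), beta = 1)
print(out$perplexity)
print(out$P)
```

With equidistant neighbours the probabilities are uniform, so the perplexity equals the number of neighbours. This is a quick sanity check for any implementation of the calculation.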
`mmtsne` estimates a multiple maps t-distributed stochastic neighbor embedding (multiple maps t-SNE) model.
mmtsne(X, no_maps = 1, no_dims = 2, perplexity = 30, max_iter = 500, momentum = 0.5, final_momentum = 0.8, mom_switch_iter = 250, eps = 1e-07)
X | A dataframe or matrix of N rows and D columns. |
no_maps | The number of maps (positive whole number) to be estimated. |
no_dims | The number of dimensions per map. Typical values are 2 or 3. |
perplexity | The target perplexity for probability matrix construction. Commonly recommended values range from 5 to 30. Perplexity roughly corresponds to the expected number of neighbors per data point. |
max_iter | The number of iterations to run. |
momentum | Constant scaling factor for update momentum in gradient descent algorithm. |
final_momentum | Constant scaling factor for update momentum in gradient descent algorithm after the momentum switch point. |
mom_switch_iter | The iteration at which momentum switches from momentum to final_momentum. |
eps | A small positive value near zero. |
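To make the roles of `momentum`, `final_momentum`, and `mom_switch_iter` more concrete, here is a toy sketch of gradient descent with a momentum switch. It minimizes a simple quadratic, not the multiple maps t-SNE cost. The function name, the `step` argument, and the exact switching rule (`iter < mom_switch_iter`) are assumptions for illustration and are not taken from the package source.

```r
# Toy objective (assumed for illustration): f(y) = sum((y - target)^2)
target <- c(1, -2)
grad_f <- function(y) 2 * (y - target)

momentum_descent_sketch <- function(y, max_iter = 500, momentum = 0.5,
                                    final_momentum = 0.8,
                                    mom_switch_iter = 250,
                                    step = 0.1) {  # 'step' is hypothetical
  velocity <- numeric(length(y))
  for (iter in seq_len(max_iter)) {
    # Assumed rule: 'momentum' before the switch point,
    # 'final_momentum' from the switch point onward
    m <- if (iter < mom_switch_iter) momentum else final_momentum
    velocity <- m * velocity - step * grad_f(y)  # blend old update with new gradient
    y <- y + velocity
  }
  y
}

y_hat <- momentum_descent_sketch(c(0, 0))
print(y_hat)
```

A smaller momentum early on keeps the first updates cautious. The larger final momentum lets later updates carry more of the previous direction.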
`mmtsne` is a wrapper that performs multiple maps t-SNE on an input dataset, `X`. The function pre-processes `X`, an N by D matrix or dataframe, then calls `mmtsneP`. The pre-processing steps include calls to `x2p` and `p2sp` to convert `X` into an N by N symmetrical joint probability matrix.
The `mmtsneP` code is an almost direct port of the original multiple maps t-SNE Matlab code by van der Maaten and Hinton (2012). `mmtsne` estimates a multidimensional array of dimension `N x no_dims x no_maps`. Each map is an `N x no_dims` matrix of estimated t-SNE coordinates. When `no_maps=1`, multiple maps t-SNE reduces to standard t-SNE.
A list that includes the following objects:
Y | An `N x no_dims x no_maps` array of predicted coordinates. |
weights | An `N x no_maps` matrix of unscaled weights. A high weight on entry i, j indicates a greater contribution of point i on map j. |
proportions | An `N x no_maps` matrix of scaled weights. A high weight on entry i, j indicates a greater contribution of point i on map j. |
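The shapes described above can be explored without fitting a model. The sketch below builds a mock list with the documented dimensions, using the element names `Y` and `proportions` that appear in the package example. All values are placeholders rather than real output. It then pulls out one map and finds, for each point, the map on which it has the largest scaled weight.

```r
# Mock result with the documented shapes (placeholder values, not real output)
N <- 4; no_dims <- 2; no_maps <- 2
mock <- list(
  Y = array(seq_len(N * no_dims * no_maps), dim = c(N, no_dims, no_maps)),
  proportions = matrix(c(0.9, 0.2, 0.5, 0.1,   # column 1: weights on map 1
                         0.1, 0.8, 0.5, 0.9),  # column 2: weights on map 2
                       nrow = N, ncol = no_maps)
)

# Coordinates of all points in the second map: an N x no_dims matrix
map2_coords <- mock$Y[, , 2]

# For each point, the index of the map with the largest scaled weight
# (ties are resolved in favour of the lower map index)
dominant_map <- max.col(mock$proportions, ties.method = "first")

print(dim(map2_coords))
print(dominant_map)
```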
L.J.P. van der Maaten and G.E. Hinton. “Visualizing Non-Metric Similarities in Multiple Maps.” Machine Learning 87(1):33-55, 2012.
# Load the package and the iris dataset
library(mmtsne)
data("iris")

# Estimate a mmtsne model with 2 maps, 2 dimensions each
model <- mmtsne(iris[,1:4], no_maps=2, max_iter=100)

# Plot the results side-by-side for inspection
# Points scaled by map proportion weights plus constant factor
par(mfrow=c(1,2))
plot(model$Y[,,1], col=iris$Species, cex=model$proportions[,1] + .2)
plot(model$Y[,,2], col=iris$Species, cex=model$proportions[,2] + .2)
par(mfrow=c(1,1))
`mmtsneP` estimates a multiple maps t-distributed stochastic neighbor embedding (multiple maps t-SNE) model.
mmtsneP(P, no_maps, no_dims = 2, max_iter = 500, momentum = 0.5, final_momentum = 0.8, mom_switch_iter = 250, eps = 1e-07)
P | An N by N symmetrical joint probability matrix. |
no_maps | The number of maps (positive whole number) to be estimated. |
no_dims | The number of dimensions per map. Typical values are 2 or 3. |
max_iter | The number of iterations to run. |
momentum | Constant scaling factor for update momentum in gradient descent algorithm. |
final_momentum | Constant scaling factor for update momentum in gradient descent algorithm after the momentum switch point. |
mom_switch_iter | The iteration at which momentum switches from momentum to final_momentum. |
eps | A small positive value near zero. |
This code is an almost direct port of the original multiple maps t-SNE Matlab code by van der Maaten and Hinton (2012). `mmtsneP` estimates a multidimensional array of dimension `N x no_dims x no_maps`. Each map is an `N x no_dims` matrix of estimated t-SNE coordinates. When `no_maps=1`, multiple maps t-SNE reduces to standard t-SNE.
A list that includes the following objects:
Y | An `N x no_dims x no_maps` array of predicted coordinates. |
weights | An `N x no_maps` matrix of unscaled weights. A high weight on entry i, j indicates a greater contribution of point i on map j. |
proportions | An `N x no_maps` matrix of scaled weights. A high weight on entry i, j indicates a greater contribution of point i on map j. |
L.J.P. van der Maaten and G.E. Hinton. “Visualizing Non-Metric Similarities in Multiple Maps.” Machine Learning 87(1):33-55, 2012.
# Load the package and the iris dataset
library(mmtsne)
data("iris")

# Produce a symmetric joint probability matrix
prob_matrix <- p2sp(x2p(as.matrix(iris[,1:4])))

# Estimate a mmtsne model with 2 maps, 2 dimensions each
model <- mmtsneP(prob_matrix, no_maps=2, max_iter=100)

# Plot the results side-by-side for inspection
# Points scaled by map proportion weights plus constant factor
par(mfrow=c(1,2))
plot(model$Y[,,1], col=iris$Species, cex=model$proportions[,1] + 0.2)
plot(model$Y[,,2], col=iris$Species, cex=model$proportions[,2] + 0.2)
par(mfrow=c(1,1))
`p2sp` returns a symmetrical pair-wise joint probability matrix given an input probability matrix `P`.
p2sp(P)
P | An N x N probability matrix. |
An `N x N` symmetrical matrix of pair-wise probabilities.
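For intuition, the usual t-SNE way of turning a conditional probability matrix into a symmetric joint one is to add the matrix to its transpose and renormalize. The base R sketch below shows that idea under the assumption that `p2sp` does something equivalent. The function name `p2sp_sketch` is hypothetical, and the package may differ in details such as the normalization constant or the handling of very small values.

```r
# A minimal sketch of t-SNE style symmetrization (not the package's code)
p2sp_sketch <- function(P) {
  S <- P + t(P)   # symmetric: S[i, j] == S[j, i]
  S / sum(S)      # renormalize so that all entries sum to 1
}

# Placeholder conditional probabilities: each row sums to 1, diagonal is 0
P_cond <- matrix(c(0.0, 0.7, 0.3,
                   0.4, 0.0, 0.6,
                   0.5, 0.5, 0.0), nrow = 3, byrow = TRUE)

P_joint <- p2sp_sketch(P_cond)
print(isSymmetric(P_joint))
print(sum(P_joint))
```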
`x2p` returns a pair-wise conditional probability matrix given an input matrix `X`.
x2p(X, perplexity = 30, tol = 1e-05)
X | A data matrix with N rows. |
perplexity | The target perplexity. Values between 5 and 50 are generally considered appropriate. This loosely translates into the expected number of neighbors per point. |
tol | A small positive value. |
This function is an almost direct port of the original Python implementation by van der Maaten and Hinton (2008). It uses a binary search to estimate probability values for all pair-wise elements of `X`. The search is tuned so that the conditional Gaussian distributions all have approximately the same perplexity.
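The binary search mentioned above can be sketched for a single row of distances as follows. The code mirrors the general scheme of the reference t-SNE implementation: double or halve the kernel precision until the target is bracketed, then bisect. The helper names, the `max_tries` cap, and the returned list are assumptions made for illustration, so this is not the package's own code.

```r
# Hypothetical helper: entropy (nats) and probabilities for one row of distances
row_entropy <- function(d, beta) {
  p <- exp(-d * beta)
  sum_p <- sum(p)
  list(H = log(sum_p) + beta * sum(d * p) / sum_p, P = p / sum_p)
}

# Search for the precision 'beta' whose entropy matches log(perplexity)
find_beta_sketch <- function(d, perplexity = 30, tol = 1e-5, max_tries = 50) {
  target <- log(perplexity)
  beta <- 1; beta_min <- -Inf; beta_max <- Inf
  res <- row_entropy(d, beta)
  tries <- 0
  while (abs(res$H - target) > tol && tries < max_tries) {
    if (res$H > target) {
      # Distribution too flat: raise the precision
      beta_min <- beta
      beta <- if (is.infinite(beta_max)) beta * 2 else (beta + beta_max) / 2
    } else {
      # Distribution too peaked: lower the precision
      beta_max <- beta
      beta <- if (is.infinite(beta_min)) beta / 2 else (beta + beta_min) / 2
    }
    res <- row_entropy(d, beta)
    tries <- tries + 1
  }
  list(beta = beta, P = res$P, perplexity = exp(res$H))
}

# Placeholder distances from one point to 20 neighbours
d <- seq(0.1, 2, length.out = 20)
fit <- find_beta_sketch(d, perplexity = 5)
print(fit$perplexity)
```

Repeating this search for every row, with the point's distance to itself excluded, would give one conditional distribution per point, all with roughly the same perplexity.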
An `N x N` matrix of pair-wise probabilities.
L.J.P. van der Maaten and G.E. Hinton. “Visualizing High-Dimensional Data Using t-SNE.” Journal of Machine Learning Research 9(Nov):2579-2605, 2008.