Two datasets are available: mnist_embeddings_8d contains 8-dimensional embedding vectors and mnist_embeddings_32d contains 32-dimensional embedding vectors.
The neural network that produced these embeddings was built with TensorFlow (Abadi et al., 2016), using a variation of the code in this example: https://www.tensorflow.org/addons/tutorials/losses_triplet
Format
mnist_embeddings_32d: an object of class tbl_df (inherits from tbl, data.frame) with 10000 rows and 34 columns.
mnist_embeddings_8d: an object of class tbl_df (inherits from tbl, data.frame) with 10000 rows and 10 columns.
Details
A data frame with 10,000 rows and p + 2 columns, where p is the embedding dimension (8 or 32):
id: sequential ID or row number of the image
label: the digit 0, 1, ..., 9
X1, ..., Xp: elements 1 to p of the embedding vector
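The column layout above can be sketched in base R as follows. This is only an illustration: `mock` is a hypothetical stand-in filled with random numbers that mimics the layout of mnist_embeddings_8d (id, label, X1 to X8); it is not the real data, and the object names are assumptions made for the example.

```r
# Hypothetical stand-in with the same column layout as mnist_embeddings_8d.
# The values are random, NOT the real embeddings.
set.seed(1)
p <- 8       # embedding dimension (8 or 32)
n <- 10000   # number of rows

emb <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(emb) <- paste0("X", seq_len(p))

mock <- data.frame(
  id    = seq_len(n),                       # sequential row ID
  label = sample(0:9, n, replace = TRUE),   # digit 0, 1, ..., 9
  emb                                       # X1, ..., Xp
)

# Select the embedding columns by name and convert them to a numeric matrix.
embedding_cols <- paste0("X", seq_len(p))
X <- as.matrix(mock[, embedding_cols])

# Mean embedding vector for each digit label.
centroids <- aggregate(mock[embedding_cols],
                       by = list(label = mock$label),
                       FUN = mean)
```

With the real datasets, replacing `mock` by mnist_embeddings_8d (or by mnist_embeddings_32d with `p <- 32`) should work the same way, assuming the columns are named exactly as described above.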
References
LeCun, Y (1998). The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/.
Abadi, M, P Barham, J Chen, Z Chen, A Davis, J Dean, M Devin, S Ghemawat, G Irving, M Isard, et al. (2016). TensorFlow: A System for Large-Scale Machine Learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283.