2021-07-13

As data scientists, we often work with high-dimensional data: datasets with more than three features, or dimensions, of interest. In supervised machine learning, we may use such data for training and classification, and we may reduce the number of dimensions to speed up training. In unsupervised learning, we use this type of data for visualization and clustering. In single-cell RNA sequencing (scRNA-seq), for example, we accumulate measurements of tens of thousands of genes per cell for upwards of a million cells. That’s a lot of data, and it provides a window into each cell’s identity, state, and other properties. More importantly, these properties place each cell in relation to the myriad other cells in the dataset. The result, however, is a massive matrix of, say, 1,000,000 cells by 10,000 genes, with each gene representing a dimension or axis. How do we interpret such data? As humans in a 3-D world, we cannot see anything beyond three physical dimensions, so we need a way to capture the essence of datasets like these without losing anything of value.
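To get a feel for the scale, here is a minimal back-of-the-envelope sketch in Python. It takes the 1,000,000 by 10,000 matrix mentioned above and assumes each value is stored densely as a 4-byte number, which is an illustrative assumption rather than how any particular scRNA-seq tool stores its data. The `dense_size_gb` helper and the toy matrix sizes are hypothetical names and values chosen for this example.

```python
import random

# Dimensions quoted in the text: 1,000,000 cells (rows) by 10,000 genes (columns).
n_cells, n_genes = 1_000_000, 10_000


def dense_size_gb(rows: int, cols: int, bytes_per_value: int = 4) -> float:
    """Size in gigabytes (10^9 bytes) of a dense rows x cols matrix."""
    return rows * cols * bytes_per_value / 1e9


n_entries = n_cells * n_genes
full_size_gb = dense_size_gb(n_cells, n_genes)  # assumes 4 bytes per value

# Toy stand-in with the same layout (rows = cells, columns = genes).
# The sizes and random counts are made up purely for illustration.
random.seed(0)
toy_cells, toy_genes = 5, 8
toy = [[random.randint(0, 5) for _ in range(toy_genes)] for _ in range(toy_cells)]

# One row is one cell: a single point in an 8-dimensional "gene space".
first_cell = toy[0]

print(f"{n_entries:,} entries, roughly {full_size_gb:.0f} GB if stored densely")
print(f"toy matrix: {len(toy)} cells x {len(toy[0])} genes")
```

Each row is a cell and each column is a gene, so every cell is a single point in a space with as many axes as there are genes. The toy matrix has only 8 axes, which is already more than we can plot directly.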