Hi, thanks for your response. I am using t-SNE to visualize sets of contextualized word embeddings generated by BERT encoders. Each dataset contains approximately 100,000 vectors of 768 dimensions each; I am using t-SNE to reduce the datasets to 2 dimensions so that I can visualize the data and see its clusters and outliers. Regarding running time requirements: I'm hoping for a solution that can generate the t-SNE result for a given dataset within a minute or so, or at most within 5-10 minutes, given 30 parallel threads.
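For concreteness, the reduction described above could be sketched roughly as follows. This is only an illustration, assuming scikit-learn and NumPy; the function name `reduce_to_2d`, the intermediate PCA step down to 50 dimensions, and the random demo data are all hypothetical choices made for the example, not part of the actual pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def reduce_to_2d(embeddings, n_threads=30, pca_dims=50, seed=0):
    """Project (n_samples, 768) embeddings to 2-D with t-SNE.

    Hypothetical helper: the PCA pre-reduction is an assumption made
    here to cut the cost of the neighbor search, not a requirement.
    """
    # PCA cannot keep more components than min(n_samples, n_features).
    pca_dims = min(pca_dims, embeddings.shape[0], embeddings.shape[1])
    compact = PCA(n_components=pca_dims, random_state=seed).fit_transform(embeddings)

    # n_jobs sets the number of parallel jobs used for the neighbor search.
    tsne = TSNE(n_components=2, init="pca", n_jobs=n_threads, random_state=seed)
    return tsne.fit_transform(compact)


if __name__ == "__main__":
    # Small random stand-in for the real data (~100,000 x 768 BERT vectors).
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(500, 768)).astype(np.float32)
    points = reduce_to_2d(demo, n_threads=2)
    print(points.shape)
```

Whether something like this fits the time budget at 100,000 vectors is exactly the open question; the sketch is only meant to pin down the input shape, output shape, and thread count I have in mind.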