t-SNE high-dimensional data visualization

Question

I have a Twitter corpus that I am using to build a sentiment analysis application. The corpus has 5k tweets, each labeled negative, neutral, or positive.

To represent the text, I use gensim word2vec pretrained vectors. Each word is represented as a 300-dimensional vector. For a tweet, I sum all of its word vectors to get a single 300-dimensional vector, so each tweet is mapped to one vector of size 300.
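A minimal sketch of the summing step described above, not the asker's actual code. The `toy_vectors` dictionary and the `tweet_vector` helper are hypothetical stand-ins: random vectors replace the pretrained word2vec model so that the example runs on its own. Skipping out-of-vocabulary words is also an assumption; the question does not say how they are handled.

```python
import numpy as np

DIM = 300  # dimensionality of the word vectors, as stated in the question

# Hypothetical stand-in for the pretrained embeddings: random vectors
# for a tiny vocabulary, used only to make the sketch self-contained.
rng = np.random.default_rng(0)
toy_vectors = {w: rng.normal(size=DIM) for w in ["good", "bad", "day"]}

def tweet_vector(tokens, vectors, dim=DIM):
    """Sum the vectors of all known tokens into one tweet vector."""
    vec = np.zeros(dim)
    for tok in tokens:
        if tok in vectors:  # assumption: unknown words are skipped
            vec += vectors[tok]
    return vec

v = tweet_vector(["good", "day", "unknownword"], toy_vectors)
print(v.shape)
```

A plain sum makes the vector's norm grow with tweet length; dividing by the number of matched tokens (averaging) is a common alternative when tweets vary a lot in length.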

I visualize the data using t-SNE (the tsne Python package). See attached image 1: red dots = negative tweets, blue dots = neutral tweets, and green dots = positive tweets.

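Question: the plot shows no clear separation (clusters) in my data. Is that because the same is true in the original 300-dimensional space?

i.e. if the points are not well clustered under t-SNE, does that mean they are not well clustered in the original space either?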

python scikit-learn machine-learning nlp data-analysis

Anuj Gupta 21 . '16 12:18

Farseer · Answer 1 · 2016-01-21T13:49:47+0000

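Question: the plot shows no clear separation (clusters) in my data. Is that because the same is true in the original 300-dimensional space?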

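No. Dimensionality reduction is lossy, so a low-dimensional (2D or 3D) plot does not have to preserve the structure of the original space.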

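To see how much can be lost, consider a concrete case:

Take PCA and reduce the 300 dimensions to, say, 10. The share of the information in the original 300 dimensions that is preserved in the 10 retained ones (the top 10 eigenvectors) is sum(top-10-eigenvalues)/sum(300-eigenvalues). That ratio tells you how "lossy" the reduction is.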
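The ratio sum(top-10-eigenvalues)/sum(300-eigenvalues) can be computed directly from the eigenvalues of the data's covariance matrix. The sketch below is illustrative only: `retained_variance` is a hypothetical helper name, and the random matrix `X` stands in for the real tweet vectors, which are not available here.

```python
import numpy as np

def retained_variance(X, k):
    """Fraction of total variance kept by the top-k principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # 300 x 300 covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigenvalues, largest first
    return eigvals[:k].sum() / eigvals.sum()

# Assumption: random data in place of the 5k real tweet vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 300))

ratio = retained_variance(X, 10)
print(f"variance retained by 10 components: {ratio:.3f}")
```

With scikit-learn, the same quantity should be available as `PCA(n_components=10).fit(X).explained_variance_ratio_.sum()`. A low ratio means that most of the structure of the original space is not visible in the reduced one.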
