I need to use a word2vec model that contains a large number of Chinese characters. My colleagues trained the model in Java and saved it as a .bin file.
I installed gensim and tried to load the model, but got an error:
In [1]: import gensim
In [2]: model = gensim.models.Word2Vec.load_word2vec_format('/data5/momo-projects/user_interest_classification/code/word2vec/vectors_groups_1105.bin', binary=True)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 96-97: unexpected end of data
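For what it's worth, I can reproduce the same "unexpected end of data" message without gensim by decoding a UTF-8 byte string that has been cut in the middle of a multi-byte character. This is only a minimal sketch using the standard library. The sample string and variable names are made up for illustration, and I am assuming (not certain) that something similar happens to one of the Chinese tokens in my file:

```python
# Illustration only: reproduces the error message with plain Python 3,
# independent of gensim and of the actual .bin file.
text = "中文"                   # two Chinese characters, 3 bytes each in UTF-8
raw = text.encode("utf-8")      # 6 bytes in total
truncated = raw[:-1]            # drop the last byte, cutting the second character

try:
    truncated.decode("utf-8")   # strict decoding, the default
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Lenient decoding silently drops the incomplete character instead of raising.
lenient = truncated.decode("utf-8", errors="ignore")

print(error_message)
print(lenient)
```

So the message itself seems to indicate an incomplete or invalid UTF-8 sequence in one of the words, rather than a problem with the vectors.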
I tried loading the model in both Python 2.7 and 3.5, and it failed in both. How can I load this model in gensim? Thanks.