Extending the discussion with @user1735003, let's look at both ways of representing numbers:
- Treating the number as a string, i.e. as just another word, and assigning it an identifier when building the vocabulary; or
- Converting numbers to actual words: "1" becomes "one", "2" becomes "two", etc.
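To make the two options concrete, here is a minimal sketch. The sample sentence, the `DIGIT_WORDS` table (single digits only), and the helper names are all hypothetical and only serve as illustration; real preprocessing would need to handle multi-digit numbers as well.

```python
# Hypothetical lookup table for illustration; covers single digits only.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}

def keep_numbers_as_tokens(tokens):
    """Option 1: leave numeric tokens untouched, like any other word."""
    return list(tokens)

def spell_out_digits(tokens):
    """Option 2: replace single-digit tokens with their English word."""
    return [DIGIT_WORDS.get(tok, tok) for tok in tokens]

def build_vocab(tokens):
    """Assign each distinct token an integer identifier in order of first use."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

tokens = "i bought 2 apples and two pears".split()

vocab_raw = build_vocab(keep_numbers_as_tokens(tokens))
vocab_spelled = build_vocab(spell_out_digits(tokens))

print(vocab_raw)      # "2" and "two" keep separate identifiers
print(vocab_spelled)  # "2" has been merged into "two"
```

With option 1, "2" and "two" remain distinct vocabulary entries, so each keeps its own context; with option 2 they are merged into one entry.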
Does the second approach change the context? To test this, we can measure the similarity between the two representations using word2vec. The scores will be high if the two forms appear in similar contexts.
For example, "1" and "one" have a similarity of 0.17, and "2" and "two" have a similarity of 0.23. These low scores suggest that the contexts in which the two forms are used are very different.
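The similarity reported by word2vec tools is normally the cosine similarity between the two word vectors. A minimal sketch of that computation follows; the three-dimensional vectors are made up purely to exercise the function and are not real embeddings.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up vectors for illustration only (not taken from any trained model).
vec_a = [1.0, 0.0, 0.0]
vec_b = [0.0, 1.0, 0.0]  # orthogonal to vec_a
vec_c = [2.0, 0.0, 0.0]  # same direction as vec_a, different length

print(cosine_similarity(vec_a, vec_b))  # orthogonal vectors
print(cosine_similarity(vec_a, vec_c))  # parallel vectors
```

Assuming a trained model loaded with gensim, the equivalent check on real embeddings would be a call such as `KeyedVectors.similarity("1", "one")`.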
If you treat numbers as just another word, you do not change their context; if you apply any other transformation to them, you cannot guarantee that the result is better. It is therefore better to leave numbers untouched and treat them as ordinary words.
Note: both word2vec and GloVe were trained treating numbers as strings (case 1).