Answer
A word embedding is simply a mapping of words to vectors. The dimension of the word embedding refers to the length of these vectors.
Additional Information
These mappings come in many formats. Most pre-trained embeddings are distributed as a space-separated text file, where each line contains the word in the first position followed by its vector representation. If you split these lines, you will find that they have a length of 1 + dim, where dim is the dimensionality of the word vectors and 1 corresponds to the word being represented. See the GloVe pre-trained vectors for a real example.
For example, if you download glove.twitter.27B.zip, unzip it, and run the following Python code:
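(A minimal sketch of such code, assuming the 50-dimensional file glove.twitter.27B.50d.txt from that archive; the 51 values per line in the output below point to that file, and the lookup of the word "people" matches the word shown there.)

```python
# Assumption: the 50-dimensional file from glove.twitter.27B.zip.
with open('glove.twitter.27B.50d.txt', encoding='utf-8') as f:
    lines = [line.rstrip().split(' ') for line in f]

# Find the line for the word "people", which appears in the output below.
i = next(idx for idx, line in enumerate(lines) if line[0] == 'people')

# Vocabulary size, length of one line (1 + dim), the word itself, and its vector.
print(len(lines), len(lines[i]), lines[i][0], lines[i][1:])
```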
you will get output like this:
1193514 51 people ['1.4653', '0.4827', ..., '-0.10117', '0.077996']
Somewhat unrelated, but no less important: the lines in these files are sorted by the frequency of the words in the corpus on which the embeddings were trained (most frequent words first).
You can also represent these embeddings as a dictionary, where the keys are words and the values are lists representing the word vectors. The length of these lists is the dimension of your word vectors.
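A minimal sketch of that dictionary representation, assuming the same GloVe file as above (the helper name load_embeddings_as_dict is only an illustration):

```python
import numpy as np

def load_embeddings_as_dict(path):
    """Map each word to its vector; every vector has length dim."""
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embeddings

# embeddings = load_embeddings_as_dict('glove.twitter.27B.50d.txt')
# len(embeddings['people'])  # -> 50, i.e. the dimension of the vectors
```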
A more common practice is to represent them as a matrix (also called a lookup table) of dimensions (V x D), where V is the size of the vocabulary (that is, how many words you have) and D is the dimension of each word vector. In this case, you need to keep a separate dictionary that maps each word to its corresponding row in the matrix.
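A sketch of that lookup-table representation, again with illustrative names: the matrix has shape (V x D) and the dictionary maps each word to its row.

```python
import numpy as np

def load_embeddings_as_matrix(path):
    """Return a word -> row-index dict and a (V x D) matrix of vectors."""
    word_to_index, rows = {}, []
    with open(path, encoding='utf-8') as f:
        for row, line in enumerate(f):
            parts = line.rstrip().split(' ')
            word_to_index[parts[0]] = row
            rows.append(np.asarray(parts[1:], dtype=np.float32))
    return word_to_index, np.stack(rows)

# word_to_index, matrix = load_embeddings_as_matrix('glove.twitter.27B.50d.txt')
# matrix.shape                     # -> (V, D)
# matrix[word_to_index['people']]  # the vector for "people"
```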
Background
As for your question about the role that the dimension plays, you will need some theoretical background. But in a few words, the space in which words are embedded has nice properties that make NLP systems work better. One of these properties is that words with similar meanings are spatially close to each other, that is, they have similar vector representations as measured by a distance metric such as Euclidean distance or cosine similarity.
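As a small illustration of the second metric, here is a sketch of cosine similarity between two word vectors; the word pairs in the comments are only hypothetical examples, reusing the dictionary loaded earlier.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1 mean the vectors are similar."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# cosine_similarity(embeddings['road'], embeddings['highway'])  # relatively high
# cosine_similarity(embeddings['road'], embeddings['banana'])   # noticeably lower
```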
You can visualize a 3D projection of several word embeddings here and see, for example, that the words closest to "roads" are "highway", "road", and "routes" in the Word2Vec 10K embedding.
For a more detailed explanation, I recommend reading the "Word Embeddings" section of this post by Christopher Olah.
For a more thorough theory of why using word embeddings, which are an instance of distributed representations, is better than using, for example, one-hot encodings (local representations), I recommend reading the first sections of Distributed Representations by Geoffrey Hinton et al.