I am taking a natural language processing class, and I need to build a trigram language model that generates random text which looks somewhat "realistic", based on some sample data.
The core of it is a "trigram" structure that stores the various three-word combinations. My professor hinted that this can be done with a dictionary of dictionaries of dictionaries, which I tried to create using:
trigram = defaultdict( defaultdict(defaultdict(int)))
However, I get the error message:
trigram = defaultdict( dict(dict(int)))
TypeError: 'type' object is not iterable
How would I go about creating a 3-level nested dictionary, that is, a dictionary of dictionaries of dictionaries whose innermost values are ints?
I think people downvote a question on Stack Overflow when they don't know how to answer it, so I will add some background to better explain the question to those who want to help.
The trigram is used to track three-word patterns. Trigrams are used in text processing software and almost everywhere in natural language processing (think Siri or Google Now).
If we call the 3 levels of dictionaries dict1, dict2 and dict3, then parsing a text file and reading the sentence "the boy runs" would give the following:
dict1 has the key "the". Accessing this key returns dict2, which contains the key "boy". Accessing that key returns the final dict3, which contains the key "runs", and accessing that key returns the value 1.
This records that "the boy runs" appeared 1 time in the text. If we come across it again, we follow the same process and increment the 1 to 2. If we come across "the girl walks", then dict2 will contain another key, "girl", whose dict3 has a "walks" key with a value of 1, and so on. In the end, after parsing a ton of text (and tracking the word counts), you have a trigram that can determine the likelihood of a given start word leading to a particular three-word combination, based on how often it occurred in the text that was analyzed.
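To make the target shape concrete, here is a hand-written sketch using plain dicts. It assumes the text contained "the boy runs" twice and "the girl walks" once; the name `expected` is only for illustration and is not part of my actual code.

```python
# Hand-built illustration of the nested counts after seeing
# "the boy runs" twice and "the girl walks" once.
expected = {
    "the": {                      # dict1: first word
        "boy": {"runs": 2},       # dict2 -> dict3: second word -> third word -> count
        "girl": {"walks": 1},
    }
}

# Walking down the three levels returns the count for that word triple.
boy_runs = expected["the"]["boy"]["runs"]
girl_walks = expected["the"]["girl"]["walks"]
print(boy_runs, girl_walks)
```

The part I cannot get working is building this automatically, so that a missing key at any level is created on first access instead of raising a KeyError.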
This can help you create grammar rules for identifying languages, or, in my case, create randomly generated text that closely resembles grammatical English. I need a three-level dictionary because any position in a three-word combination can hold a different word, which opens up a whole new set of combinations. I TRIED to explain trigrams and the purpose behind them to the best of my ability ... I only started the class a couple of weeks ago.
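As a rough sketch of how the counts could drive text generation, the next word can be drawn at random with probability proportional to its count. The helper name `next_word`, the `rng` parameter and the sample counts are assumptions made for illustration, not something from the class material.

```python
import random

def next_word(trigram, w1, w2, rng=random):
    """Pick a third word after (w1, w2), weighted by how often it was seen."""
    followers = trigram[w1][w2]              # innermost dict: word -> count
    words = list(followers)
    weights = [followers[w] for w in words]
    # random.choices draws with replacement using the given relative weights.
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical counts: "runs" followed "the boy" three times, "walks" once.
counts = {"the": {"boy": {"runs": 3, "walks": 1}}}
word = next_word(counts, "the", "boy", random.Random(0))
print(word)
```

With these counts, "runs" should come out roughly three times as often as "walks" over many draws.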
Now ... with all that said: how do I go about creating a dictionary of dictionaries of dictionaries whose innermost dictionary holds int values in Python?
trigram = defaultdict(defaultdict(defaultdict(int)))
raises an error for me.
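One likely cause: `defaultdict` expects its first argument to be a callable (or None) that produces the default value, and an already constructed `defaultdict` instance is not callable. A minimal sketch of one way around this is to wrap each inner level in a `lambda`; the helper names `make_trigram` and `count_trigrams` and the sample sentence are hypothetical.

```python
from collections import defaultdict

def make_trigram():
    # Each level is given a zero-argument factory that builds the next level;
    # the innermost level uses int, so missing counts start at 0.
    return defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

def count_trigrams(words):
    """Count every run of three consecutive words in a list of words."""
    trigram = make_trigram()
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        trigram[w1][w2][w3] += 1   # missing keys are created on first access
    return trigram

counts = count_trigrams("the boy runs and the boy runs".split())
print(counts["the"]["boy"]["runs"])
```

Note that merely reading a missing key on a `defaultdict` inserts it, so lookups during generation may be safer with `.get()` or an `in` check.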