Python - dictionary suggestion

Question

I am trying to write code that accepts a sentence:

dimension implies direction implies measurement implies the more and the less

and converts it to a dictionary in which each word is a key and its value is the word (or words) that come directly before it. The first word has no previous word, so it should get no value.

It should be:

    {'and' : 'more'
     'dimension' : ''
     'direction' : 'implies'
     'implies' : 'dimension', 'direction', 'measurement'
     'less' : 'the'
     'measurement' :'implies'
     'more' : 'the'
     'the' : 'and', 'implies'}

I wrote:

    def get_previous_words_dict(text):
        words_list = text.split()
        sentence_dict = {}
        for i in range(0,len(words_list)):
            sentence_dict[words_list[i]] = words_list[i-1]

BUT it does not add the new value to the values already stored under a key; it replaces them. So instead of getting 3 different values for 'implies', I get only 1.

In addition, instead of assigning no value to the first word, 'dimension', it assigns it 'less' (because the index i-1 starts at -1).

+5

python

Nume Feb 05 '16 at 6:04

3 answers

Just split the string into a list and create a second list that is shifted by one by prefixing an empty string. Then zip the two lists and build the dictionary by iterating over the pairs. PS: use a defaultdict initialized with list instead of a plain dict, because one key can have multiple values.

    from collections import defaultdict

    inp = "dimension implies direction implies measurement implies the more and the less"
    l1 = inp.split()
    l2 = [""] + l1          # same words, shifted right by one
    zipped = zip(l1, l2)    # pairs each word with the word before it

    d = defaultdict(list)
    for k, v in zipped:
        d[k].append(v)
    print(d)

If you don't want to import anything, initialize the dict with an empty list for every word and use the same logic:

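    inp = "dimension implies direction implies measurement implies the more and the less"
    l1 = inp.split()
    l2 = [""] + l1          # same words, shifted right by one
    zipped = zip(l1, l2)    # pairs each word with the word before it

    d = {x: [] for x in l1}  # one empty list per distinct word
    for k, v in zipped:
        d[k].append(v)
    print(d)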

+2

k4vin Feb 05 '16 at 6:26

If you are not allowed to import anything, then the excellent reduce operation together with slicing and zip can be a very compact way to do it (all of these are built-ins in Python 2 and need no import; in Python 3, reduce has moved to functools):

EDIT: After it was pointed out to me that I had misunderstood the problem, I fixed it by changing the zip() call.

    # Python 2: reduce is a built-in. In Python 3, first run: from functools import reduce

    # the string - split it immediately into a list of words
    words = "dimension implies direction implies measurement implies the more and the less".split()

    # There is a **lot** going on in this line of code, explanation below.
    result = reduce(lambda acc, kv: acc.setdefault(kv[0], []).append(kv[1]) or acc,
                    zip(words[1:], words[:-1]), {})

    # this was the previous - incorrect - zip()
    # zip(words[1::2], words[0::2]), {})

And the resulting output (also edited):

    print(result)
    {'and': ['more'], 'direction': ['implies'], 'implies': ['dimension', 'direction', 'measurement'], 'less': ['the'], 'measurement': ['implies'], 'the': ['implies', 'and'], 'more': ['the']}

For completeness, the old, erroneous result:

    print(result)
    {'the': ['and'], 'implies': ['dimension', 'direction', 'measurement'], 'more': ['the']}

A little explanation

After splitting the string into a list of words, we can index individual words as words[i].

Edited: In the problem statement, the keys of the resulting dict are the words that follow another word, and the value is the word that comes before. Therefore, we must convert the list of words into a list of pairs, each combining a word with the word before it. The list of keys is thus [words[1], words[2], words[3], ..., words[n]] and the values that go with them are [words[0], words[1], words[2], ..., words[n-1]].

Using Python slicing: keys = words[1:] and values = words[:-1]

Now we need to build a dict from these keys and values, collecting the values in a list if the same key occurs several times.

A dict has a .setdefault(key, value) method that sets key to value if key is not yet in the dict and, in either case, returns the value now stored under key. By defaulting every value to an empty list ([]), we can blindly call .append(...) on the result. That is what this part of the code does:

 acc.setdefault(key, []).append( value )
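To see that behaviour in isolation, here is a minimal sketch. The names `first` and `second` are only illustrative and are not part of the code above.

```python
acc = {}

# Key is missing: setdefault stores the empty list and returns it.
first = acc.setdefault('implies', [])
first.append('dimension')

# Key is present: the new default is ignored and the stored list is returned.
second = acc.setdefault('implies', [])
second.append('direction')

print(acc)              # {'implies': ['dimension', 'direction']}
print(first is second)  # True
```

Both calls hand back the same list object, which is why appending to the return value updates the dict.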

Then there is reduce. The reduce operation reduces (...) a list of values to a single one. In this case, we reduce the list of (key, value) tuples to a dict in which all the values have been collected under their corresponding key.

reduce takes a reduction callback, the sequence to reduce, and an initial element. The initial element here is an empty dict {}, which we fill in as we go.

The reduction callback is called repeatedly with two arguments: an accumulator and the next element to add to the accumulation. The function should return the new accumulator.

In this code, the reduction step basically consists of adding the item's value to the list of values for the item's key. (See above for what .setdefault().append() does.)
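As a rough sketch of the same reduction with the callback written as a named function instead of a lambda: the name `add_pair` and the hand-written `pairs` list are assumptions made for illustration only.

```python
from functools import reduce  # needed in Python 3; a built-in in Python 2

def add_pair(acc, kv):
    """Reduction step: append the value to the list stored under the key."""
    key, value = kv
    acc.setdefault(key, []).append(value)
    return acc  # becomes the accumulator for the next step

pairs = [('implies', 'dimension'), ('direction', 'implies'), ('implies', 'direction')]
result = reduce(add_pair, pairs, {})
print(result)  # {'implies': ['dimension', 'direction'], 'direction': ['implies']}
```

Because a named function can use an explicit `return acc`, it does not need any trick to hand the accumulator back.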

All that remains is to get the list of (key, value) tuples that we need to process. This is where the built-in zip comes in. zip takes two lists and returns a list of tuples of the corresponding elements (in Python 3, an iterator over those tuples).

So:

 zip(words[1:], words[:-1])

produces exactly what we want: a list of all (key, value) tuples.
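For a concrete look at the pairs, here is a small sketch on a shortened sentence (the shortened input is just for illustration):

```python
words = "dimension implies direction implies".split()

# Keys: every word except the first. Values: every word except the last.
pairs = list(zip(words[1:], words[:-1]))  # list() is needed in Python 3, where zip is lazy
print(pairs)
# [('implies', 'dimension'), ('direction', 'implies'), ('implies', 'direction')]
```

Note that the first word, 'dimension', never appears as a key here, because it has no predecessor.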

Finally, since the reduction function has to return the new accumulator, we have to play a trick. list.append(...) returns None, even though the dict itself has been modified. Therefore, we cannot return that value as the next accumulator. So we append the construct `or acc` after it.

Since the left side of the logical or always evaluates to None, which is falsy in Python, the right side is always evaluated, and here that is the (modified) dict itself. Thus the whole or expression evaluates to the modified dict, which is exactly what we need to return.
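The trick can be checked step by step. This is only a sketch, and the names `appended` and `returned` are illustrative:

```python
acc = {}

# list.append returns None, so on its own it cannot be the lambda's result.
appended = acc.setdefault('the', []).append('implies')
print(appended)  # None

# None is falsy, so `or` falls through to its right operand: the dict itself.
returned = acc.setdefault('the', []).append('and') or acc
print(returned is acc)  # True
print(acc)              # {'the': ['implies', 'and']}
```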

0

haavee Feb 05 '16 at 8:15

PM 2Ring · Accepted Answer · 2016-02-05T06:34:07+0000

Here's how to do it without defaultdict:

    text = 'dimension implies direction implies measurement implies the more and the less'

    sentence_dict = {}
    prev = ''
    for word in text.split():
        if word not in sentence_dict:
            sentence_dict[word] = []
        sentence_dict[word].append(prev)
        prev = word

    print(sentence_dict)

Output

 {'and': ['more'], 'direction': ['implies'], 'implies': ['dimension', 'direction', 'measurement'], 'less': ['the'], 'measurement': ['implies'], 'the': ['implies', 'and'], 'dimension': [''], 'more': ['the']}
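If it helps, the same loop can be wrapped in a function. This is only a sketch: the name `get_previous_words_dict` is borrowed from the question, and using an empty string for "no previous word" follows the output shown above.

```python
def get_previous_words_dict(text):
    """Map each word to the list of words that come directly before it."""
    sentence_dict = {}
    prev = ''  # the first word has no predecessor, so it gets an empty string
    for word in text.split():
        if word not in sentence_dict:
            sentence_dict[word] = []
        sentence_dict[word].append(prev)
        prev = word
    return sentence_dict

result = get_previous_words_dict('the more and the less')
print(result)
# {'the': ['', 'and'], 'more': ['the'], 'and': ['more'], 'less': ['the']}
```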
