How to read multiple dictionaries from a file in Python?

I am relatively new to Python. I am trying to read an ASCII file that contains several dictionaries. The file has the following format:

{Key1: value1 key2: value2 ... } {Key1: value1 key2: value2 ... } { ... 

Each dictionary in the file is a nested dictionary. I am trying to read the file as a list of dictionaries. Is there an easy way to do this? I tried the following code, but it does not work:

 data = json.load(open('doc.txt')) 
4 answers

Since the data in your input file is not actually valid JSON or Python syntax, you will need to parse it yourself. You haven't said which keys and values are valid in the dictionaries, so the following only allows them to be alphanumeric strings (anything matching \w+).

So, given an input file named doc.txt with the following contents:

 {key1: value1 key2: value2 key3: value3 } {key4: value4 key5: value5 } 

The following reads and converts it into a list of Python dictionaries, consisting of alphanumeric keys and values:

    from pprint import pprint
    import re

    dictpat = r'\{((?:\s*\w+\s*:\s*\w+\s*)+)\}'  # note non-capturing (?:) inner group
    itempat = r'(\s*(\w+)\s*:\s*(\w+)\s*)'       # which is captured in this expr

    with open('doc.txt') as f:
        lod = [{group[1]: group[2] for group in re.findall(itempat, items)}
               for items in re.findall(dictpat, f.read())]

    pprint(lod)

Output:

    [{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'},
     {'key4': 'value4', 'key5': 'value5'}]
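
The patterns above only match flat dictionaries, while the question mentions nested ones. As a rough sketch only, a small recursive parser could handle nesting, assuming the file contains nothing but braces, colons, and \w+ words. The names TOKEN and parse_dicts are made up for this illustration, and truncated input is not handled gracefully:

```python
import re

# Tokens: '{', '}', ':' or a run of word characters (an assumption about the format).
TOKEN = re.compile(r'[{}:]|\w+')

def parse_dicts(text):
    """Parse '{k: v k: {k: v}} {...}' into a list of (possibly nested) dicts."""
    tokens = TOKEN.findall(text)
    pos = 0

    def parse_dict():
        nonlocal pos
        if tokens[pos] != '{':
            raise ValueError(f"expected '{{', got {tokens[pos]!r}")
        pos += 1  # skip '{'
        result = {}
        while tokens[pos] != '}':
            key = tokens[pos]
            if tokens[pos + 1] != ':':
                raise ValueError(f"expected ':' after key {key!r}")
            pos += 2  # skip the key and the ':'
            if tokens[pos] == '{':
                result[key] = parse_dict()  # nested dictionary
            else:
                result[key] = tokens[pos]   # plain word value, kept as a string
                pos += 1
        pos += 1  # skip '}'
        return result

    dicts = []
    while pos < len(tokens):
        dicts.append(parse_dict())
    return dicts

sample = "{key1: value1 key2: {inner1: a inner2: b} } {key3: value3}"
print(parse_dicts(sample))
```

All values come back as strings; converting numbers or other types would need an extra step that depends on what the real file contains.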

For json.load to work, the file would have to contain one big list that wraps all the dictionaries, with commas between the entries, i.e.:

    [
      {key1: val1, key2: val2, key3: val3, ...keyN: valN},
      {key1: val1, key2: val2, key3: val3, ...keyN: valN},
      {key1: val1, key2: val2, key3: val3, ...keyN: valN}
      . . .
    ]

If you cannot change the format of the data file, I'm afraid you will have to roll your own function to interpret the data.
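
For comparison, here is a minimal sketch of a file that json.load does accept: a single top-level list, double-quoted string keys, and commas between items. The file name dicts.json and the sample data are invented for the illustration.

```python
import json
import os
import tempfile

# Hypothetical sample data; valid JSON needs quoted keys and commas.
text = '''[
  {"key1": "val1", "key2": {"nested": "val2"}},
  {"key1": "val3", "key2": {"nested": "val4"}}
]'''

# Write the sample to a temporary file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "dicts.json")  # illustrative file name
with open(path, "w") as f:
    f.write(text)

with open(path) as f:
    data = json.load(f)  # a list of (nested) dictionaries

print(len(data), data[0]["key2"]["nested"])
```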


If the individual elements are valid JSON, the following may work. I dug into the source of the simplejson library and adapted it to your use case. SSCCE below.

    import re
    import simplejson

    FLAGS = re.VERBOSE | re.MULTILINE | re.DOTALL
    WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)

    def grabJSON(s):
        """Takes the largest bite of JSON from the string.
        Returns (object_parsed, remaining_string)
        """
        decoder = simplejson.JSONDecoder()
        obj, end = decoder.raw_decode(s)
        end = WHITESPACE.match(s, end).end()  # skip whitespace after the object
        return obj, s[end:]

    def main():
        with open("out.txt") as f:
            s = f.read()
        while True:
            obj, remaining = grabJSON(s)
            print(">", obj)  # Python 3 print call
            s = remaining
            if not remaining.strip():
                break

    if __name__ == "__main__":
        main()

... which, with some suitable JSON in out.txt, outputs something like:

    > {'hello': ['world', 'hell', {'test': 'haha'}]}
    > {'hello': ['world', 'hell', {'test': 'haha'}]}
    > {'hello': ['world', 'hell', {'test': 'haha'}]}
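
The same idea can probably be done with the standard library alone, since json.JSONDecoder also has a raw_decode method that returns the parsed object and the index where it stopped. The sketch below is an adaptation, not the code from this answer; the helper name iter_json_objects and the sample string are made up, and it still assumes every element is valid JSON.

```python
import json

def iter_json_objects(text):
    """Yield each JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # The stdlib raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the object and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

sample = '{"hello": ["world", {"test": "haha"}]} {"a": 1}\n{"b": {"c": 2}}'
objects = list(iter_json_objects(sample))
for obj in objects:
    print(">", obj)
```

Passing the position to raw_decode avoids re-slicing the remaining string on every iteration, which may matter for large files.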

A regex-only variant, assuming one key: value pair per line inside each pair of braces:

    import re

    with open('doc.txt') as fl:
        text = fl.read()

    result = [
        dict(
            # split each line on ':' ignoring surrounding whitespace
            re.match(r'^\s*(.*?)\s*:\s*(.*?)\s*$', line).groups()
            # one key-value pair per line
            for line in part.strip().split('\n')
        )
        # the text inside each { ... }; DOTALL lets '.' match newlines
        for part in re.findall(r'\{(.*?)\}', text, flags=re.DOTALL)
    ]

Source: https://habr.com/ru/post/1209921/

