How to keep JSON key order using Python 3 json.dumps?

I noticed some strange behavior in Python 3's json.dumps: the order of the keys changes from one execution to the next, even though I am dumping the same object. Googling didn't help, since I don't care about sorting the keys; I just want the order to stay the same! Here is an example script:

    import json

    data = {
        'number': 42,
        'name': 'John Doe',
        'email': 'john.doe@example.com',
        'balance': 235.03,
        'isadmin': False,
        'groceries': [
            'apples',
            'bananas',
            'pears',
        ],
        'nested': {
            'complex': True,
            'value': 2153.23412
        }
    }

    print(json.dumps(data, indent=2))

When I run this script, I get different outputs every time, for example:

    $ python print_data.py
    {
      "groceries": [
        "apples",
        "bananas",
        "pears"
      ],
      "isadmin": false,
      "nested": {
        "value": 2153.23412,
        "complex": true
      },
      "email": "john.doe@example.com",
      "number": 42,
      "name": "John Doe",
      "balance": 235.03
    }

But then I run it again and I get:

    $ python print_data.py
    {
      "email": "john.doe@example.com",
      "balance": 235.03,
      "name": "John Doe",
      "nested": {
        "value": 2153.23412,
        "complex": true
      },
      "isadmin": false,
      "groceries": [
        "apples",
        "bananas",
        "pears"
      ],
      "number": 42
    }

I understand that dictionaries are unordered collections and that the order is based on a hash function; however, in Python 2 the order (whatever it is) is fixed and does not change between executions. The problem is that this makes it hard to run my tests, because I need to compare the JSON output of two different modules!

Any idea what is going on? How can I fix it? Note that I would like to avoid using OrderedDict or doing any sorting; the important thing is that the string representation stays the same between executions. This is also for testing only and does not affect the implementation of my module.

+5
2 answers

Python dicts and JSON objects are unordered. You can ask json.dumps() to sort the keys in the output; this is meant to make testing easier. Set the sort_keys parameter to True:

 print(json.dumps(data, indent=2, sort_keys=True)) 
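Since the question is about comparing the JSON output of two modules in tests, here is a minimal sketch of how sort_keys helps there. The two dicts and the `canonical` helper are made up for illustration; they just stand in for whatever the two modules produce.

```python
import json

# Hypothetical stand-ins for the data produced by two different modules:
# same content, but the keys were inserted in a different order.
from_module_a = {
    "number": 42,
    "name": "John Doe",
    "nested": {"complex": True, "value": 2153.23412},
}
from_module_b = {
    "nested": {"value": 2153.23412, "complex": True},
    "name": "John Doe",
    "number": 42,
}


def canonical(obj):
    """Serialize with sorted keys so the string does not depend on dict order."""
    return json.dumps(obj, indent=2, sort_keys=True)


text_a = canonical(from_module_a)
text_b = canonical(from_module_b)

# sort_keys applies to nested objects too, so the two strings match.
print(text_a == text_b)
```

Note that sort_keys only changes the serialized text; the dicts themselves are left untouched, so this stays confined to the test code.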

See "Why is ordering in Python dictionaries and sets arbitrary?" for why you see a different order on each run.

You can also set the PYTHONHASHSEED environment variable to an integer value to "lock" the dictionary order; use this only for running tests, not in production, since the whole point of hash randomization is to prevent an attacker from trivially DoS-ing your program.
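A rough sketch of the PYTHONHASHSEED approach, assuming Python 3.7 or newer (for `capture_output`). The child one-liner and the `run_with_seed` helper are invented for this example. The child prints a set of strings rather than a dict, because on Python 3.7+ dicts keep insertion order and would no longer show the effect, while set iteration order still depends on string hashes.

```python
import os
import subprocess
import sys

# Child program whose output depends on string hashing (set iteration order).
CHILD = "print(list({'number', 'name', 'email', 'balance'}))"


def run_with_seed(seed):
    """Run CHILD in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# With the same fixed seed, two separate runs hash strings identically,
# so they iterate the set in the same order.
first = run_with_seed(0)
second = run_with_seed(0)
print(first == second)
```

In a shell, the equivalent for the script from the question would be something like `PYTHONHASHSEED=0 python print_data.py`. The variable has to be set before the interpreter starts; changing os.environ inside an already running process has no effect on its hashing.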

+8

The story behind this behavior is a hash-collision denial-of-service vulnerability. To prevent it, the hash codes for the same values on one machine should differ from those on another.

Python 2 probably has this behavior (hash randomization) disabled by default for compatibility reasons, since enabling it would, for example, break doctests. Python 3 probably (this is a guess) did not need that compatibility.

0

Source: https://habr.com/ru/post/1241161/

