Fast Perl ↔ Python serialization that supports integer dictionary keys

I am looking for a fast serialization method (XML is too slow) that can be used from both Perl and Python.

Unfortunately, I cannot use JSON (and many other formats) because it always changes the dict key type from integer to string. I need serialization / deserialization that preserves the key type.

Python:

    >>> import json
    >>> dict_before = {1:'one', 20: 'twenty'}
    >>> data = json.dumps(dict_before)
    >>> dict_after = json.loads(data)
    >>> dict_before
    {1: 'one', 20: 'twenty'}  # integer keys
    >>> dict_after
    {u'1': u'one', u'20': u'twenty'}  # string keys

Any suggestions are welcome.

+4
3 answers

Try msgpack. It's compact and fast. There is a Perl implementation, but I have never used it. The Python implementation works, though:

    >>> import msgpack
    >>> x=msgpack.dumps({1:'aaa',2:'bbb'})
    >>> x
    '\x82\x01\xa3aaa\x02\xa3bbb'
    >>> len(x)
    11
    >>> print msgpack.loads(x)
    {1: 'aaa', 2: 'bbb'}
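The transcript above is from Python 2. On Python 3 with a recent msgpack release (assumed here to be 1.0 or later), the unpacker rejects non-string map keys by default, so the same round trip needs `strict_map_key=False`. A minimal sketch under that assumption:

```python
import msgpack  # third-party package: pip install msgpack

original = {1: "aaa", 2: "bbb"}
packed = msgpack.packb(original)

# Assumption: msgpack >= 1.0, where integer map keys are refused
# unless strict_map_key is explicitly disabled.
restored = msgpack.unpackb(packed, strict_map_key=False)

print(restored == original)
print(all(isinstance(key, int) for key in restored))
```

The integer keys survive on the Python side; what the Perl side does with them is a separate question.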
+1

You can use YAML.

    >>> import yaml
    >>> dict_before = {1:'one', 20: 'twenty'}
    >>> data = yaml.safe_dump(dict_before)
    >>> dict_after = yaml.safe_load(data)
    >>> dict_after
    {1: 'one', 20: 'twenty'}

I had a similar problem. I wanted to share a configuration file between Perl and Python, and I ended up using YAML.

You can install the YAML module for Python with:

 pip install PyYAML 

Note, though, that integer keys will still be converted to strings in Perl; see "Legal values for the Perl hash key".

+4

Mu. You are starting from the wrong premise.

Perl does not have a meaningful type system that distinguishes between numbers and strings. Any given value can be both. Using only the Perl language, it is not possible to determine whether a given value is treated as a number only (although modules like Devel::Peek can inspect it). It is completely impossible to find out what type a given value originally had.

    use feature 'say';
    my $x = 1;      # an integer (IV), right?
    say "x = $x";   # not any more! It's a PVIV now (string and integer)

In addition, in a hash ("dictionary"), keys are always coerced to strings. In arrays, the index is always coerced to an integer. Other key types can only be faked.

This is great for text processing, but it is, of course, an endless pain when serializing data structures. JSON is a good match for Perl data structures, so I suggest you stick with it (or with YAML, as it is a superset of JSON). That guards against the misconception that a serialization format can deliver information that is not there in the first place.

What do we take away from this?

  • If interop is important, refrain from using creative dictionary types in Python.

  • You can always encode type information in the serialization if it is really important (hint: it probably is not): {"type":"integer dict", "data":{"1":"foo","2":"bar"}}

  • It would also be premature to reject XML as too slow. See this recent article, although I disagree with its methodology and it is limited to JS (see last week's HN thread for perspective).

    If the implementation is native, it will probably be fast enough, so obviously don't use pure-Perl or pure-Python versions. The same holds for JSON parsers, YAML parsers, and any other parsers.
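The type-tagging idea from the second point can be sketched on the Python side using only the standard `json` module. The helper names `dump_int_dict` and `load_tagged` are hypothetical, and the `"integer dict"` tag simply follows the example envelope above; treat this as an illustration, not an established convention:

```python
import json


def dump_int_dict(d):
    """Serialize a dict with integer keys inside a tagged envelope."""
    # JSON object keys are always strings, so convert them explicitly.
    payload = {str(key): value for key, value in d.items()}
    return json.dumps({"type": "integer dict", "data": payload})


def load_tagged(text):
    """Parse an envelope and restore integer keys when the tag asks for it."""
    envelope = json.loads(text)
    if envelope.get("type") == "integer dict":
        return {int(key): value for key, value in envelope["data"].items()}
    return envelope["data"]


dict_before = {1: "one", 20: "twenty"}
dict_after = load_tagged(dump_int_dict(dict_before))
print(dict_after == dict_before)
```

A Perl reader can ignore the tag entirely, since its hash keys end up as strings either way; only the Python side needs the conversion.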

+2

Source: https://habr.com/ru/post/1498369/

