Ignore dates and times when parsing YAML?

I am writing a script to convert a series of YAML files into a single JSON document. I have a YAML file:

---
AWSTemplateFormatVersion: 2010-09-09
Description: AWS CloudFormation ECS Sample
Parameters:
    - SolrCloudInstanceType:
        Type: String
        Description: Solr Cloud EC2 Instance Type
        Default: m3.2xlarge
Resources:
    - ContainerInstance:
        Type: AWS::EC2::Instance
        Properties:
            InstanceType: m3.xlarge

I load it like this:

import yaml

with open('base.yml', 'rb') as f:
    result = yaml.safe_load(f)

When I inspect AWSTemplateFormatVersion in the result, I get a Python datetime.date object instead of a string. This causes the JSON output to fail:

>>> json.dump(result, sys.stdout, sort_keys=True, indent=4)
{
    "AWSTemplateFormatVersion": Traceback (most recent call last):
  File "./c12n-assemble", line 42, in <module>
    __main__()
  File "./c12n-assemble", line 25, in __main__
    assembler.assemble()
  File "./c12n-assemble", line 39, in assemble
    json.dump(self.__result, self.__output_file, sort_keys=True, indent=4, separators=(',', ': '))
  File "/usr/lib/python2.7/json/__init__.py", line 189, in dump
    for chunk in iterable:
  File "/usr/lib/python2.7/json/encoder.py", line 434, in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
  File "/usr/lib/python2.7/json/encoder.py", line 408, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib/python2.7/json/encoder.py", line 442, in _iterencode
    o = _default(o)
  File "/usr/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: datetime.date(2010, 9, 9) is not JSON serializable

Is there a way to stop the YAML parser from being smart about what it considers a date or a date + time, so that it just parses the value as a string?

1 answer

You can subclass the PyYAML loader and remove the implicit resolver for timestamps (or for any other type) as follows:

import yaml

class NoDatesSafeLoader(yaml.SafeLoader):
    @classmethod
    def remove_implicit_resolver(cls, tag_to_remove):
        """
        Remove implicit resolvers for a particular tag

        Takes care not to modify resolvers in super classes.

        We want to load datetimes as strings, not dates, because we
        go on to serialise as json which doesn't have the advanced types
        of yaml, and leads to incompatibilities down the track.
        """
        if 'yaml_implicit_resolvers' not in cls.__dict__:
            # Copy on first use so the parent class's resolver table is left untouched.
            cls.yaml_implicit_resolvers = cls.yaml_implicit_resolvers.copy()

        for first_letter, mappings in cls.yaml_implicit_resolvers.items():
            cls.yaml_implicit_resolvers[first_letter] = [
                (tag, regexp) for tag, regexp in mappings if tag != tag_to_remove
            ]

NoDatesSafeLoader.remove_implicit_resolver('tag:yaml.org,2002:timestamp')

Use this alternate loader as follows:

>>> yaml.load("2015-03-22 01:49:21", Loader=NoDatesSafeLoader)
'2015-03-22 01:49:21'

For reference, the default behavior is as follows:

>>> yaml.safe_load("2015-03-22 01:49:21")
datetime.datetime(2015, 3, 22, 1, 49, 21)
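
Applied to the case in the question, the same idea can be sketched end to end. This is a minimal sketch, not the only way to do it: it builds the filtered resolver table in one step from yaml.SafeLoader instead of using the classmethod above, and the TEMPLATE string is a shortened, inline stand-in (an assumption for illustration) for the base.yml file.

```python
import json

import yaml


class NoDatesSafeLoader(yaml.SafeLoader):
    """SafeLoader variant that leaves timestamp-like scalars as plain strings."""


# Build a new resolver table without the timestamp tag.
# A fresh dict with fresh lists means yaml.SafeLoader itself is not modified.
NoDatesSafeLoader.yaml_implicit_resolvers = {
    first_char: [
        (tag, regexp)
        for tag, regexp in resolvers
        if tag != 'tag:yaml.org,2002:timestamp'
    ]
    for first_char, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
}

# Shortened inline stand-in for base.yml (assumption for illustration).
TEMPLATE = """\
---
AWSTemplateFormatVersion: 2010-09-09
Description: AWS CloudFormation ECS Sample
"""

result = yaml.load(TEMPLATE, Loader=NoDatesSafeLoader)

# The version is now a str, so json can serialise it without a TypeError.
output = json.dumps(result, sort_keys=True, indent=4)
print(output)
```

With the timestamp resolver removed, AWSTemplateFormatVersion is loaded as the string '2010-09-09', while yaml.safe_load keeps its default date handling for any other code that relies on it.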

Source: https://habr.com/ru/post/1623357/