Design pattern for associating pyparsing results with a linked list of nodes

I defined a pyparsing rule to parse this text into a syntax tree ...

COMMAND TEXT:

    add Iteration name = "Cisco 10M/half"
    append Observation name = "packet loss 1"
    assign Observation results_text = 0.0
    assign Observation results_bool = True
    append DataPoint
    assign DataPoint metric = txpackets
    assign DataPoint units = packets
    append DataPoint
    assign DataPoint metric = txpackets
    assign DataPoint units = packets
    append Observation name = "packet loss 2"
    append DataPoint
    assign DataPoint metric = txpackets
    assign DataPoint units = packets
    append DataPoint
    assign DataPoint metric = txpackets
    assign DataPoint units = packets

SYNTAX TREE:

    ['add', 'Iteration', ['name', 'Cisco 10M/half']]
    ['append', 'Observation', ['name', 'packet loss 1']]
    ['assign', 'Observation', ['results_text', '0.0']]
    ['assign', 'Observation', ['results_bool', 'True']]
    ['append', 'DataPoint']
    ['assign', 'DataPoint', ['metric', 'txpackets']]
    ['assign', 'DataPoint', ['units', 'packets']]
    ...

I'm trying to associate all the nested key-value pairs in the syntax tree above with a linked list of objects ... the hierarchy looks something like this (each word is a namedtuple ... children in the hierarchy are kept on their parent's list of children):

    Log: [
        Iteration: [
            Observation: [DataPoint, DataPoint],
            Observation: [DataPoint, DataPoint]
        ]
    ]

The goal of all this is to build a generic test data collection platform that drives the flow of tests against a network device and records the results. Once the data is in this format, the same data structure will be used to build the test report. To answer the question in the comments below, I chose a linked list because it seemed the easiest way to pull the information back out in order when writing a report. However, I would prefer not to assign Iteration or Observation numbers until after the tests are finished ... in case we find problems and insert more observations during testing. My theory is that the position of each item in the list is sufficient, but I am willing to change that if it is part of the problem.

The problem is that I get lost trying to assign key-value pairs to the objects in the linked list after it has been built. For example, after I insert an Observation namedtuple into the first Iteration, I have trouble reliably handling the assign Observation results_bool = True update in the example above.

Is there a general design pattern for handling this situation? I have searched for a while, but I can't make the connection between parsing the text (which I can do) and managing the data hierarchy (the main problem). Links or a small piece of demo code are fine ... I just need pointers to get on the right track.

+4
2 answers

I ended up using textfsm, which lets me maintain state between different lines while parsing the configuration file.

+1

I don't know of an established design pattern for what you are looking for, but the problem itself interests me a great deal. I work with network devices a lot, and parsing and organizing the data is a big problem for me too.

Clearly, the problem is not parsing the data but what you do with it afterwards. This is where you need to think about the meaning you attach to the data you have parsed. The nested-list approach can work well if the objects that hold the lists also carry meaning.

Namedtuples are great for quick-and-dirty class-like behavior, but they fall short when you need anything beyond basic attribute access, especially since they are immutable, like any tuple. It seems to me that you will want to replace some of the namedtuple objects with full-blown classes. That way you can customize their behavior and add methods.
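To make the immutability point concrete, here is a minimal sketch, not the asker's actual code. The class names mirror the question, but the field names, the `datapoints` list, and the `add_datapoint` helper are assumptions made for illustration. It shows that an in-place update such as assign Observation results_bool = True fails on a namedtuple but works on an ordinary mutable class (here a dataclass).

```python
from collections import namedtuple
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical namedtuple version of an Observation.
ObservationTuple = namedtuple("ObservationTuple", ["name", "results_bool"])


@dataclass
class DataPoint:
    metric: Optional[str] = None
    units: Optional[str] = None


@dataclass
class Observation:
    # Field names are taken from the example commands; values stay strings,
    # as they appear in the parse results.
    name: Optional[str] = None
    results_text: Optional[str] = None
    results_bool: Optional[str] = None
    datapoints: List[DataPoint] = field(default_factory=list)

    def add_datapoint(self) -> DataPoint:
        """Create a child DataPoint, attach it, and return it for later updates."""
        datapoint = DataPoint()
        self.datapoints.append(datapoint)
        return datapoint


# A namedtuple rejects in-place assignment.
frozen = ObservationTuple(name="packet loss 1", results_bool=None)
try:
    frozen.results_bool = "True"
    tuple_update_ok = True
except AttributeError:
    tuple_update_ok = False

# _replace() returns a new tuple, so any list holding the old one is now stale.
replaced = frozen._replace(results_bool="True")

# A mutable class can be updated where it sits in the hierarchy.
obs = Observation(name="packet loss 1")
obs.results_bool = "True"
dp = obs.add_datapoint()
dp.metric = "txpackets"
```

With namedtuples, every update would have to go through `_replace()` and then swap the new tuple into the parent's list, which is one likely source of the bookkeeping trouble described in the question.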

For example, you know that Iteration will always contain 1 or more Observation objects, which will then contain 1 or more DataPoint objects. If you can accurately describe the relationship, this gives you a way to handle it.
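One way to use that relationship, sketched under assumptions: keep a list of the currently "open" node at each level (Log, Iteration, Observation, DataPoint), so that an assign always targets the most recently created node of that kind. The `Node` class, the `LEVELS` list, and `build_tree` are hypothetical names, and the input is written out as plain lists in the shape of the syntax tree from the question rather than produced by pyparsing.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed fixed nesting order, taken from the hierarchy in the question.
LEVELS = ["Log", "Iteration", "Observation", "DataPoint"]


@dataclass
class Node:
    kind: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def build_tree(commands):
    """Turn [verb, kind, [key, value]] commands into a Log tree."""
    root = Node("Log")
    path = [root]  # path[i] is the currently open node at LEVELS[i]
    for command in commands:
        verb, kind = command[0], command[1]
        depth = LEVELS.index(kind)  # raises ValueError for an unknown kind
        if verb in ("add", "append"):
            if depth == 0 or depth > len(path):
                raise ValueError(f"no open parent to hold a {kind}")
            node = Node(kind)
            path[depth - 1].children.append(node)
            # Close this level and everything below it, then open the new node.
            del path[depth:]
            path.append(node)
        elif verb == "assign":
            if depth >= len(path):
                raise ValueError(f"no open {kind} to assign to")
            node = path[depth]
        else:
            raise ValueError(f"unknown verb {verb!r}")
        if len(command) > 2:
            key, value = command[2]
            node.attrs[key] = value
    return root


commands = [
    ["add", "Iteration", ["name", "Cisco 10M/half"]],
    ["append", "Observation", ["name", "packet loss 1"]],
    ["assign", "Observation", ["results_text", "0.0"]],
    ["assign", "Observation", ["results_bool", "True"]],
    ["append", "DataPoint"],
    ["assign", "DataPoint", ["metric", "txpackets"]],
    ["assign", "DataPoint", ["units", "packets"]],
    ["append", "Observation", ["name", "packet loss 2"]],
    ["append", "DataPoint"],
    ["assign", "DataPoint", ["metric", "txpackets"]],
]

log = build_tree(commands)
iteration = log.children[0]
first_obs, second_obs = iteration.children
```

Because a new Observation closes any open DataPoint, a stray assign DataPoint before the next append DataPoint raises an error instead of silently updating a node under the previous Observation. Positions in the `children` lists preserve insertion order, so numbering can still be deferred until report time.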

+1

Source: https://habr.com/ru/post/1347624/

