How to parse a DOT file in Python

I have a transducer saved as a DOT file. I can see a graphical representation of the graph using gvedit, but what if I want to convert the DOT file into an executable transducer, so that I can test it and see which strings it accepts and which it doesn't?

In most of the tools I have seen (OpenFst, Graphviz, and their Python extensions), DOT files are only used to create a graphical representation. What if I want to parse the file to get an interactive program in which I can test strings against the transducer?

Are there libraries that could accomplish this task, or should I just write it from scratch?

As I said, the DOT file describes a transducer I developed that models English morphology. It is a huge file, but to give you an idea of what it looks like, here is a sample. Say I want a transducer that models how English nouns behave with respect to plurality. My vocabulary consists of only three words (book, boy, girl). The transducer in this case would look something like this:

[Image: graph of the transducer]

which is directly built from this DOT file:

    digraph A {
      rankdir = LR;
      node [shape=circle,style=filled] 0
      node [shape=circle,style=filled] 1
      node [shape=circle,style=filled] 2
      node [shape=circle,style=filled] 3
      node [shape=circle,style=filled] 4
      node [shape=circle,style=filled] 5
      node [shape=circle,style=filled] 6
      node [shape=circle,style=filled] 7
      node [shape=circle,style=filled] 8
      node [shape=circle,style=filled] 9
      node [shape=doublecircle,style=filled] 10
      0 -> 4 [label="g "];
      0 -> 1 [label="b "];
      1 -> 2 [label="o "];
      2 -> 7 [label="y "];
      2 -> 3 [label="o "];
      3 -> 7 [label="k "];
      4 -> 5 [label="i "];
      5 -> 6 [label="r "];
      6 -> 7 [label="l "];
      7 -> 9 [label="<+N:s> "];
      7 -> 8 [label="<+N:0> "];
      8 -> 10 [label="<+Sg:0> "];
      9 -> 10 [label="<+Pl:0> "];
    }

Testing this transducer means that if you feed it book+Pl, it should output books, and vice versa. I would like to see how to turn a DOT file into a form that allows this kind of analysis and testing.

+6
4 answers

You can start by loading the file with pydot (https://code.google.com/p/pydot/). From there, it should be relatively simple to write code that traverses the in-memory graph according to the input string.
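A minimal sketch of such a traversal, treating the graph as a finite-state transducer. It does not depend on pydot: the edges are copied by hand from the question's sample, with the trailing space in each label stripped. The names `split_label`, `build_arcs`, and `transduce` are made up for this illustration, and the reading of a label `<a:b>` as "upper side a, lower side b, with 0 meaning the empty string" is an assumption about the notation, not something the DOT format defines.

```python
from collections import defaultdict

# Edges copied by hand from the sample DOT file: (source, target, label).
EDGES = [
    (0, 4, "g"), (0, 1, "b"), (1, 2, "o"), (2, 7, "y"), (2, 3, "o"),
    (3, 7, "k"), (4, 5, "i"), (5, 6, "r"), (6, 7, "l"),
    (7, 9, "<+N:s>"), (7, 8, "<+N:0>"),
    (8, 10, "<+Sg:0>"), (9, 10, "<+Pl:0>"),
]
START = 0
FINAL = {10}  # the doublecircle node in the sample


def split_label(label):
    """Return (upper, lower) for an edge label.

    Assumption: "<a:b>" maps a to b, "0" is the empty string,
    and any other label maps to itself.
    """
    if label.startswith("<") and label.endswith(">") and ":" in label:
        upper, lower = label[1:-1].split(":", 1)
    else:
        upper = lower = label
    return ("" if upper == "0" else upper, "" if lower == "0" else lower)


def build_arcs(edges):
    """Index the edges by source state as (target, upper, lower) triples."""
    arcs = defaultdict(list)
    for src, dst, label in edges:
        upper, lower = split_label(label)
        arcs[src].append((dst, upper, lower))
    return arcs


def transduce(text, arcs, start=START, final=FINAL, invert=False):
    """Return every output the graph produces for `text`.

    invert=False reads the upper side and writes the lower side;
    invert=True does the opposite. Assumes the graph has no cycle
    of arcs that consume nothing, otherwise the search would not stop.
    """
    results = []
    stack = [(start, 0, "")]  # (state, position in text, output so far)
    while stack:
        state, pos, out = stack.pop()
        if pos == len(text) and state in final:
            results.append(out)
        for dst, upper, lower in arcs[state]:
            consumed, emitted = (lower, upper) if invert else (upper, lower)
            if text.startswith(consumed, pos):
                stack.append((dst, pos + len(consumed), out + emitted))
    return sorted(results)


ARCS = build_arcs(EDGES)
print(transduce("book+N+Pl", ARCS))           # ['books']
print(transduce("books", ARCS, invert=True))  # ['book+N+Pl']
print(transduce("cat+N+Pl", ARCS))            # []
```

Note that the sample graph carries a +N tag on every path, so under this reading the analysis of books is book+N+Pl rather than book+Pl.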

+3

First of all, I installed the graphviz library. Then I wrote the following code:

    from graphviz import Source

    # Read the DOT file
    with open('graph4.dot', 'r') as file:
        text = file.read()

    # Wrap the DOT source so it can be rendered
    Source(text)

0

Guillaume's answer is enough to display the graph in Spyder (3.3.2), which may be all that some readers need.

If you really need to manipulate the graph, as the OP requires, it gets a bit more complicated. Part of the problem is that Graphviz is a graph rendering library, whereas you are trying to analyze a graph. What you are trying to do is similar to reverse engineering a Word or LaTeX document from a PDF file.

If you can assume the regular structure of the OP's example, then regular expressions work. I like the aphorism: if you solve a problem with regular expressions, you now have two problems. However, this may just be the most practical solution for cases like this one.

Here are the expressions to capture:

  • Node information: r"node.*?=(\w+).*?\s(\d+)". The capture groups are the node's type (its shape) and its label.
  • Edge information: r"(\d+).*?(\d+).*?\"(.+?)\s". The capture groups are the source, the target, and the edge label.

To try them out, see https://regex101.com/r/3UKKwV/1/ and https://regex101.com/r/Hgctkp/2/.
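Here is a rough sketch of how patterns like these could be applied with Python's `re` module to the OP's sample. The node pattern is the one above. The edge pattern is a tightened variant of my own that requires an explicit `->`, so that node declarations cannot be mistaken for edges when the whole file is scanned at once. `parse_dot` is a hypothetical helper name, and treating the doublecircle node as the final state is an assumption based on the sample.

```python
import re

DOT_TEXT = '''digraph A {
rankdir = LR;
node [shape=circle,style=filled] 0
node [shape=circle,style=filled] 1
node [shape=circle,style=filled] 2
node [shape=circle,style=filled] 3
node [shape=circle,style=filled] 4
node [shape=circle,style=filled] 5
node [shape=circle,style=filled] 6
node [shape=circle,style=filled] 7
node [shape=circle,style=filled] 8
node [shape=circle,style=filled] 9
node [shape=doublecircle,style=filled] 10
0 -> 4 [label="g "];
0 -> 1 [label="b "];
1 -> 2 [label="o "];
2 -> 7 [label="y "];
2 -> 3 [label="o "];
3 -> 7 [label="k "];
4 -> 5 [label="i "];
5 -> 6 [label="r "];
6 -> 7 [label="l "];
7 -> 9 [label="<+N:s> "];
7 -> 8 [label="<+N:0> "];
8 -> 10 [label="<+Sg:0> "];
9 -> 10 [label="<+Pl:0> "];
}
'''

# Node pattern from the answer: captures the shape and the node number.
NODE_RE = re.compile(r"node.*?=(\w+).*?\s(\d+)")
# Tightened edge pattern (my variant): source, target, label.
EDGE_RE = re.compile(r'(\d+)\s*->\s*(\d+)\s*\[label="([^"]*)"\]')


def parse_dot(text):
    """Return ({node: shape}, [(source, target, label), ...])."""
    nodes = {int(num): shape for shape, num in NODE_RE.findall(text)}
    edges = [
        (int(src), int(dst), label.strip())  # drop the trailing space in labels
        for src, dst, label in EDGE_RE.findall(text)
    ]
    return nodes, edges


nodes, edges = parse_dot(DOT_TEXT)
# Assumption: the doublecircle node marks the accepting state.
final_states = {n for n, shape in nodes.items() if shape == "doublecircle"}

print(len(nodes), len(edges))  # 11 13
print(final_states)            # {10}
print(edges[0])                # (0, 4, 'g')
```

This only works for files laid out like the sample. Anything that uses other DOT features (quoted node names, subgraphs, attributes in a different order) needs a real parser.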

0

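Use this to load a .dot file in Python:

    import tempfile

    import pydot
    from PIL import Image  # Pillow; the bare "import Image" is the legacy PIL spelling

    # apath is the path to the .dot file.
    # Recent pydot versions return a list of graphs, so take the first one.
    graph = pydot.graph_from_dot_file(apath)[0]

    # Show as an image
    fout = tempfile.NamedTemporaryFile(suffix=".png")
    graph.write(fout.name, format="png")
    Image.open(fout.name).show()
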
0

Source: https://habr.com/ru/post/981955/

