Is there any Python implementation of the logstash grok function?

Logstash grok is a string-parsing tool built on top of regular expressions. It provides many predefined patterns that simplify working with strings, and I fell in love with it the first time I used it. Unfortunately, it is written in Ruby, so I cannot use it in my Python projects. Is there a Python implementation of grok, or an alternative in Python that simplifies string parsing the way grok does?

+4
2 answers

I don't know of any Python grok ports, but the functionality seems pretty simple to implement:

import re

# Map grok type names to the regular expressions they stand for.
types = {
    'WORD': r'\w+',
    'NUMBER': r'\d+',
    # todo: extend me
}


def compile(pat):
    # Turn each %{TYPE:name} into a named group (?P<name>regex).
    return re.sub(r'%\{(\w+):(\w+)\}',
        lambda m: "(?P<" + m.group(2) + ">" + types[m.group(1)] + ")", pat)


rr = compile("%{WORD:method} %{NUMBER:bytes} %{NUMBER:duration}")

print(re.search(rr, "hello 123 456").groupdict())
# {'method': 'hello', 'bytes': '123', 'duration': '456'}
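
The snippet above handles only flat `%{TYPE:name}` references. In grok, patterns can also be defined in terms of other patterns and referenced without a field name. Here is a rough sketch of how the same idea might be extended. The `PATTERNS` table, the `expand` helper and the simplified regexes are assumptions made for illustration; they are not Logstash's actual definitions.

```python
import re

# Hypothetical pattern table: the names mimic grok's style, but the
# regexes are simplified assumptions, not Logstash's definitions.
PATTERNS = {
    'WORD': r'\w+',
    'INT': r'\d+',
    'IPV4': r'\d{1,3}(?:\.\d{1,3}){3}',
    'HOSTPORT': r'%{IPV4}:%{INT}',  # defined in terms of other patterns
}

# Matches %{TYPE} as well as %{TYPE:name}.
REF = re.compile(r'%\{(\w+)(?::(\w+))?\}')


def expand(pat, depth=0):
    """Expand %{TYPE} and %{TYPE:name} references into a plain regex."""
    if depth > 10:
        raise ValueError('pattern nesting too deep (possible cycle)')

    def repl(m):
        kind, name = m.group(1), m.group(2)
        if kind not in PATTERNS:
            raise KeyError('unknown pattern: ' + kind)
        # Recurse so that nested references are expanded as well.
        inner = expand(PATTERNS[kind], depth + 1)
        if name:
            return '(?P<%s>%s)' % (name, inner)
        return '(?:%s)' % inner

    return REF.sub(repl, pat)


regex = expand('%{WORD:verb} %{HOSTPORT:peer} %{INT:bytes}')
result = re.search(regex, 'GET 10.0.0.1:8080 512').groupdict()
print(result)
# {'verb': 'GET', 'peer': '10.0.0.1:8080', 'bytes': '512'}
```

Unnamed references become non-capturing groups, so only the fields you name show up in the result.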
+4

I built a project on GitHub called pygrok, based on @georg's answer, to cover the need for parsing log patterns in Python code. I think pygrok might be useful to you, so let me briefly introduce it:

pygrok

A Python library for parsing strings and extracting information from structured/unstructured data.

What can I use grok for?

  • parsing and matching patterns in a string (log, message, etc.)
  • avoiding complex hand-written regular expressions
  • extracting information from structured/unstructured data
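
To make the list above concrete, here is a standard-library-only sketch of that kind of usage: compile a pattern once, match lines against it, and get typed fields back. This is not pygrok's code or API. The `MiniGrok` class, the `PATTERNS` and `CONVERTERS` tables, and the `%{TYPE:name:cast}` suffix are hypothetical names chosen for this example.

```python
import re

# Simplified, assumed pattern definitions (not pygrok's own).
PATTERNS = {
    'WORD': r'\w+',
    'NUMBER': r'\d+(?:\.\d+)?',
    'IP': r'\d{1,3}(?:\.\d{1,3}){3}',
}
CONVERTERS = {'int': int, 'float': float}


class MiniGrok:
    """Hypothetical helper: compile a pattern once, match many lines."""

    def __init__(self, pattern):
        self.casts = {}  # field name -> conversion function

        def repl(m):
            kind, name, cast = m.group(1), m.group(2), m.group(3)
            if cast:
                self.casts[name] = CONVERTERS[cast]
            return '(?P<%s>%s)' % (name, PATTERNS[kind])

        self.regex = re.compile(
            re.sub(r'%\{(\w+):(\w+)(?::(\w+))?\}', repl, pattern))

    def match(self, text):
        m = self.regex.search(text)
        if m is None:
            return None  # the line does not fit the pattern
        # Fields without a cast stay as strings.
        return {k: self.casts.get(k, str)(v)
                for k, v in m.groupdict().items()}


grok = MiniGrok('%{IP:client} %{WORD:method} %{NUMBER:status:int} '
                '%{NUMBER:seconds:float}')
parsed = grok.match('192.168.0.7 GET 200 0.25')
missed = grok.match('not a log line')
print(parsed)
# {'client': '192.168.0.7', 'method': 'GET', 'status': 200, 'seconds': 0.25}
```

Returning `None` for lines that do not match lets the caller skip noise in a log file without catching exceptions.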

You can find pygrok on GitHub.

+5

Source: https://habr.com/ru/post/1536556/