Calculating similarities between sentences

I have a database with thousands of lines of error logs and their descriptions. The error log belongs to an application that runs 24 hours a day. I want to build a dashboard that shows the most common current errors, to help support production.

The problem I am facing is that although many errors are common, their descriptions differ by transaction id, user id, or some other value that is unique to a single occurrence.

For example:

1. XYz transaction error for user 233
2. XYz transaction error for user 567

I consider these two errors to be the same. So I want a program that goes through new error logs and classifies them into groups. I tried using "edit distance", but it is very slow. Since I have the old error logs, I am trying to think of solutions that make use of that information. Any thoughts?

+3
2 answers

I assume that error messages are generated by the program, and therefore they probably fall into a very specific pattern.

Such messages can be matched with a regular expression, for example "([A-Z]*) ([0-9]*)", with one group for the fixed text and one group for the variable number.
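A minimal sketch of pattern-based grouping in Python, assuming the variable parts of a message are runs of digits (user ids, transaction ids). The `<NUM>` placeholder and the names `template_of` and `group_errors` are illustrative and do not come from the original answer.

```python
import re
from collections import defaultdict

# Assumption: the parts that vary between otherwise identical errors are digit runs.
NUMBER = re.compile(r"\d+")


def template_of(message):
    """Replace every digit run with a placeholder so that messages
    differing only by an id collapse to the same template."""
    return NUMBER.sub("<NUM>", message.strip())


def group_errors(messages):
    """Group raw messages under their shared template."""
    groups = defaultdict(list)
    for message in messages:
        groups[template_of(message)].append(message)
    return dict(groups)


logs = [
    "XYz transaction error for user 233",
    "XYz transaction error for user 567",
    "Timeout while contacting server 12",
]
groups = group_errors(logs)

# Print the number of occurrences next to each template.
for template, members in groups.items():
    print(len(members), template)
```

Each message is processed once, so the cost grows linearly with the number of log lines, unlike pairwise edit-distance comparisons.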

+1

( ), . ?

SimMetrics is a F/OSS library of string similarity metrics.
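SimMetrics is a Java library, so the following is only a rough Python sketch of the same idea: grouping messages by a similarity score. It uses the standard library's `difflib.SequenceMatcher` as a stand-in and is not the SimMetrics API. The `cluster` function and the 0.8 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a score in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def cluster(messages, threshold=0.8):
    """Greedy clustering: put each message in the first cluster whose
    representative (its first member) is similar enough, otherwise
    start a new cluster. The threshold is an arbitrary assumption."""
    clusters = []
    for message in messages:
        for members in clusters:
            if similarity(members[0], message) >= threshold:
                members.append(message)
                break
        else:
            clusters.append([message])
    return clusters


logs = [
    "XYz transaction error for user 233",
    "XYz transaction error for user 567",
    "Disk quota exceeded on volume backup",
]
clusters = cluster(logs)
for members in clusters:
    print(len(members), members[0])
```

Comparing each new message only against one representative per cluster keeps the number of comparisons proportional to the number of clusters rather than the number of log lines.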

+1

Source: https://habr.com/ru/post/1782329/

