Calculating similarities between sentences

Question


I have a database with thousands of lines of error logs and their descriptions. The error log is for an application that runs 24 hours a day. I want to create a dashboard / user interface that shows the most common current errors, to support production.

The problem I am facing is that although many errors are common, their descriptions differ by transaction id, user id, or other values that are unique to a single occurrence.

For example:
1. XYz transaction error for user 233
2. XYz transaction error for user 567

I consider these two errors to be the same. So I want a program that goes through new error logs and classifies them into groups. I am trying to use "edit distance", but it is very slow. Since I have old error logs, I am trying to think of solutions that use this information. Any thoughts?


algorithm edit distance similarity

codecreator Dec 27 '10 at 17:48


2 answers

John Doty · Answer 1 · 2010-12-27T18:04:33+0000

I assume that error messages are generated by the program, and therefore they probably fall into a very specific pattern.

For example: "([A-Z] *) ([0-9] *)".
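If the messages really do follow fixed patterns, one way to group them is to replace the variable parts with a placeholder and count the resulting templates. The following is a minimal sketch, not the answerer's own code: it assumes that every run of digits is a variable field (user id, transaction id), and the names `template` and `group_counts` are hypothetical.

```python
import re
from collections import Counter

# Assumption: any run of digits is a variable field such as a user id.
NUMBER = re.compile(r"\d+")


def template(message: str) -> str:
    """Replace every run of digits with the placeholder <N>."""
    return NUMBER.sub("<N>", message)


def group_counts(messages):
    """Count how many messages share each template."""
    return Counter(template(m) for m in messages)


logs = [
    "XYz transaction error for user 233",
    "XYz transaction error for user 567",
]
counts = group_counts(logs)
# Both lines collapse to "XYz transaction error for user <N>".
```

Each message is processed once, so this avoids comparing every pair of log lines the way an edit-distance approach does. The old logs can be used to check whether the placeholder rule is too coarse or too fine.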

Mikos · Answer 2 · 2010-12-27T18:11:39+0000

SimMetrics is a F/OSS library of string similarity metrics.
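SimMetrics itself is a Java library, so the sketch below does not use it. As a rough stand-in, it uses Python's standard `difflib.SequenceMatcher` on whitespace-separated tokens, which is a different measure from edit distance. The function names and the threshold of 0.8 are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1] computed over whitespace-separated tokens."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def assign_group(message, representatives, threshold=0.8):
    """Return the index of the first representative that is similar enough.

    If none matches, the message becomes a new representative.
    """
    for i, rep in enumerate(representatives):
        if similarity(message, rep) >= threshold:
            return i
    representatives.append(message)
    return len(representatives) - 1


reps = []
first = assign_group("XYz transaction error for user 233", reps)
second = assign_group("XYz transaction error for user 567", reps)
other = assign_group("Disk full on volume 7", reps)
```

Comparing each new message only against one representative per group keeps the number of comparisons proportional to the number of groups rather than to the number of log lines.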
