Search in large text files

Let's say you have a game server that writes text log files of players' actions, and from time to time you need to search these log files (for example, for a fraud investigation or a lost item). Say you have 100 files, each between 20 MB and 50 MB in size. How would you search them quickly?

What I have already tried: create several threads, where each thread memory-maps its own file (memory should not be a problem as long as the total stays under 500 MB of RAM) and runs the search over it. The result was roughly 1 second per file:

File: a26.log - in the program: 0.891, lines: 625282, matches: 78848

Is there a better way to do this? It seems slow to me. Thanks.

(Java was used for this.)
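For reference, here is a minimal sketch of the kind of setup described above: a fixed thread pool where each worker memory-maps one log file and counts occurrences of a fixed ASCII pattern. The class name, the command-line arguments and the brute-force byte scan are illustrative assumptions, not the original code.

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.*;
    import java.util.List;
    import java.util.concurrent.*;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public class ParallelLogSearch {

        // Count occurrences of a fixed ASCII needle in one memory-mapped file.
        static long countMatches(Path file, byte[] needle) throws IOException {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
                long matches = 0;
                for (int i = 0; i + needle.length <= buf.limit(); i++) {
                    int j = 0;
                    while (j < needle.length && buf.get(i + j) == needle[j]) j++;
                    if (j == needle.length) matches++;
                }
                return matches;
            }
        }

        public static void main(String[] args) throws Exception {
            Path dir = Paths.get(args[0]);                                // directory with the .log files
            byte[] needle = args[1].getBytes(StandardCharsets.US_ASCII);  // fixed search string

            List<Path> files;
            try (Stream<Path> s = Files.list(dir)) {
                files = s.filter(p -> p.toString().endsWith(".log")).collect(Collectors.toList());
            }

            // One task per file, so each file reports its own timing.
            ExecutorService pool = Executors.newFixedThreadPool(
                    Runtime.getRuntime().availableProcessors());
            List<Future<String>> results = files.stream()
                    .map(f -> pool.submit(() -> {
                        long t0 = System.nanoTime();
                        long hits = countMatches(f, needle);
                        double secs = (System.nanoTime() - t0) / 1e9;
                        return String.format("File: %s - %.3f s, matches: %d",
                                f.getFileName(), secs, hits);
                    }))
                    .collect(Collectors.toList());

            for (Future<String> r : results) System.out.println(r.get());
            pool.shutdown();
        }
    }

Running one task per file keeps the per-file timing visible, which matches the log line quoted above.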

+3
5 answers

Tim Bray studied Apache log file processing approaches here: http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder

It looks like your situation has a lot in common with that one.

+2

You could use a combination of the Unix find and grep commands.
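As an illustration only (not part of the original answer), a Java process can also delegate the scan to grep via ProcessBuilder. The class and method names below are made up, and this assumes grep is available on the PATH.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class GrepRunner {

        // Runs "grep -c -- <pattern> <file>" and returns the match count that grep prints.
        static long grepCount(String pattern, String file) throws IOException, InterruptedException {
            Process p = new ProcessBuilder("grep", "-c", "--", pattern, file)
                    .redirectErrorStream(true)   // merge stderr into stdout for simplicity
                    .start();
            String line;
            try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
                line = r.readLine();             // grep -c prints a single number
            }
            p.waitFor();
            return line == null ? 0 : Long.parseLong(line.trim());
        }

        public static void main(String[] args) throws Exception {
            System.out.println(grepCount(args[0], args[1]));
        }
    }

On the command line itself, find can be used to select the *.log files and hand them to grep in one pass.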

+1

The first question is how often you will need to do this. If you need to search the logs regularly, it is worth indexing them with a search engine library such as Lucene (or Solr, which wraps Lucene behind an XML/HTTP interface).

If, on the other hand, these are rare, ad-hoc searches, building an index is probably overkill.

In that case a simple scan is fine, and grep will do the job.
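As a rough sketch of what the indexing route could look like (assuming a reasonably recent Lucene release; the field name "line", the index directory and the one-document-per-line layout are illustrative choices, not something from the original answer):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class LogIndexer {
        public static void main(String[] args) throws Exception {
            Path indexDir = Paths.get("log-index");   // where the index is stored
            Path logFile  = Paths.get(args[0]);       // a log file to index
            String queryText = args[1];               // e.g. "item loss player42"

            StandardAnalyzer analyzer = new StandardAnalyzer();

            // Index: one Lucene document per log line.
            IndexWriterConfig cfg = new IndexWriterConfig(analyzer);
            cfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE);  // rebuild the index on each run (sketch)
            try (FSDirectory dir = FSDirectory.open(indexDir);
                 IndexWriter writer = new IndexWriter(dir, cfg)) {
                for (String line : Files.readAllLines(logFile)) {
                    Document doc = new Document();
                    doc.add(new TextField("line", line, Field.Store.YES));
                    writer.addDocument(doc);
                }
            }

            // Search the index instead of rescanning the raw files.
            try (FSDirectory dir = FSDirectory.open(indexDir);
                 DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs hits = searcher.search(new QueryParser("line", analyzer).parse(queryText), 10);
                for (ScoreDoc sd : hits.scoreDocs) {
                    System.out.println(searcher.doc(sd.doc).get("line"));
                }
            }
        }
    }

The trade-off is the up-front indexing time and disk space, which only pays off if searches are frequent; that is exactly the point of this answer.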

0

For ad-hoc searches, the UNIX grep family (grep, fgrep and egrep) is the usual choice and is hard to beat for raw scanning speed.

On the other hand, the ultimate bottleneck when searching text files that have not been indexed beforehand is the rate at which the application plus the operating system can move data from the file on disk into memory. You seem to be managing 20 megabytes per second or more (for example, a 20 MB file scanned in 0.891 s is roughly 22 MB/s), which seems fast enough... to me, anyway.

0

I should probably have mentioned in the original post that the game server runs on 64-bit Windows - so I wonder whether grep for Windows performs at the same level as grep on Unix?

0

Source: https://habr.com/ru/post/1746172/

