Find the most frequently occurring line in a log file

I want to find the most frequently occurring line in a huge log file. Can someone help me with how to do this? One way is to hash each line, count the occurrences, and take the maximum, but that seems inefficient. Are there any better ways to do this?

Thanks and regards,

mouse.

+3
9 answers

If performance is critical, you can look at a trie or a radix tree.


, - 50% ( ), - (., ):

  • , , 1;
  • ,
  • 0,
  • 2 , .
  • 0, , , .

    , .

. , ACM , .

+3

, unix- - :

sort logfile.txt | uniq -c
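The pipeline above sorts the whole file just to bring identical lines together. A single pass with a hash table gives the same counts without the sort. Here is a minimal Python sketch using `collections.Counter`; the function name `top_lines` is hypothetical, and it assumes the set of distinct lines fits in memory.

```python
from collections import Counter


def top_lines(lines, n=1):
    """Return the n most common lines as (line, count) pairs."""
    # Counter hashes each line once; memory grows with the number of distinct lines.
    counts = Counter(line.rstrip("\n") for line in lines)
    return counts.most_common(n)
```

Passing an open file object, for example the `logfile.txt` from the command above, streams the file line by line rather than loading it all at once.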

, - - , , .

, C ++ "", , , , , : -)

+4

( , ), ( , , , , 8 - ASCII) ?

?

+3

""? ""? Unix :

tr -s ' \011' '\012' < /var/log/messages | sort | uniq -c | sort -rn | head -20

    786 --
    635 labrador
    635 Jun
    393 MARK
    236 kernel:
    163 17
    153 usb
    136 22
    118 21
    113 USB
     74 device
     73 20
     73 19
     72 18
     57 5-1:
     51 address
     43 speed
     36 New
     34 0
     33 using

, C, .

+3

, , ( - ), , "" . , , .

stl- . , , , , . stl multiset. .

+2

, - , .

, , , , "the", "a". .

+1

"string" "", , , , .

0

- , .. , , SOL

, - , . - .

- , . - .

- .

0

, . perl . perl . perl .

0

Source: https://habr.com/ru/post/1751307/

