Searching a large file for data in C/C++

I have a log file that has this format:

DATE-TIME ### attribute1 ### attribute2 ### attribute3 

I need to search this log file for an input attribute (entered on the command line) and output the lines that contain the entered attribute.
A naive approach might be something like this:

scan the entire file line by line
search for the attribute
print if found, else ignore.

This approach is slow because it requires O(n) comparisons, where n is the number of lines, which can be very large.
Using a hash table might be another approach, but keeping such a hash table in memory for a large file may not be possible.
So what is the best possible solution? How can I index the entire file by various attributes?

EDIT:
A log file can be about 100K lines, much like the syslog files on Linux. Within a single invocation, the user can search for several attributes, which are not known until the search for the first attribute has completed, as in an interactive console.

Thanks,

+3

-, . , , , , - . , .

, - , .

, , , , , , .

+2

, , O (n).

- , - , dbm gdbm.

EDIT. , Berkeley DB, KeithB, . Berkeley DB , SQL.

+4

Berkley DB . , , . Berkley DB B-Tree , , . .

+3

.

?

DB . , , .

Map-Reduce, .

+2

. , , . .

, . . , , .

. . - . , . - [] [ ] [ ] ..

, , , , . - , , "|".

, , . . . ( ) .

+2

, , , - .

, , .

boost:: regex QRegExp.

, , -.

0

, b-tree -. . , . , , , .

, , 600k + (100M), grep . . , , 100k , .

, -, .

0

, , . , . / parallelism .

. , , , .

0

. .

. , 100K , , /var/log/messages , . , , , - grep /var/log/ , , .

, , - , , , - , . , , , !

0

, , . ?

Log files change all the time, or at least every day. So what you probably want is some kind of log file rotation, for which many ready-made tools exist; failing that, one can be built in a few hours if you know even a little Perl. You probably don't need C++ for this either. It will just make development slower and the end result more complex.

0

Source: https://habr.com/ru/post/1731608/

