Delphi: search string in TStringList

Question

Delphi: search string in TStringList

I have a big txt (ipaddress.txt) with a lot of lines, each line is an IP address, for example:

44.XXX.XXX.XXX 45.XXX.XXX.XXX 46.XXX.XXX.XXX 47.XXX.XXX.XXX 48.XXX.XXX.XXX

i load this list into a TStringList , for example:

 FIpAddressList := TStringList.Create(); FIpAddressList.Sorted := true; FIpAddressList.LoadFromFile('ipaddress.txt');

and I want to check if there is an IP address in the list, for example:

 function IsIPinList(const IPAddress : string): Boolean; begin Result := (FIpAddressList.IndexOf(IPAddress) <> -1); end;

it works ... but slowly with a huge TStringList .

Is there a way to speed up this process?

UPDATE

Static list with monthly updates, with fewer or more than 5'000 lines.
The function is called with every request on the server (for example, 5 times per second).
~~The list is only loaded when the server service starts.~~

+5

delphi

ar099968 Dec 16 '16 at 9:03

source share

1 answer

David heffernan · Accepted Answer · 2016-12-16T09:07:59+0000

One way to do this faster is to arrange the list sorting so that the search can be performed using the binary search O (log n) rather than the linear search O (n).

If the list is static, it is best to sort the list outside of your program so that you can sort it exactly once.

If the list is more dynamic, you will have to sort it and save any changes. In this case, sorting the list will only be beneficial if the number of queries you have completed is sufficient to overcome the additional costs of sorting and maintaining order.

Another approach is to use a dictionary containing your elements. Usually this will include hashing each line. Although the search complexity is O (n), the cost of hashing can be significant.

Another way to speed things up is to convert IP address strings to 32-bit integers. In fact, this is bound to bring tremendous performance benefits, and you should do it regardless of any other problems. It is always faster and more understandable to work with a 32-bit integer than the string representation of an IP address.

These are just some of the possible approaches, there are more. The choice to choose depends on the use of compromises.

Although you're probably just looking for some kind of code to solve your problem, I find this problem more algorithmic than code-based. Before choosing an algorithm, you need to better understand the requirements, data size limits, etc. Once you have decided on an algorithm or narrowed your selection to a small number for comparison, the implementation should be simple.

Another possibility is that you misidentified your problem. Even a linear search through a list of 5000 IP addresses stored as strings should be faster than you are reporting:

My computer can search such a list 2000 times per second using linear search.
Once you sort the list and switch to binary search, it will be able to manage 1,000,000 queries per second.
Switch to storing an ordered array of integers and it will execute over 10,000,000 searches per second.
Using a hash dictionary of integers, performance again increases faster.

Thus, the performance of your search can be easily improved 20,000 times, I still do not think that your performance problems come down to what you think. Interestingly, your real problem is that you read the file from disk every time you search.

Delphi: search string in TStringList

More articles: