This approach has some scalability problems (if you decide to switch to, say, city-specific geo-information), but for a given data size, it will provide significant optimization.
The problem you are facing is that MySQL does not really optimize range-based queries. Ideally, you want to do an exact ("=") search by index, not "more than", so we will need to create such an index from the data you have. Thus, MySQL will have much fewer rows to evaluate when matching.
To do this, I suggest creating a lookup table that indexes the geolocation table based on the first octet (= 1 of 1.2.3.4) of the IP addresses. The idea is that for every search you have to do, you can ignore all geolocation IP addresses that don't start with the same octet as the IP address you are looking for.
CREATE TABLE `ip_geolocation_lookup` ( `first_octet` int(10) unsigned NOT NULL DEFAULT '0', `ip_numeric_start` int(10) unsigned NOT NULL DEFAULT '0', `ip_numeric_end` int(10) unsigned NOT NULL DEFAULT '0', KEY `first_octet` (`first_octet`,`ip_numeric_start`,`ip_numeric_end`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Then we need to take the data available in your geolocation table and create data that covers all the (first) octets that the geolocation line covers: if you have an entry with ip_start = '5.3.0.0' and ip_end = '8.16.0.0' , the lookup table will need rows for octets 5, 6, 7, and 8. So ...
ip_geolocation |ip_start |ip_end |ip_numeric_start|ip_numeric_end| |72.255.119.248 |74.3.127.255 |1224701944 |1241743359 |
Must be converted to:
ip_geolocation_lookup |first_octet|ip_numeric_start|ip_numeric_end| |72 |1224701944 |1241743359 | |73 |1224701944 |1241743359 | |74 |1224701944 |1241743359 |
Since someone here requested their own MySQL solution, a procedure is stored here that will generate this data for you:
DROP PROCEDURE IF EXISTS recalculate_ip_geolocation_lookup; CREATE PROCEDURE recalculate_ip_geolocation_lookup() BEGIN DECLARE i INT DEFAULT 0; DELETE FROM ip_geolocation_lookup; WHILE i < 256 DO INSERT INTO ip_geolocation_lookup (first_octet, ip_numeric_start, ip_numeric_end) SELECT i, ip_numeric_start, ip_numeric_end FROM ip_geolocation WHERE ( ip_numeric_start & 0xFF000000 ) >> 24 <= i AND ( ip_numeric_end & 0xFF000000 ) >> 24 >= i; SET i = i + 1; END WHILE; END;
And then you will need to populate the table by calling this stored procedure:
CALL recalculate_ip_geolocation_lookup();
At this point, you can delete the procedure you just created - it is no longer needed if you do not want to recount the lookup table.
Once the lookup table is in place, all you have to do is integrate it into your queries and make sure you query for the first octet. Your query to the lookup table will satisfy two conditions:
- Find all strings matching the first octet of your IP address
- From this subset: find a string that has a range that matches your IP address.
Since the second step is performed on a subset of the data, it is significantly faster than performing range tests on all the data. This is the key to this optimization strategy.
There are various ways to find out what the first octet of an IP address is; I used ( r.ip_numeric & 0xFF000000 ) >> 24 , since my source IP addresses are in numerical form:
SELECT r.*, g.country_code FROM ip_geolocation g, ip_geolocation_lookup l, ip_random r WHERE l.first_octet = ( r.ip_numeric & 0xFF000000 ) >> 24 AND l.ip_numeric_start <= r.ip_numeric AND l.ip_numeric_end >= r.ip_numeric AND g.ip_numeric_start = l.ip_numeric_start;
Now, admittedly, in the end I will get a little: you can easily get rid of the ip_geolocation table if you made the ip_geolocation_lookup table also containing country data. I assume that deleting one table from this query will make it a little faster.
And finally, here are two other tables that I used in this answer for reference, as they are different from your tables. I am sure they are compatible.
# This table contains the original geolocation data CREATE TABLE `ip_geolocation` ( `ip_start` varchar(16) NOT NULL DEFAULT '', `ip_end` varchar(16) NOT NULL DEFAULT '', `ip_numeric_start` int(10) unsigned NOT NULL DEFAULT '0', `ip_numeric_end` int(10) unsigned NOT NULL DEFAULT '0', `country_code` varchar(3) NOT NULL DEFAULT '', `country_name` varchar(64) NOT NULL DEFAULT '', PRIMARY KEY (`ip_numeric_start`), KEY `country_code` (`country_code`), KEY `ip_start` (`ip_start`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;