Given that the number of digits in the extension can differ for each company, and the number of digits in the base number can differ for each country and area code, this is not an easy task to do efficiently.
Even if you split the database table into a base number and an extension, you still have to split the incoming number into a base number and an extension, and that is the hard part.
Here is what I would try:
Original format
- Try to match the incoming number against the database.
- If it matches one entry, you have your answer - a specific person.
- If it matches more than one entry, something is wrong with the data, so the lookup fails.
- Otherwise, you need to find the company:
  - Remove the last digit from the incoming number and try to match against the database again.
  - If the number of digits falls below a threshold (probably 6 digits), give up: the search is unlikely to succeed. This is just to limit the number of database queries when the number is not found.
  - If it matches no entries, repeat this step.
  - If it matches more than one entry, something is wrong with the data, so the lookup fails.
  - If it matches one entry, you have the next best answer - the company.
For example, searching for "+43123456777":
- +43123456777 matches 0 records.
- +4312345677 matches 0 records.
- +431234567 matches 1 record: "Company A"
The main way this approach fails is when a company has variable-length extensions. For example, consider what happens if both 431234567890 and 43123456789 are valid numbers, but only the second is in the database. If the incoming number is 431234567890, it will be erroneously matched to 43123456789.
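The steps above can be sketched roughly as follows. This is only an illustration: the function name `lookup_original`, the in-memory dict standing in for the database, and the `min_digits` parameter are all assumptions of mine, and each `db.get` stands for one database query.

```python
def lookup_original(db, incoming, min_digits=6):
    """Trim digits from the end of `incoming` until exactly one entry matches.

    `db` is a hypothetical stand-in for the database: it maps a stored
    number (str) to the list of record names stored under that number.
    Returns (kind, name), where kind is "exact", "prefix", "error" or
    "not found".
    """
    candidate = incoming
    kind = "exact"  # the first attempt uses the full incoming number
    # Stop once too few digits are left for a match to be meaningful.
    while sum(ch.isdigit() for ch in candidate) >= min_digits:
        matches = db.get(candidate, [])  # one "query" per attempt
        if len(matches) > 1:
            return ("error", None)  # ambiguous data, give up
        if len(matches) == 1:
            return (kind, matches[0])
        candidate = candidate[:-1]  # drop the last digit and retry
        kind = "prefix"
    return ("not found", None)


db = {
    "+431234567": ["Company A"],
    "+431234567890": ["Employee in Company A"],
}
print(lookup_original(db, "+43123456777"))
print(lookup_original(db, "+431234567890"))
```

An "exact" result corresponds to the specific person, a "prefix" result to the next best answer, the company. The sketch has the same weakness as the approach itself: a longer valid number that is missing from the database is silently matched to a shorter stored one.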
Split format
It is a bit more complicated, but more reliable.
- Try to match the incoming number against the database.
- If it matches one record, you have your answer - the company.
- If it matches more than one record, pick the record without an extension and you have the company.
- Otherwise, you need to find the company's base number and the extension:
  - Remove the last digit from the incoming number and try to match against the database again.
  - If the number of digits falls below a threshold (probably 6 digits), give up: the search is unlikely to succeed. This is just to limit the number of database queries when the number is not found.
  - If it matches no entries, repeat this step.
  - If it matches one entry, you have found your answer - the company.
  - If it matches more than one record, you have found the company's base number and therefore now know the extension, so you can try to find a specific person:
    - Strip the base number from the beginning of the original incoming number and use the remainder to search the extensions of the records with this base number.
    - If it matches one entry, you have found a specific person.
    - If it matches no specific person, pick the record without an extension and you have the company.
For example, the search "+43123456777":
- +43123456777 matches 0 records.
- +4312345677 matches 0 records.
- +431234567 matches 2 records: "empty: Company A" and "890: employee in company A"
- Among these two matches, the extension "77" matches nothing, so return the record with the empty extension: "Company A".
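A minimal sketch of the split-format lookup, under the assumption that each record is stored as a (base number, extension, name) tuple and that an empty extension marks the company record. The name `lookup_split`, the list standing in for the database table, and `min_digits` are hypothetical.

```python
def lookup_split(records, incoming, min_digits=6):
    """Find the base number by trimming digits, then resolve the extension.

    `records` is a hypothetical stand-in for the database table: a list of
    (base_number, extension, name) tuples, where extension "" marks the
    company's own record.
    """
    base = incoming
    while sum(ch.isdigit() for ch in base) >= min_digits:
        matches = [r for r in records if r[0] == base]  # one "query"
        if matches:
            # Whatever follows the base number is the extension.
            extension = incoming[len(base):]
            people = [r for r in matches if extension and r[1] == extension]
            if len(people) == 1:
                return ("person", people[0][2])
            # No such extension: fall back to the company record.
            company = [r for r in matches if r[1] == ""]
            if len(company) == 1:
                return ("company", company[0][2])
            return ("error", None)
        base = base[:-1]  # drop the last digit and retry
    return ("not found", None)


records = [
    ("+431234567", "", "Company A"),
    ("+431234567", "890", "Employee in Company A"),
]
print(lookup_split(records, "+43123456777"))
print(lookup_split(records, "+431234567890"))
```

With these two records, the unknown extension "77" falls back to the company, while "890" resolves to the employee.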
Implementation Notes
As noted above, this algorithm has some performance issues. If every attempt is a separate database query, the cost is linear in the length of the phone number, especially when the database contains no similar numbers (for example, if the incoming number is from Kazakhstan, but there are no Kazakhstan numbers in the database).
You can add some optimizations relatively easily. If most of the companies you deal with use 3- or 4-digit extensions, you can start by removing, say, 4 digits from the end, and then do a binary search until you get a match. For a 15-digit number, this reduces the number of queries to 4 or 5 in many cases, and to no more than about 6.
In addition, each time you narrow the selection, you can search only within the previous result set rather than the entire database.
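One caveat on the binary search: exact matching is not monotone in the number of digits removed, so a plain binary search over exact matches can skip the right answer. A rough sketch of one variant that is well-defined, assuming the stored numbers are available as a sorted list: binary-search for the longest prefix of the incoming number that any stored number starts with (that condition is monotone), then run the digit-by-digit exact matching only from that length downward. The names `has_prefix` and `longest_shared_prefix_len`, the sorted in-memory list, and the threshold of 6 characters (counting the leading "+") are assumptions for illustration.

```python
from bisect import bisect_left


def has_prefix(sorted_numbers, prefix):
    """True if any stored number starts with `prefix` (one "query")."""
    i = bisect_left(sorted_numbers, prefix)
    return i < len(sorted_numbers) and sorted_numbers[i].startswith(prefix)


def longest_shared_prefix_len(sorted_numbers, incoming, min_len=6):
    """Largest k >= min_len such that some stored number starts with
    incoming[:k], or 0 if even the shortest allowed prefix matches nothing."""
    if not has_prefix(sorted_numbers, incoming[:min_len]):
        return 0  # e.g. a country with no entries at all: stop immediately
    lo, hi = min_len, len(incoming)  # invariant: has_prefix holds at lo
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_prefix(sorted_numbers, incoming[:mid]):
            lo = mid
        else:
            hi = mid - 1
    return lo


numbers = sorted(["+431234567", "+4319999999"])
print(longest_shared_prefix_len(numbers, "+43123456777"))
print(longest_shared_prefix_len(numbers, "+77011234567"))
```

No exact match can be longer than the returned length, so the trimming loop can start there. For a number with no similar entries (the Kazakhstan case) the function returns 0 after a single check.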
Additional Implementation Notes
Now that I finally understand how the Non-Responsive Answer works, I see that it is a much simpler and more elegant solution. I wish I had thought of simply matching the database numbers against the incoming number, rather than the other way around.
My only concern is that doing this for every telephone number in the database could put excessive load on the server. I would suggest benchmarking that solution under maximum load to see whether it causes problems. If not, use it. If so, consider using the simple form of my algorithm and repeating the stress tests. If performance is still too slow, try my binary search suggestion.
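For reference, here is my reading of that simpler approach as a small sketch using Python's built-in `sqlite3` module: select the stored numbers that are a prefix of the incoming number and keep the longest one. The table name `phones`, its columns, and the function name `find_longest_prefix` are assumptions for illustration, not taken from the other answer.

```python
import sqlite3


def find_longest_prefix(conn, incoming):
    """Return (number, name) for the longest stored number that is a
    prefix of `incoming`, or None if no stored number matches."""
    return conn.execute(
        "SELECT number, name FROM phones "
        "WHERE ? LIKE number || '%' "  # stored number is a prefix of incoming
        "ORDER BY LENGTH(number) DESC LIMIT 1",
        (incoming,),
    ).fetchone()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phones (number TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO phones VALUES (?, ?)",
    [
        ("+431234567", "Company A"),
        ("+431234567890", "Employee in Company A"),
    ],
)
print(find_longest_prefix(conn, "+43123456777"))
print(find_longest_prefix(conn, "+431234567890"))
```

Because the pattern is built from the stored column, a query like this generally has to evaluate the condition for every row, which is exactly the load worth measuring in the stress test.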