I have two arrays, $arr1 and $arr2.
$arr1 is the list of columns that I expect to read from an Excel file; $arr2 is the list of columns that were actually found.
Sometimes the uploaded file has one or more of these problems:
- Misspelled column names
- Columns in a different order
- One or more missing columns
- Column names containing look-alike letters from another alphabet (for example, the Greek "Μ" (mu), which looks like the Latin "M" but is a different character and must not be treated as the same).
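To illustrate the last point, here is a minimal sketch showing why a Greek "Μ" never compares equal to a Latin "M", and one way to flag such names. The helper `hasNonAscii()` is a hypothetical name introduced only for this example, and it assumes the column names are UTF-8 strings that are expected to be pure ASCII.

```php
<?php
// Hypothetical helper (not part of the original code): returns true when
// the name contains any byte outside the 7-bit ASCII range. In UTF-8 every
// non-ASCII character is encoded with bytes >= 0x80, so this catches
// look-alike letters such as the Greek capital mu.
function hasNonAscii(string $name): bool
{
    return preg_match('/[^\x00-\x7F]/', $name) === 1;
}

$latin = 'MPN';
$greek = "\u{039C}PN"; // starts with U+039C GREEK CAPITAL LETTER MU

var_dump($latin === $greek);    // the two strings are different byte sequences
var_dump(hasNonAscii($latin));
var_dump(hasNonAscii($greek));
```

A check like this could run before any fuzzy matching, so that such a column is reported as an error instead of being silently matched.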
Suppose, for example, that we have the following 2 arrays:
    $arr1 = array(
        'Action', 'LotSize', 'QuantityMinimum', 'SupplierName', 'SPN',
        'PartNumExt', 'UOM', 'ListPrice', 'MPN', 'MFrName',
        'CatLevel1', 'CatLevel2', 'CatLevel3', 'CatLevel4', 'CatLevel5', 'CatLevel6',
        'AcctLevel1', 'AcctLevel2', 'AcctLevel3', 'AcctLevel4', 'AcctLevel5', 'AcctLevel6',
        'Desc1', 'Desc2', 'PicName', 'SupplierURL', 'CatPart', 'TechSpec', 'Kad'
    );

    $arr2 = array(
        'Action', 'LotSze', 'QuantityMinimum', 'SupplierName', 'SPN',
        'PartNumEx', 'UOM', 'ListPric', 'MPN', 'MfrName',
        'CatLevel1', 'CatLevel2', 'CatLevel3', 'CatLevel4',
        'AcctLevel1', 'AcctLevel2', 'AcctLevel3', 'AcctLevel4',
        'Desc1', 'Desc2', 'PicName', 'SupplierURL', 'CatPart'
    );
I need to compare the two arrays and save the positions of the corresponding elements in a third array, for example:

    $arr3 = ([0]=>0, [1]=>1, [2]=>3, [3]=>5, [4]=>6, [5]=>...);

Here each key is the position of an element in $arr1, and each value is the position of its match in $arr2.
By "matching" I mean elements that are identical (for example, Action), elements that are partially the same (for example, Test and Tes), and elements that differ only in case (for example, Foo and foo, or Bar and bar).
I posted this question a few days ago and got a good answer, but after several tests with a lot of data I found that it does not always work properly.
So, after more searching, I found levenshtein and built a combination that first checks for an exact match and, if none is found, tries to find the closest match. The problem now is that some columns have very similar names, for example CatLevel1, CatLevel2, ..., CatLevel6. Therefore, if CatLevel2 is missing, it gets mapped to the last of the equally similar columns, which is CatLevel6.
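The ambiguity can be reproduced with the arrays posted above: CatLevel5 is expected but absent from $arr2, and every CatLevelN that is present is exactly one substitution away from it. The sketch below only demonstrates the tie; the variable names are made up for this example, and treating a tie as "no match" is an assumption on my part, not something I have tested on real data.

```php
<?php
// CatLevel columns actually present in $arr2 from the example above.
$candidates = ['CatLevel1', 'CatLevel2', 'CatLevel3', 'CatLevel4'];

// Edit distance from the expected but missing column to each candidate.
$distances = [];
foreach ($candidates as $name) {
    $distances[$name] = levenshtein('CatLevel5', $name);
}
print_r($distances); // every candidate differs by a single character

// Collect all candidates that share the smallest distance.
$best = min($distances);
$tied = array_keys($distances, $best);

// More than one candidate at the best distance means the match is
// ambiguous; a closest-match loop that compares with `<=` keeps the
// last of them.
$isAmbiguous = count($tied) > 1;
var_dump($isAmbiguous);
```

With a tie like this, no single candidate is really "closest", so the distance alone cannot tell which column was meant.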
This is what I have so far:
    foreach ($all_columns as $i => $val1) {
        $result = null;
        // Search the second array for an exact match
        if (($found = array_search($val1, $_SESSION['found_columns'], true)) !== false) {
            $result = $found;
        } else {
            // Otherwise, see if we can find a case-insensitive matching string
            // where the element from $arr2 is found within the one from $arr1
            foreach ($_SESSION['found_columns'] as $j => $val2) {
                if ($val1 <> '' && $val2 <> '') {
                    if (stripos($val1, $val2) !== false) {
                        $result = $j;
                        break;
                    } else {
                        $notfound .= $val1 . ', ';
                        break;
                    }
                }
            }
        }
        $_SESSION['found_column_positions'][$i] = $result;
    }

    /***** ALTERNATIVE METHOD USING levenshtein *****/
    $i = 0;
    foreach ($all_columns as $key => $value) {
        $found = wordMatch($value, $arr2, 2);
        $pos = array_search($found, $_SESSION['found_columns']);
        $_SESSION['found_column_positions'][$i] = $pos;
        $i++;
    }

    function wordMatch($input, $array, $sensitivity) {
        $words = $array;
        $shortest = -1;
        foreach ($words as $word) {
            $lev = levenshtein($input, $word);
            // Exact match: stop searching
            if ($lev == 0) {
                $closest = $word;
                $shortest = 0;
                break;
            }
            // Keep this word if it is at least as close as the best so far
            if ($lev <= $shortest || $shortest < 0) {
                $closest = $word;
                $shortest = $lev;
            }
        }
        if ($shortest <= $sensitivity) {
            return $closest;
        } else {
            return 0;
        }
    }
Is there a better way to compare the two arrays, find the closest match for each element, and save the matching keys in a third array that links the two?