Optimizing a search by initials in JavaScript

I need to search by initials (not sure this is the correct term; if it is not, someone please edit the question) using JavaScript. For instance:

A search for "mas" in "Abraham Maslow" should return true, and a search for "John" in "Johnathan Smith" should also be true. However, a search for "gold" in "Marygold Ding" should be false.

My first attempt:

    function search(initial, subjectsArray) {
        var result = [];
        var tmp = null;
        var initialLowercase = initial.toLowerCase();
        for (var i = 0; i < subjectsArray.length; i++) {
            tmp = subjectsArray[i].toLowerCase();
            if (tmp.startsWith(initialLowercase) || tmp.indexOf(' ' + initialLowercase) != -1) {
                result.push(subjectsArray[i]);
            }
        }
        return result;
    }

How can I optimize this code?

+4
4 answers

Looks like you want a word-boundary match in a case-insensitive regular expression, for example:

/\bmas/i.test("Abraham Maslow") === true

/\bJohn/i.test("Johnathan Smith") === true

/\bgold/i.test("Marygold Ding") === false

\b matches at the beginning or end of a word, and the i flag at the end of the regular expression makes it case-insensitive, so that mas can match Maslow.

- update:

If your strings contain accented characters, \b will also match next to them, even though we consider them part of the word (\b only treats ASCII letters, digits and the underscore as word characters). In that case, use (^|\s) instead to match "the beginning of the string or some whitespace":

/(^|\s)c/i.test('Drácule Smith') === false

/(^|\s)dr/i.test('Drácule Smith') === true

/(^|\s)smi/i.test('Drácule Smith') === true
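To make the difference concrete, here is a small sketch comparing the two patterns on the same string. The third pattern is not from the answer: it is one possible Unicode-aware variant, assuming an engine that supports lookbehind and \p{...} property escapes with the u flag (recent Node.js and browsers do). The name "Jean-Luc Picard" is only an illustrative input.

```javascript
const name = 'Drácule Smith';

// \b is ASCII-based: "á" is not a word character, so the engine sees a
// word boundary between "á" and "c" and reports a match inside the word.
const asciiBoundary = /\bc/i.test(name);

// Start of string or whitespace: "c" is preceded by "á", so no match.
const whitespaceStart = /(^|\s)c/i.test(name);

// Assumed variant: "not preceded by a letter, digit or underscore",
// where "letter" and "digit" are the Unicode categories.
const unicodeStart = /(?<![\p{L}\p{N}_])c/iu.test(name);

// Unlike (^|\s), the lookbehind variant also accepts a word that starts
// after punctuation such as a hyphen.
const afterHyphen = /(?<![\p{L}\p{N}_])luc/iu.test('Jean-Luc Picard');

console.log(asciiBoundary, whitespaceStart, unicodeStart, afterHyphen);
```

Whether a hyphenated part such as "Luc" should count as the start of a word depends on the data, so the choice between (^|\s) and the lookbehind form is a judgment call.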

MDN regex documentation.

+3

Why don't you use RegExp instead?

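    string.search(new RegExp('\\b' + word + '\\S*', 'i')) !== -1

Edit (@user24): wrapped this in a function with the same signature as the OP's (note that it returns a boolean rather than the list of matching subjects):

    function search(initial, subjectsArray) {
        // Create the regex for the initial. Backslashes must be doubled
        // inside a string literal, otherwise '\S' collapses to a plain 'S'.
        var regex = new RegExp('\\b' + initial + '\\S*', 'i');
        // Look for any subject that contains a word starting with the initial
        for (var i = 0; i < subjectsArray.length; i++) {
            if (subjectsArray[i].search(regex) !== -1) {
                return true;
            }
        }
        return false;
    }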
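The function above stops at the first hit and returns a boolean. If the list of matching subjects is wanted, as in the question, a minimal sketch could look like the following. The names searchAll and escapeRegExp are hypothetical; the escaping step is an addition that keeps characters such as "." or "(" in the user's input from being interpreted as regex syntax.

```javascript
// Hypothetical helper: escape characters that are special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns every subject that contains a word starting with `initial`.
function searchAll(initial, subjectsArray) {
  // No "g" flag, so test() keeps no state between calls.
  const regex = new RegExp('\\b' + escapeRegExp(initial), 'i');
  return subjectsArray.filter(function (subject) {
    return regex.test(subject);
  });
}

const subjects = ['Abraham Maslow', 'Johnathan Smith', 'Marygold Ding'];
console.log(searchAll('mas', subjects));
console.log(searchAll('gold', subjects));
```

The regex is compiled once per search rather than once per subject, which is where most of the saving over repeated string operations would come from.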
+2

Can't you just match <start of input or whitespace> followed by the token?

 (/(^|\s)Drá/i).test("Dráculezz Smith") 
+1

An alternative to a regular expression is to store the names in a tree keyed by individual letters, where the 'matches' element at each level holds the names matching the prefix up to that point. Lookups should be pretty fast, but with a large number of names the structure will be huge.

    array
    | - m
    | - - matches
    | - - - 'Abraham Maslow'
    | - - - 'John Motson'
    | - - a
    | - - - matches
    | - - - - 'Abraham Maslow'
    | - - - s
    | - - - - matches
    | - - - - - 'Abraham Maslow'
    | - - - - l
    | - - - - - matches
    | - - - - - - 'Abraham Maslow'
    ...
    | - s
    | - - matches
    | - - - 'Johnathan Smith'
    | - - m
    | - - - matches
    | - - - - 'Johnathan Smith'
    | - - - i
    ...

This should be well optimized for speed, because you can just do something like this to find the name:

    var letters = initial.toLowerCase().split('');
    var node = array; // root of the tree shown above
    for (var i = 0; i < letters.length && node; i++) {
        node = node[letters[i]]; // descend one level per letter
    }
    // now contains ['Abraham Maslow','John Motson'] or ['Abraham Maslow'], etc.
    var matches = node ? node['matches'] : [];

That way, you never go down a branch that holds anything other than what you are interested in: you will never look at "Johnathan Smith" when the search does not start with "S", and never consider "John Motson" when the search begins with "Ma" instead of "Mo", etc.
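The answer does not show how the tree is built. One way to do it, as a sketch under stated assumptions: every whitespace-separated word of every name is indexed, names in the list are distinct, and child nodes live in a separate Map rather than next to 'matches' as in the diagram. The names buildIndex and lookup are hypothetical.

```javascript
// Build a prefix tree keyed by lowercase letters. Each node keeps the names
// that have a word starting with the prefix spelled out by the path to it.
function buildIndex(names) {
  const root = { children: new Map(), matches: [] };
  for (const name of names) {
    for (const word of name.toLowerCase().split(/\s+/)) {
      let node = root;
      for (const letter of word) {
        if (!node.children.has(letter)) {
          node.children.set(letter, { children: new Map(), matches: [] });
        }
        node = node.children.get(letter);
        // Skip the push when two words of the same name share this prefix
        // (relies on the assumption that names in the list are distinct).
        if (node.matches[node.matches.length - 1] !== name) {
          node.matches.push(name);
        }
      }
    }
  }
  return root;
}

// Walk one level per letter; an unknown letter means there are no matches.
function lookup(root, initial) {
  let node = root;
  for (const letter of initial.toLowerCase()) {
    node = node.children.get(letter);
    if (!node) return [];
  }
  return node.matches;
}

const index = buildIndex(['Abraham Maslow', 'John Motson', 'Johnathan Smith']);
console.log(lookup(index, 'mas'));
console.log(lookup(index, 'john'));
```

Building the index costs time and memory proportional to the total number of letters in all names, so it only pays off when the same list is searched many times.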

0

Source: https://habr.com/ru/post/1396828/

