Java - list of keywords in another list of strings

I have a list of keywords, and I have data coming from some source, which will also be a list.

I would like to find out whether any of the keywords occur in the data list and, if so, add those keywords to another target list.

E.g.

Keyword List = FIRSTNAME, LASTNAME, CURRENCY & FUND

Data list = HUSBANDFIRSTNAME, HUSBANDLASTNAME, WIFEFIRSTNAME, SOURCECURRENCY & CURRENCYRATE.

From the above example, I would like to build a target list with the keywords FIRSTNAME, LASTNAME & CURRENCY, but FUND should not appear because it does not exist in the data list.

I have a solution below that works using two loops (one inside the other) and checking with the String method contains, but I would like to avoid two loops, especially nested ones.

    for (int i = 0; i < dataList.size(); i++) {
        for (int j = 0; j < keywordsList.size(); j++) {
            if (dataList.get(i).contains(keywordsList.get(j))) {
                targetSet.add(keywordsList.get(j));
                break;
            }
        }
    }

Is there any other alternative solution to my problem?

You could use a regex: join the keywords into a single alternation pattern, then loop over dataList only once.

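import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeywordSearch {
    public static void main(String[] args) {
        List<String> keywords = new ArrayList<>(Arrays.asList("FIRSTNAME", "LASTNAME", "CURRENCY", "FUND"));
        List<String> dataList = new ArrayList<>(Arrays.asList("HUSBANDFIRSTNAME", "HUSBANDLASTNAME", "WIFEFIRSTNAME", "SOURCECURRENCY", "CURRENCYRATE"));
        Set<String> targetSet = new HashSet<>();

        // Build one alternation (FIRSTNAME|LASTNAME|...) and compile it once, outside the loop.
        Pattern pattern = Pattern.compile(String.join("|", keywords));
        for (String data : dataList) {
            Matcher matcher = pattern.matcher(data);
            // Records only the first keyword found in each data string.
            if (matcher.find()) {
                targetSet.add(matcher.group());
            }
        }

        System.out.println(targetSet);
    }
}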

Output:

[CURRENCY, LASTNAME, FIRSTNAME]
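
The regex approach above can be hardened a little. The following is only a sketch under some assumptions: the class name RegexKeywordScan and the method findKeywords are made up for illustration. It escapes each keyword with Pattern.quote so that regex metacharacters are matched literally, and it uses a while loop so that several keywords inside one data string are all collected.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, named for illustration only.
class RegexKeywordScan {

    static Set<String> findKeywords(List<String> keywords, List<String> dataList) {
        Set<String> targetSet = new LinkedHashSet<>();
        if (keywords.isEmpty()) {
            return targetSet; // an empty alternation would match the empty string everywhere
        }
        // Pattern.quote escapes metacharacters, so "A.B" only matches the literal text A.B.
        String alternation = keywords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile(alternation); // compiled once, reused for every data string

        for (String data : dataList) {
            Matcher matcher = pattern.matcher(data);
            // while instead of if: collect every non-overlapping match in this string.
            while (matcher.find()) {
                targetSet.add(matcher.group());
            }
        }
        return targetSet;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("FIRSTNAME", "LASTNAME", "CURRENCY", "FUND");
        List<String> dataList = Arrays.asList("HUSBANDFIRSTNAME", "HUSBANDLASTNAME",
                "WIFEFIRSTNAME", "SOURCECURRENCY", "CURRENCYRATE");
        System.out.println(findKeywords(keywords, dataList)); // [FIRSTNAME, LASTNAME, CURRENCY]
    }
}
```

One caveat: find() resumes after the end of the previous match, so keywords that overlap inside a data string (for example AB and BC inside ABC) are not all reported by this sketch.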

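Try the Aho-Corasick algorithm. It can find every occurrence of each keyword in the data (here you only need to know whether a keyword appears, not how often).

Its complexity is O(Sum(Length(Keyword)) + Length(Data) + Count(number of matches)).

From the wiki:

The Aho-Corasick algorithm is a string-searching algorithm. It is a kind of dictionary-matching algorithm that locates elements of a finite set of strings (the "dictionary") within an input text. It matches all strings simultaneously.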

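If you only care whether each keyword appears or not, the algorithm can be adapted to give a better complexity: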

O(Sum(Length(Keyword)) + Length(Data)).


EDIT:

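As I understand it, you want to get rid of the two loops, so all keywords have to be found in a single loop. This is known as the Set Match Problem, in which a set of patterns (keywords) is matched against a text (data). To solve the Set Match Problem, choose the Aho-Corasick algorithm, which is designed for exactly this case. That way, we get a one-loop solution: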

// ac: an Aho-Corasick automaton built once from keywordsList (hypothetical helper)
for (int i = 0; i < dataList.size(); i++) {
    targetSet.addAll(ac.run(dataList.get(i)));
}
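
To make the one-loop idea concrete, here is a minimal sketch of such an automaton. It is not taken from any particular library: the class name AhoCorasick and its run method are assumptions chosen for illustration. The constructor builds a trie of the keywords and adds failure links with a breadth-first pass; run then scans one data string once and returns the keywords found in it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical helper class; name and API are illustrative only.
class AhoCorasick {

    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;                                       // node for the longest proper suffix that is also in the trie
        final List<String> outputs = new ArrayList<>();  // keywords that end at this node
    }

    private final Node root = new Node();

    AhoCorasick(List<String> keywords) {
        // Phase 1: insert every keyword into the trie.
        for (String keyword : keywords) {
            if (keyword.isEmpty()) {
                continue; // skip empty keywords, they would match everywhere
            }
            Node node = root;
            for (char c : keyword.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.outputs.add(keyword);
        }
        // Phase 2: breadth-first pass that sets the failure links.
        Queue<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            for (Map.Entry<Character, Node> entry : node.next.entrySet()) {
                char c = entry.getKey();
                Node child = entry.getValue();
                Node f = node.fail;
                while (f != root && !f.next.containsKey(c)) {
                    f = f.fail;
                }
                Node target = f.next.get(c);
                child.fail = (target != null) ? target : root;
                // Keywords ending at the failure target also end here (they are suffixes).
                child.outputs.addAll(child.fail.outputs);
                queue.add(child);
            }
        }
    }

    // Scans the text once and returns every keyword that occurs in it.
    Set<String> run(String text) {
        Set<String> found = new LinkedHashSet<>();
        Node node = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (node != root && !node.next.containsKey(c)) {
                node = node.fail; // fall back to a shorter suffix
            }
            Node nextNode = node.next.get(c);
            node = (nextNode != null) ? nextNode : root;
            found.addAll(node.outputs);
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> keywordsList = Arrays.asList("FIRSTNAME", "LASTNAME", "CURRENCY", "FUND");
        List<String> dataList = Arrays.asList("HUSBANDFIRSTNAME", "HUSBANDLASTNAME",
                "WIFEFIRSTNAME", "SOURCECURRENCY", "CURRENCYRATE");
        AhoCorasick ac = new AhoCorasick(keywordsList); // built once
        Set<String> targetSet = new LinkedHashSet<>();
        for (String data : dataList) {                  // the single visible loop
            targetSet.addAll(ac.run(data));
        }
        System.out.println(targetSet); // [FIRSTNAME, LASTNAME, CURRENCY]
    }
}
```

The sketch favors readability: copying output lists along failure links keeps run simple, but it can cost extra memory when many keywords are suffixes of one another.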

Source: https://habr.com/ru/post/1598810/

