Java String - See if a string contains only numbers and characters, not words?

I have an array of strings that is filled at various points in my application, and it contains different words. I have a simple if statement that checks whether a string contains letters or numbers, but it cannot tell whether the string is an actual word.

I want to keep only strings that look like AB2CD5X and delete everything else, such as "Hello 3", "3 word", or any other string that is an English word. Is it possible to keep only alphanumeric strings while excluding those that contain a real English word?

I know how to check whether a string consists of alphanumeric characters:

Pattern p = Pattern.compile("[\\p{Alnum},.']*");

I also know how to check for letters or digits:

if (string.matches(".*[a-zA-Z]+.*") || string.matches(".*[0-9]+.*"))
+4
5

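You need a dictionary of English words. Scan the input and check whether each token is in the dictionary. Word lists can be found online, for example the one shipped with the Jazzy spellchecker.

Here is sample code that assumes the dictionary is a plain text file in UTF-8 with exactly one (lowercase) word per line: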

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

public class DictionaryFilter {

    public static void main(String[] args) throws IOException {
        final Set<String> dictionary = loadDictionary();
        final String text = loadInput();
        final List<String> output = new ArrayList<>();
        // by default splits on whitespace
        final Scanner scanner = new Scanner(text);
        while (scanner.hasNext()) {
            // the dictionary is assumed to be lowercase, so normalize each token
            final String token = scanner.next().toLowerCase();
            // keep only tokens that are not dictionary words
            if (!dictionary.contains(token)) output.add(token);
        }
        System.out.println(output);
    }

    private static String loadInput() {
        return "This is a 5gse5qs sample f5qzd fbswx test";
    }

    private static Set<String> loadDictionary() throws IOException {
        final Set<String> dictionaryWords = new HashSet<>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get("path_to_your_flat_dic_file"), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) dictionaryWords.add(line);
        }
        return dictionaryWords;
    }
}

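If you need more accurate results, reduce each word to its stem before looking it up. See, for example, the Apache Lucene EnglishStemmer.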

+5

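You can also look each token up through a dictionary API and treat it as a word when the dictionary returns an entry for it.

First, set up the API client: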

DefaultHttpClient httpClient = new DefaultHttpClient(new ThreadSafeClientConnManager());
SkPublishAPI api = new SkPublishAPI(baseUrl + "/api/v1", accessKey, httpClient);
api.setRequestHandler(new SkPublishAPI.RequestHandler() {
    public void prepareGetRequest(HttpGet request) {
        System.out.println(request.getURI());
        request.setHeader("Accept", "application/json");
    }
});

Then query the service through the "api" object:

      try {
          System.out.println("*** Dictionaries");
          JSONArray dictionaries = new JSONArray(api.getDictionaries());
          System.out.println(dictionaries);

          JSONObject dict = dictionaries.getJSONObject(0);
          System.out.println(dict);
          String dictCode = dict.getString("dictionaryCode");

          System.out.println("*** Search");
          System.out.println("*** Result list");
          JSONObject results = new JSONObject(api.search(dictCode, "ca", 1, 1));
          System.out.println(results);
          System.out.println("*** Spell checking");
          JSONObject spellResults = new JSONObject(api.didYouMean(dictCode, "dorg", 3));
          System.out.println(spellResults);
          System.out.println("*** Best matching");
          JSONObject bestMatch = new JSONObject(api.searchFirst(dictCode, "ca", "html"));
          System.out.println(bestMatch);

          System.out.println("*** Nearby Entries");
          JSONObject nearbyEntries = new JSONObject(api.getNearbyEntries(dictCode,
                  bestMatch.getString("entryId"), 3));
          System.out.println(nearbyEntries);
      } catch (Exception e) {
          e.printStackTrace();
      }
+1

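Antlr might help here. Antlr stands for ANother Tool for Language Recognition.

Hibernate uses ANTLR to parse its query language, HQL (keywords such as SELECT and FROM).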

0

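if (string.matches(".*[a-zA-Z]+.*") || string.matches(".*[0-9]+.*"))

This condition is true when the string contains letters or digits. If you want strings that contain both, use:

if (string.matches(".*[a-zA-Z]+.*") && string.matches(".*[0-9]+.*"))

But what about strings with spaces, such as "3 word"? To reject those as well, also check that there is no space:

if (string.matches(".*[a-zA-Z]+.*") && string.matches(".*[0-9]+.*") && !string.contains(" "))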
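The combined condition above (letters and digits, no spaces) can also be written as one regular expression. This is only a sketch: the class name `MixedTokenCheck` and the method `isMixedAlnum` are made up for illustration, and the pattern is stricter than the condition above because it also rejects punctuation.

```java
import java.util.regex.Pattern;

final class MixedTokenCheck {
    // Two lookaheads require at least one ASCII letter and at least one digit;
    // the character class then allows only letters and digits (no spaces or punctuation).
    private static final Pattern MIXED =
            Pattern.compile("(?=.*[A-Za-z])(?=.*[0-9])[A-Za-z0-9]+");

    static boolean isMixedAlnum(String s) {
        return s != null && MIXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"AB2CD5X", "Hello 3", "3 word", "Hello", "12345"}) {
            System.out.println(s + " -> " + isMixedAlnum(s));
        }
    }
}
```

Note that a check like this only looks at the shape of the string. It cannot tell that something like "word5" contains an English word; that still needs a dictionary lookup.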

0

Alternatively, split the input into tokens with StringTokenizer and check each token separately.
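A minimal sketch of the StringTokenizer approach, assuming a token should be kept when it contains at least one letter and one digit and is not itself in a word list. The class `TokenFilter`, the method `keepCodes`, and the tiny in-memory dictionary are hypothetical; a real word list would be loaded from a file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.StringTokenizer;

final class TokenFilter {
    // Returns the tokens that mix letters and digits and are not dictionary words.
    static List<String> keepCodes(String text, Set<String> dictionary) {
        List<String> kept = new ArrayList<>();
        // With no delimiter argument, StringTokenizer splits on whitespace.
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken();
            boolean hasLetter = token.chars().anyMatch(Character::isLetter);
            boolean hasDigit = token.chars().anyMatch(Character::isDigit);
            // The dictionary is assumed to hold lowercase words.
            boolean isWord = dictionary.contains(token.toLowerCase(Locale.ROOT));
            if (hasLetter && hasDigit && !isWord) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("hello", "word"));
        System.out.println(keepCodes("Hello 3 AB2CD5X word f5qzd", dictionary));
    }
}
```

Because the input is split on whitespace first, "Hello 3" and "3 word" are seen as separate tokens, none of which mixes letters and digits, so they are all dropped.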

0

Source: https://habr.com/ru/post/1542326/

