I have a lot of lines. Each of them contains only characters, and the words and symbols are not separated from each other by spaces. Some of the characters make up English words and others are just gibberish. A line may not contain a whole sentence.
I need to find out which lines are written in real English, meaning that the string can be built by concatenating correctly spelled English words. I know I could check individual words (for example, against a word list), but the words are not split apart in my lines. Verifying every possible combination of words could therefore take a long time.
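To make the requirement concrete, here is a minimal sketch of the check I mean. It assumes a tiny, hypothetical word list `WORDS` (a real check would need a full English dictionary), and the function name `is_concatenation` is my own. It uses dynamic programming over prefixes, so it does not have to try every combination of words:

```python
# Hypothetical word list for illustration only; a real check would
# load a full English dictionary instead.
WORDS = {"this", "is", "a", "an", "test", "line", "of", "english", "words"}
MAX_WORD_LEN = max(len(w) for w in WORDS)


def is_concatenation(line, words=WORDS, max_len=MAX_WORD_LEN):
    """Return True if `line` can be split entirely into words from `words`."""
    text = line.lower()
    n = len(text)
    # reachable[i] is True when text[:i] can be split into dictionary words.
    reachable = [False] * (n + 1)
    reachable[0] = True  # the empty prefix is trivially splittable
    for end in range(1, n + 1):
        # Only look back as far as the longest dictionary word.
        for start in range(max(0, end - max_len), end):
            if reachable[start] and text[start:end] in words:
                reachable[end] = True
                break
    return reachable[n]


print(is_concatenation("thisisatest"))  # splits as this + is + a + test
print(is_concatenation("xqzvthis"))     # starts with characters that match no word
```

Each prefix is examined once, so the work is roughly the line length times the longest word length in set lookups. Note that this sketch only gives a yes/no answer and treats an empty line as valid.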
I am looking for an algorithm or a high-performance method that checks whether a string is built from English words or English text. Perhaps there is something that gives me the probability that a line contains English.
Do you know a method or algorithm that would help me? Would something like Sphinx help?
c0d3x