If the rest of the text is English, you can use a word list. If more than a given percentage (say, 50%) of the words in a message are not in the list, the message is probably noise.
You might also want to set a minimum length, such as 5 words, and never flag anything shorter, to prevent the deletion of messages such as "LOL".
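A minimal sketch of this check in Python might look like the following. The function name `looks_like_noise`, its parameter names, and the simple regex tokenizer are illustrative assumptions, not part of any library; `known_words` is expected to be a set of lowercase words.

```python
import re

def looks_like_noise(text, known_words, max_unknown_ratio=0.5, min_words=5):
    """Return True if too many words in `text` are missing from `known_words`.

    Hypothetical helper: the defaults mirror the 50% and 5-word figures
    suggested above and should be tuned on real data.
    """
    # Crude tokenizer (an assumption): runs of letters, allowing inner apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Messages shorter than the minimum are never flagged (keeps "LOL").
    if len(words) < min_words:
        return False
    unknown = sum(1 for w in words if w not in known_words)
    return unknown / len(words) > max_unknown_ratio
```

With a real word list, something like `looks_like_noise(message, words)` could be called per message before deciding whether to drop it.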
On most Linux installations, you can extract a list of words from aspell as follows:
aspell --lang en dump master
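If you want the list inside a program, one way is to run that command and collect its output into a set. The sketch below assumes `aspell` is on the PATH and that the dump prints one word per line; the helper names `parse_word_list` and `load_aspell_words` are made up for illustration. Stripping anything after a `/` is a precaution for dictionaries that store affix flags, which may not apply to the English one.

```python
import subprocess

def parse_word_list(dump_output):
    """Turn one-word-per-line text into a set of lowercase words."""
    words = set()
    for line in dump_output.splitlines():
        # Drop a possible "/FLAGS" suffix (assumption: only some dictionaries use it).
        word = line.split("/", 1)[0].strip().lower()
        if word:
            words.add(word)
    return words

def load_aspell_words(lang="en"):
    """Run the aspell command shown above and return its words as a set."""
    result = subprocess.run(
        ["aspell", "--lang", lang, "dump", "master"],
        capture_output=True, text=True, check=True,
    )
    return parse_word_list(result.stdout)
```

Lowercasing the list means the text being checked should be lowercased as well before lookup.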