I assume you are interested in HTML content. This is actually not a simple task. Most browsers treat the declared encoding, Unicode byte order mark bytes (at the beginning of your document) and HTML headers as hints when deciding which encodings to try.
Details of the algorithm are outlined in section 8.2.2.1 of the HTML5 specification; see http://www.w3.org/html/wg/drafts/html/master/single-page.html.
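
To make the idea of "hints" concrete, here is a rough sketch in Python of how such a check could be ordered: byte order mark first, then a charset from the HTTP `Content-Type` header, then a `<meta>` declaration near the top of the document, then a default. This is a simplification and not the specification's full algorithm. The function name `sniff_encoding`, the 1024-byte scan window, and the `windows-1252` fallback are my own assumptions for illustration.

```python
import codecs
import re

# Byte order marks, checked before anything else (assumed precedence).
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)

# Loose patterns for a charset label; real parsers are stricter than this.
_HEADER_CHARSET = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def _known(label):
    """Return Python's normalized codec name for label, or None if unknown."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None


def sniff_encoding(body, content_type=None, fallback="windows-1252"):
    """Guess the encoding of raw HTML bytes from the available hints.

    body         -- the document as bytes
    content_type -- value of the HTTP Content-Type header, if any
    fallback     -- returned when no hint is found (an assumption)
    """
    # 1. A byte order mark at the very start of the document.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name

    # 2. A charset parameter in the HTTP Content-Type header.
    if content_type:
        match = _HEADER_CHARSET.search(content_type)
        if match:
            name = _known(match.group(1))
            if name:
                return name

    # 3. A <meta> charset declaration within the first 1024 bytes.
    match = _META_CHARSET.search(body[:1024])
    if match:
        name = _known(match.group(1).decode("ascii"))
        if name:
            return name

    # 4. Nothing usable was found.
    return fallback
```

You would then decode with something like `body.decode(sniff_encoding(body, header), errors="replace")`. Note that the returned names are Python codec names, so a label in the header or `<meta>` tag may come back in a normalized spelling.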