$string = file_get_contents('http://example.com');
if ('UTF-8' === mb_detect_encoding($string)) {
    $dom = new DOMDocument();
    // hack to preserve UTF-8 characters
    $dom->loadHTML('<?xml encoding="UTF-8">' . $string);
    $dom->preserveWhiteSpace = false;
    $dom->encoding = 'UTF-8';
    $body = $dom->getElementsByTagName('body');
    echo htmlspecialchars($body->item(0)->nodeValue);
}
This turns all UTF-8 characters into ร, ยพ, ยค and other garbage. Is there any other way to preserve UTF-8 characters?
Please do not post answers telling me to make sure I output it as UTF-8; I have already made sure that I do.
Thank you in advance:)
dom php utf-8
Richard Knop Feb 10 2018-10-10 12:58