Reading a file for DB insertion results in an incorrect Unicode string

I am reading a JSON string from a file, parsing it, and then inserting the data into a MySQL database. My INSERT query fails with the following error:

SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xE3\xADs' for column 'fname' at row 1

I believe the content causing the error is the í in the name Ailís (I pared down the input records until the error was isolated).

  • The file is UTF-8 encoded
  • I read the file using a UTF-8 stream context
  • I check that the data is valid UTF-8
  • My PDO connection is UTF-8 as well (SET NAMES utf8)
  • The database is UTF-8
  • The table is UTF-8
  • The column is UTF-8

The code:

$opts = ['http' => ['header' => 'Accept-Charset: UTF-8, *;q=0']];
$context = stream_context_create($opts);
$post = file_get_contents('sample_data/11111a_json_upload.json',false, $context);
if(!mb_check_encoding($post, 'UTF-8'))
    throw new Exception('Invalid encoding detected.');
$data = json_decode($post, true);

I also tried running the data through the following function before decoding the JSON:

static function clean_unicode_literals($string)
{
    return preg_replace_callback('@\\\(x)?([0-9a-zA-Z]{2,3})@',
        function ($m) {
            if ($m[1]) {
                $hex = substr($m[2], 0, 2);
                $unhex = chr(hexdec($hex));
                if (strlen($m[2]) > 2) {
                    $unhex .= substr($m[2], 2);
                }
                return $unhex;
            } else {
                return chr(octdec($m[2]));
            }
        }, $string);
}

I am out of ideas. Does anyone know what could be causing this?

The PDO constructor, in case it is relevant:

public function __construct($db_user, $db_pass, $db_name, $db_host, $charset)
{
    if(!is_null($db_name))
        $dsn = 'mysql:host=' . $db_host . ';dbname=' . $db_name . ';charset=' . $charset;
    else
        $dsn = 'mysql:host=' . $db_host . ';charset=' . $charset;

    $options = [
        PDO::ATTR_PERSISTENT => true,
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
        PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8'"
    ];

    try
    {
        $this->db_handler = new PDO($dsn, $db_user, $db_pass, $options);
        $this->db_handler->exec('SET NAMES utf8');
        $this->db_valid = true;
    }
    catch(PDOException $e)
    {
        $this->db_error = $e->getMessage();
        $this->db_valid = false;
    }

    return $this->db_valid;
}

The database, table, and column all use the utf8_general_ci collation.

My IDE is PHPStorm, and I am running WAMP with MySQL 5.7.14 on Windows 10.


The key is in the error message: \xE3\xADs

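The leading nibble E marks the first byte as the start of a 3-byte UTF-8 sequence, but only one continuation byte (\xAD) follows before the ASCII s, so the sequence is invalid.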

The correct UTF-8 encoding of í is \xC3\xAD.
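The byte-level reasoning can be checked directly in PHP. This is only a minimal sketch: the variable names are illustrative and not from the question, and it assumes the mbstring extension is available (the question's code already relies on it).

```php
<?php
// Bytes reported in the MySQL error: lead byte 0xE3, one continuation byte 0xAD, then ASCII "s".
$reported = "\xE3\xADs";

// A lead byte of the form 1110xxxx announces a 3-byte sequence.
$leadIsThreeByte = (ord($reported[0]) & 0xF0) === 0xE0;

// A continuation byte has the form 10xxxxxx; "s" (0x73) does not.
$thirdIsContinuation = (ord($reported[2]) & 0xC0) === 0x80;

// Let json_decode produce the real UTF-8 bytes for U+00ED (í).
$correct = json_decode('"\u00ed"');

var_dump($leadIsThreeByte);                       // bool(true)
var_dump($thirdIsContinuation);                   // bool(false)
var_dump(mb_check_encoding($reported, 'UTF-8'));  // bool(false)
var_dump(bin2hex($correct));                      // string(4) "c3ad"
```

So the string reaching MySQL starts a 3-byte sequence that is never completed, which is consistent with error 1366.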

It looks like clean_unicode_literals is turning the valid UTF-8 JSON into invalid bytes before the JSON is decoded.

Try removing clean_unicode_literals and check whether the error goes away.
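As a rough illustration of why no pre-processing step should be needed, the sketch below decodes a small inline payload with json_decode alone. The payload and the "escaped" key are hypothetical stand-ins for the question's file, chosen to show the same name written once as raw UTF-8 bytes and once as a JSON \u escape.

```php
<?php
// Hypothetical sample payload (not the question's real file).
// "fname" holds raw UTF-8 bytes; "escaped" holds the JSON escape \u00ed.
$post = "{\"fname\": \"Ail\xC3\xADs\", \"escaped\": \"Ail\\u00eds\"}";

// Same validity check as in the question.
if (!mb_check_encoding($post, 'UTF-8')) {
    throw new Exception('Invalid encoding detected.');
}

// json_decode resolves \uXXXX escapes itself and returns UTF-8 strings.
$data = json_decode($post, true);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new Exception('JSON decode failed: ' . json_last_error_msg());
}

// Both spellings decode to the same bytes.
echo bin2hex($data['fname']), "\n";    // 41696cc3ad73
echo bin2hex($data['escaped']), "\n";  // 41696cc3ad73
```

If the values still look wrong after decoding without the extra function, dumping them with bin2hex right before the insert should show at which step the bytes change.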


Source: https://habr.com/ru/post/1687250/

