Corrupted $_GET variable

I have a lot of encoding problems on my site.

My current problem: when I go to analize.php?dialog=sabía , whose code is:

 <?php echo $_GET['dialog']; echo "sabía";

I get:

 sabÃa sabía 

The file is saved as ANSI; switching to UTF-8 breaks both strings. I don't understand why this is happening, and there is no code above this. I don't care how the strings are displayed, since this file is only used to retrieve data from my database. But I need to get $_GET correctly so that I can include it in the query.

How can I do that?

1 answer

You cannot put the "í" character in a URL as-is; URLs must use a subset of ASCII. Your browser therefore encodes the URL as ?dialog=sab%C3%ADa before sending it to the server. %C3%AD represents the two bytes C3 AD , which is the UTF-8 encoding of the character "í". You can confirm this with var_dump($_SERVER['QUERY_STRING']); . PHP decodes this automatically, so the result is the UTF-8 byte sequence for "sabía", in which "í" is encoded as the two bytes C3 AD .
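To see the round trip for yourself, here is a minimal sketch. It assumes the browser sent the value as UTF-8, and it writes "í" as explicit byte escapes so the result does not depend on how the script file itself is saved. The variable names are only for illustration.

```php
<?php
// "sabía" spelled out as UTF-8 bytes: "í" is the two bytes C3 AD.
$word = "sab\xC3\xADa";

// What the browser puts in the URL.
$encoded = rawurlencode($word);
echo $encoded, "\n"; // sab%C3%ADa

// What PHP does when it fills $_GET: percent-decoding back to raw bytes.
parse_str('dialog=' . $encoded, $params);
$decoded = $params['dialog'];

// Hex dump of the decoded value; C3 AD is still there as two bytes.
echo bin2hex($decoded), "\n"; // 736162c3ad61
```

So the value in $_GET is not corrupted; it is simply UTF-8.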

Your browser then interprets this byte sequence as Windows-1252 or ISO-8859-1. In those encodings, byte C3 represents "Ã", and byte AD is a soft hyphen, which is invisible.
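The misreading can be reproduced on the server side. This sketch assumes the mbstring extension is available and treats the UTF-8 bytes as if they were ISO-8859-1, which is roughly what the browser does when no charset is declared:

```php
<?php
// The bytes PHP puts in $_GET['dialog'] for "sabía" (UTF-8).
$bytes = "sab\xC3\xADa";

// Pretend each byte is one ISO-8859-1 character and re-encode as UTF-8,
// so the result can be inspected: C3 becomes "Ã", AD becomes a soft hyphen.
$misread = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');

echo mb_strlen($bytes, 'UTF-8'), "\n";   // 5 characters when read as UTF-8
echo mb_strlen($misread, 'UTF-8'), "\n"; // 6 characters when misread
echo bin2hex($misread), "\n";            // 736162c383c2ad61
```

The extra, invisible character is why the output looks like "sabÃa" rather than "sabÃ" followed by something visible.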

Two possible solutions:

  • use UTF-8 everywhere (recommended!)

    • save source code as utf-8
    • send a header that makes the browser interpret the page as UTF-8:

       header('Content-Type: text/html; charset=utf-8'); 
  • convert $_GET values to Windows-1252 / ISO-8859-1 (or whichever other encoding you want to use on your site) using mb_convert_encoding or iconv (not recommended)

    • even then you should set a header that tells the browser which encoding you are using
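Both options can be sketched roughly as follows. This assumes the mbstring extension is installed; to_legacy_encoding is a hypothetical helper name, and the fallback value exists only so the script also runs from the command line.

```php
<?php
// Option 1 (recommended): keep everything in UTF-8 and say so.
// header() must run before any output; the guard avoids a warning otherwise.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Hypothetical fallback so the sketch works without a real request.
$dialog = $_GET['dialog'] ?? "sab\xC3\xADa";

// Option 2 (not recommended): convert the UTF-8 input to a legacy encoding.
// to_legacy_encoding is an illustrative helper, not a built-in function.
function to_legacy_encoding(string $utf8, string $target = 'Windows-1252'): string
{
    return mb_convert_encoding($utf8, $target, 'UTF-8');
}

$legacy = to_legacy_encoding($dialog);

echo bin2hex($dialog), "\n"; // UTF-8: "í" is the two bytes c3 ad
echo bin2hex($legacy), "\n"; // Windows-1252: "í" is the single byte ed
```

With option 2, the charset in the header has to name the legacy encoding instead of utf-8, and the database connection has to expect that encoding as well.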

In short, you need to make sure that you use the same encoding everywhere and tell the browser what exactly this encoding is.


Source: https://habr.com/ru/post/1386630/

