How do I find the character at a byte offset in PHP?

I am trying to fix a problem with some (apparently) garbled serialized data in a MySQL database after converting it to UTF-8. When I try to unserialize it, I get the usual notice:

Notice: unserialize() [function.unserialize]: Error at offset 1481 of 255200 bytes [...] 

However, since this is a multibyte string, I cannot figure out which character is at this byte offset. I need something like substr(), but for bytes instead of characters. How can I do this?

Thanks in advance.

2 answers

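You can try substr($str, 1481, 2), substr($str, 1481, 3), or substr($str, 1481, 4). If the data is UTF-8, the character will show up in one of the three substrings, because a non-ASCII UTF-8 character takes 2 to 4 bytes, depending on its first byte (an ASCII character takes just 1). This assumes that offset 1481 falls on the first byte of the character.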
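If the offset might land in the middle of a character, a rough sketch like the following may help. It steps back to the lead byte and then reads the sequence length from it. The function name utf8_char_at_byte is hypothetical, and the code assumes PHP 7 or later and that the string is valid UTF-8:

```php
<?php
// Hypothetical helper: returns the complete UTF-8 character that contains
// the byte at $offset. Assumes $str is valid UTF-8.
function utf8_char_at_byte(string $str, int $offset): string
{
    if ($offset < 0 || $offset >= strlen($str)) {
        return '';
    }

    // Step back over continuation bytes (bit pattern 10xxxxxx)
    // until we reach the lead byte of the character.
    $start = $offset;
    while ($start > 0 && (ord($str[$start]) & 0xC0) === 0x80) {
        $start--;
    }

    // The lead byte encodes the length of the sequence.
    $lead = ord($str[$start]);
    if ($lead < 0x80) {
        $len = 1;                      // 0xxxxxxx: ASCII
    } elseif (($lead & 0xE0) === 0xC0) {
        $len = 2;                      // 110xxxxx
    } elseif (($lead & 0xF0) === 0xE0) {
        $len = 3;                      // 1110xxxx
    } elseif (($lead & 0xF8) === 0xF0) {
        $len = 4;                      // 11110xxx
    } else {
        $len = 1;                      // invalid lead byte: return it alone
    }

    return substr($str, $start, $len);
}
```

For the case in the question, the call would be something like utf8_char_at_byte($str, 1481).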

I had a lot of problems with this, so if you can’t find what is happening with the encoding, please reply again :-) I will try to give you a hand.

Good luck

Edit: do not forget to send header("Content-Type: text/html; charset=utf-8"); so that the result is displayed correctly.


substr() works on bytes, not characters. So this should return the byte at offset 1481 (offsets are zero-based):

 substr($data, 1481, 1) 
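
A single byte is often hard to interpret on its own, so it may help to look at its neighbours too. Here is a minimal sketch that hex-dumps a small window around the offset. The helper name bytes_around and the default radius are made up for illustration, and PHP 7 or later is assumed:

```php
<?php
// Hypothetical helper: hex dump of the bytes from $offset - $radius
// through $offset + $radius (clamped at the start of the string).
function bytes_around(string $data, int $offset, int $radius = 8): string
{
    $start = max(0, $offset - $radius);
    $length = $offset - $start + $radius + 1;

    // substr() counts bytes, so this slices raw bytes, not characters.
    return bin2hex(substr($data, $start, $length));
}
```

Calling echo bytes_around($data, 1481); prints the surrounding bytes as hex, which stays readable even if the page or terminal encoding is wrong.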

Source: https://habr.com/ru/post/1334829/

