PHP mb_convert_case(): keep uppercase words in uppercase

Assume I have the string "HET1200 text string" and I need to change it to "HET1200 Text String". The encoding will be UTF-8.

How can I do this? I am currently using mb_convert_case($string, MB_CASE_TITLE, "UTF-8"), but this changes "HET1200" to "Het1200".

I could list exceptions, but such a list would never be exhaustive. So I would prefer that every word that is already in uppercase stays in uppercase.
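
For reference, a minimal reproduction of the behaviour described above. The input here is an assumed lowercase variant of the example string, and the snippet needs the mbstring extension:

```php
<?php
// Assumed input: the example string with the words after "HET1200" in lowercase.
$string = "HET1200 text string";

// MB_CASE_TITLE lowercases every letter after the first one in a word,
// so "HET1200" comes back as "Het1200".
$converted = mb_convert_case($string, MB_CASE_TITLE, "UTF-8");
echo $converted, "\n";
```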

Thanks :)

1 answer

OK, let's try to recreate mb_convert_case as closely as possible, but changing only the first character of each word.

The relevant part of the mb_convert_case implementation is:

int mode = 0; 

for (i = 0; i < unicode_len; i+=4) {
    int res = php_unicode_is_prop(
        BE_ARY_TO_UINT32(&unicode_ptr[i]),
        UC_MN|UC_ME|UC_CF|UC_LM|UC_SK|UC_LU|UC_LL|UC_LT|UC_PO|UC_OS, 0);
    if (mode) {
        if (res) {
            UINT32_TO_BE_ARY(&unicode_ptr[i],
                php_unicode_tolower(BE_ARY_TO_UINT32(&unicode_ptr[i]),
                    _src_encoding TSRMLS_CC));
        } else {
            mode = 0;
        }   
    } else {
        if (res) {
            mode = 1;
            UINT32_TO_BE_ARY(&unicode_ptr[i],
                php_unicode_totitle(BE_ARY_TO_UINT32(&unicode_ptr[i]),
                    _src_encoding TSRMLS_CC));
        }
    }
}

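Basically, this does the following:

  • Set mode to 0. mode tracks whether we are at the start of a word: if it is 0 we are, otherwise we are not.
  • Iterate over the characters of the string.
    • Determine what kind of character it is.
      • Set res to 1 if it is a word character. More precisely, set it to 1 if the character has one of the properties "Mark, Non-Spacing", "Mark, Enclosing", "Other, Format", "Letter, Modifier", "Symbol, Modifier", "Letter, Uppercase", "Letter, Lowercase", "Letter, Titlecase", "Punctuation, Other" or "Other, Surrogate". Oddly, "Letter, Other" is not in the list.
    • If we are not at the start of a word:
      • If this is a word character, convert it to lowercase. This is the step we do not want.
      • If it is not a word character, set mode to 0 to signal that the next word character starts a new word.
    • If we are at the start of a word and this is a word character:
      • Convert it to title case and set mode to 1, since we are no longer at the start of a word.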

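The mbstring extension does not appear to expose these character properties to PHP code. That leaves us with a problem, because we have no direct way to check whether a character has one of the 10 properties that mb_convert_case tests for.

Fortunately, Unicode character properties in regular expressions can help here.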
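
As a rough illustration of that idea, the sketch below tests a single character against Unicode general categories with preg_match and the /u modifier. The helper name is_word_char is hypothetical and not part of mbstring or PCRE. "Other, Surrogate" is left out on the assumption that the input is valid UTF-8, where surrogates cannot occur.

```php
<?php
// Hypothetical helper: true if $char (a single UTF-8 character) has one of
// the general categories that the C loop treats as part of a word.
function is_word_char(string $char): bool
{
    return preg_match(
        '/\A[\p{Mn}\p{Me}\p{Cf}\p{Lm}\p{Sk}\p{Lu}\p{Ll}\p{Lt}\p{Po}]\z/u',
        $char
    ) === 1;
}

var_dump(is_word_char("Á")); // letter, uppercase
var_dump(is_word_char("1")); // decimal digit, not in the list
var_dump(is_word_char(" ")); // space separator, not in the list
var_dump(is_word_char("'")); // apostrophe, which is "Punctuation, Other"
```

Digits and spaces are not in the list, which is why "HET1200" counts as the word "HET" followed by non-word characters.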

A variation of mb_convert_case that skips the problematic conversion to lowercase:

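function mb_convert_case_utf8_variation($s) {
    // Split the UTF-8 string into individual characters.
    $arr = preg_split("//u", $s, -1, PREG_SPLIT_NO_EMPTY);
    $result = "";
    $mode = false; // true while we are inside a word
    foreach ($arr as $char) {
        // Word character? These are the ten properties tested by the C code.
        $res = preg_match(
            '/\\p{Mn}|\\p{Me}|\\p{Cf}|\\p{Lm}|\\p{Sk}|\\p{Lu}|\\p{Ll}|'.
            '\\p{Lt}|\\p{Po}|\\p{Cs}/u', $char) == 1;
        if ($mode) {
            // Inside a word: leave the character alone (no lowercasing).
            if (!$res) {
                $mode = false;
            }
        } elseif ($res) {
            // First character of a word: convert it to title case.
            $mode = true;
            $char = mb_convert_case($char, MB_CASE_TITLE, "UTF-8");
        }
        $result .= $char;
    }

    return $result;
}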

Test:

echo mb_convert_case_utf8_variation("HETÁ1200 Ááxt ítring uii");

Output:

HETÁ1200 Ááxt Ítring Uii
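
If exact parity with mb_convert_case's word rules is not required, a shorter variant might use preg_replace_callback directly. This is only a sketch under a different, assumed definition of a word: a lowercase letter is title-cased when it is not preceded by a letter, mark, digit or apostrophe. The function name title_case_keep_upper is hypothetical.

```php
<?php
// Hypothetical alternative: title-case only lowercase letters that start a
// word, and never touch any other character.
function title_case_keep_upper($s)
{
    $out = preg_replace_callback(
        // A lowercase letter not preceded by a letter, mark, digit or apostrophe.
        "/(?<![\\p{L}\\p{M}\\p{N}'])\\p{Ll}/u",
        function ($m) {
            return mb_convert_case($m[0], MB_CASE_TITLE, "UTF-8");
        },
        $s
    );
    // preg_replace_callback returns null on error (e.g. invalid UTF-8);
    // fall back to the unchanged input in that case.
    return $out === null ? $s : $out;
}

echo title_case_keep_upper("HET1200 text string"), "\n";
echo title_case_keep_upper("HETÁ1200 Ááxt ítring uii"), "\n";
```

Unlike the character loop, this treats digits as part of a word, so a lowercase letter directly after a digit is left unchanged.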

Source: https://habr.com/ru/post/1755654/

