Comparing double-byte strings in C#

I have two strings: one contains double-byte characters and the other contains single-byte characters. Comparing them returns false. How do I get them to compare as equal, ignoring the single-byte / double-byte difference?

string s1 = "ｓｍａｔｓｕｍｏｔｏ１１";
string s2 = "smatsumoto11";

In the same scenario, if an nvarchar column in SQL Server contains the value ｓｍａｔｓｕｍｏｔｏ１１, a query whose WHERE clause uses the string smatsumoto11 will return that row. I need the same semantics for string comparison in C#.

I tried several options mentioned on MSDN, but they don't seem to work.

Any ideas?

+4
4 answers

Your s1 contains so-called "fullwidth" characters, so you can use string.Compare and tell it to ignore character width:

string.Compare(s1, s2, CultureInfo.CurrentCulture, CompareOptions.IgnoreWidth);

(Of course, specify a different CultureInfo if necessary.)
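A minimal, self-contained sketch of the call above. The names `WidthDemo` and `EqualsIgnoringWidth` are made up for illustration, and `CultureInfo.InvariantCulture` is used in place of the current culture only so that the result does not depend on the machine's settings. It assumes culture-aware comparison is available in the runtime (that is, the application is not running in invariant-globalization mode).

```csharp
using System;
using System.Globalization;

public static class WidthDemo
{
    // Hypothetical helper: true when the two strings differ at most in character width.
    public static bool EqualsIgnoringWidth(string a, string b)
    {
        return string.Compare(a, b, CultureInfo.InvariantCulture,
                              CompareOptions.IgnoreWidth) == 0;
    }

    public static void Main()
    {
        string s1 = "ｓｍａｔｓｕｍｏｔｏ１１"; // fullwidth characters (U+FF00 block)
        string s2 = "smatsumoto11";             // plain ASCII characters

        // Ordinal equality: the code units differ, so the strings are not equal.
        Console.WriteLine(s1 == s2);

        // Linguistic comparison that treats fullwidth and halfwidth forms alike.
        Console.WriteLine(EqualsIgnoringWidth(s1, s2));
    }
}
```

The ordinal `==` check is false for these two strings, while the width-insensitive comparison is expected to treat them as equal.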

+6

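Before making the comparison, you could try to "Normalize" your strings:

Returns a new string whose textual value is the same as this string, but whose binary representation is in the specified Unicode normalization form.

Some Unicode characters have multiple equivalent binary representations, built from sets of combining and/or composite characters. The existence of multiple representations for a single character complicates searching, sorting, matching, and other operations.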
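One caveat worth illustrating: the parameterless `Normalize()` uses Form C, which leaves fullwidth characters untouched. Only the compatibility forms (`FormKC` / `FormKD`) fold them to their ASCII counterparts. The sketch below shows this; the class and method names are hypothetical. Keep in mind that Form KC also folds other compatibility characters (ligatures, superscripts, and so on), so it is broader than a pure width-insensitive comparison.

```csharp
using System;
using System.Text;

public static class NormalizeDemo
{
    // Hypothetical helper: ordinal equality after compatibility normalization (NFKC).
    public static bool EqualsAfterNfkc(string a, string b)
    {
        string na = a.Normalize(NormalizationForm.FormKC);
        string nb = b.Normalize(NormalizationForm.FormKC);
        return string.Equals(na, nb, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string s1 = "ｓｍａｔｓｕｍｏｔｏ１１"; // fullwidth
        string s2 = "smatsumoto11";             // ASCII

        // Default normalization (Form C) does not change fullwidth characters.
        Console.WriteLine(s1.Normalize() == s2);

        // Form KC replaces each fullwidth character with its compatibility equivalent.
        Console.WriteLine(EqualsAfterNfkc(s1, s2));
    }
}
```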

+3

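It looks like s1 is written in the MS Mincho font.

MS Mincho (MS 明朝) is distributed with the Japanese versions of Windows 3.1 and later, some versions of the Internet Explorer 3 Japanese font pack, all regions of Windows XP, and Microsoft Office v.X through 2004.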

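Edit: see Arnout's answer.

Original answer: you could try something similar to the //TRANSLIT option of iconv, for example by converting the string through a single-byte encoding: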

        string s1 = "ｓｍａｔｓｕｍｏｔｏ１１";
        string s2 = "smatsumoto11";

        string conv = Encoding.ASCII.GetString(Encoding.GetEncoding("Cyrillic").GetBytes(s1));

        if (conv == s2) Console.WriteLine("They are the same!");

This relies on the encoder's best-fit mapping, so treat it as a workaround rather than a general solution.

+1

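First, note that "double-byte" is not what distinguishes these two strings, at least not in the world of Windows, .NET, and SQL Server.

Regarding the first point:

Both strings are Unicode, encoded as UTF-16 Little Endian (the encoding used by Windows and .NET). In that encoding, each of the roughly 62,000 - 63,000 characters mapped in the range U+0000 to U+FFFF (i.e. 0 - 65,535) takes two bytes, so in that sense every character here is "double-byte". Unicode has room for just over 1.1 million code points, of which about 260,000 are currently allocated. Code points above U+FFFF/65,535 do not fit in a single two-byte unit, so each of them is encoded as a pair of units and takes 4 bytes.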
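The byte counts are easy to check. The following sketch (class and method names are made up for illustration) measures how many bytes a string occupies in UTF-16 Little Endian, which is what `Encoding.Unicode` represents in .NET. U+1F600 is used only as an arbitrary example of a code point above U+FFFF.

```csharp
using System;
using System.Text;

public static class Utf16Demo
{
    // Number of bytes the string occupies when encoded as UTF-16 Little Endian.
    public static int Utf16ByteCount(string s)
    {
        return Encoding.Unicode.GetByteCount(s);
    }

    public static void Main()
    {
        // An ASCII letter still takes two bytes in UTF-16.
        Console.WriteLine(Utf16ByteCount("s"));   // prints 2

        // The fullwidth letter U+FF53 takes two bytes as well.
        Console.WriteLine(Utf16ByteCount("ｓ"));  // prints 2

        // A code point above U+FFFF is stored as a surrogate pair: four bytes.
        Console.WriteLine(Utf16ByteCount(char.ConvertFromUtf32(0x1F600))); // prints 4
    }
}
```

So the ASCII and fullwidth versions of a letter have the same size in memory; what differs is the code point, not the number of bytes.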

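Regarding:

"The string comparison result returns false,"

The characters in s1 = "ｓｍａｔｓｕｍｏｔｏ１１" are "Fullwidth" characters. The full list can be seen here: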

http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:East_Asian_Width=Fullwidth:]

Most of them, though not all, are located in this block:

http://unicode-table.com/en/blocks/halfwidth-and-fullwidth-forms/

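To compare them as equal, you can use String.Compare(String, String, CultureInfo, CompareOptions), as @Arnout mentioned, or CompareInfo.Compare(String, String, CompareOptions), which is an instance method called on a culture's CompareInfo:

CultureInfo.CurrentCulture.CompareInfo.Compare(s1, s2, CompareOptions.IgnoreWidth)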

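Regarding:

"if an nvarchar column in SQL Server contains the value ｓｍａｔｓｕｍｏｔｏ１１, a query whose WHERE clause uses smatsumoto11 will return that row."

It depends. Beyond the 7-bit ASCII range (values 0 - 127), how characters are stored and compared is governed by the LCID/Locale/Culture/Collation. The default Collation in SQL Server (at least for US English installations) is SQL_Latin1_General_CP1_CI_AS, which uses Code Page 1252 (for CHAR/VARCHAR data, not NCHAR/NVARCHAR) and the "en-US" culture. With that culture/LCID, Fullwidth characters compare equal to their "half-width" counterparts. However, there are also Collations with _WS in their names. _WS stands for "Width Sensitive", which is the opposite of what CompareOptions.IgnoreWidth does in .NET, so under those Collations the two values would not match.

As for the Collations with _WS in their names: 1776 of the 3885 total Collations are Width Sensitive (as of SQL Server 2012). Of the rest, 262 are binary Collations (names ending in _BIN or _BIN2), which compare by code point and therefore also distinguish Fullwidth from half-width characters. The query below lists the Width Sensitive ones: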

SELECT *
FROM sys.fn_helpcollations()
WHERE [name] LIKE N'%[_]WS%'
ORDER BY [name];
-- 1776 out of 3885 on SQL Server 2012

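Also, both the (US English default) SQL_Latin1_General_CP1_CI_AS and Latin1_General_100_CI_AS are case-INSensitive. So CompareOptions.IgnoreWidth alone does not reproduce the behavior of those Collations, because comparisons in .NET are Case Sensitive by default, unlike SQL Server. To match SQL Server (for Collations with _CI and without _WS), also pass CompareOptions.IgnoreCase:

CultureInfo.CurrentCulture.CompareInfo.Compare(s1, s2, CompareOptions.IgnoreWidth | CompareOptions.IgnoreCase)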

// or

String.Compare(s1, s2, CultureInfo.CurrentCulture, 
               CompareOptions.IgnoreWidth | CompareOptions.IgnoreCase)
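Put together, a small helper along these lines could approximate an equality test under a case-insensitive, width-insensitive Collation. This is only a sketch under stated assumptions: the class name `SqlLikeCompare`, the method name `AreEqual`, and the choice of the "en-US" culture are illustrative, and the helper does not emulate every rule of any particular SQL Server Collation (for example, SQL Server ignores trailing spaces in `=` comparisons and this code does not).

```csharp
using System;
using System.Globalization;

public static class SqlLikeCompare
{
    // Assumption: "en-US" stands in for the culture behind the Collation in use.
    private static readonly CompareInfo Comparer =
        CultureInfo.GetCultureInfo("en-US").CompareInfo;

    // Hypothetical helper: equality that ignores both character width and case.
    public static bool AreEqual(string a, string b)
    {
        const CompareOptions options =
            CompareOptions.IgnoreWidth | CompareOptions.IgnoreCase;
        return Comparer.Compare(a, b, options) == 0;
    }

    public static void Main()
    {
        string s1 = "ｓｍａｔｓｕｍｏｔｏ１１"; // fullwidth, lower case
        string s2 = "SMATSUMOTO11";             // ASCII, upper case

        // Differs in both width and case, yet is expected to match.
        Console.WriteLine(AreEqual(s1, s2));

        // A genuinely different value must still be reported as different.
        Console.WriteLine(AreEqual(s1, "smatsumoto12"));
    }
}
```

Holding the CompareInfo in a static field keeps the culture choice in one place, so it can be swapped for whichever culture corresponds to the Collation actually in use.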


+1

Source: https://habr.com/ru/post/1774032/

