Replacing diacritical characters in a string in C#

I would like to use this method to create user-friendly URLs. Since my site is in Croatian, there are characters that I do not want to drop, but to replace with other characters. For example, this string:
ŠĐĆŽ šđčćž
should be: sdccz-sdccz

So, I would like to create two arrays, one of which will contain the characters that should be replaced, and the other array with replacement characters:

    string[] character = { "Š", "Đ", "Č", "Ć", "Ž", "š", "đ", "č", "ć", "ž" };
    string[] characterReplace = { "s", "d", "c", "c", "z", "s", "d", "c", "c", "z" };

Finally, these two arrays should be passed to some method that takes a string, finds the matches, and replaces them. In PHP, I used the preg_replace function for this. In C#, this does not work:

 s = Regex.Replace(s, character, characterReplace); 


I would be grateful if someone could help. Thanks

+4
3 answers

It seems you want to strip the diacritics and keep the base character. I would recommend the approach from Ben Lings for this:

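    // Requires: using System.Globalization; using System.Linq; using System.Text;
    string input = "ŠĐĆŽ šđčćž";
    // Decompose each letter into its base character plus combining marks.
    string decomposed = input.Normalize(NormalizationForm.FormD);
    // Drop the combining marks, keeping only the base characters.
    char[] filtered = decomposed
        .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
        .ToArray();
    string newString = new String(filtered);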

Edit: Small problem! This does not work for Đ, which Unicode treats as a separate letter with no decomposition into D plus a combining mark. Result:

 SĐCZ sđccz 
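
One possible workaround, sketched under the assumption that Đ and đ are the only letters in the input that fail to decompose: replace those two by hand before normalizing. The class and method names (DiacriticsDemo, RemoveDiacritics) are made up for illustration and do not come from the answer above.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

public static class DiacriticsDemo
{
    // Hypothetical helper: handles Đ/đ explicitly, then strips combining marks.
    public static string RemoveDiacritics(string input)
    {
        // Assumption: Đ and đ are the only problem letters, so map them up front.
        string prepared = input.Replace('Đ', 'D').Replace('đ', 'd');

        // Split every remaining accented letter into base letter + combining marks.
        string decomposed = prepared.Normalize(NormalizationForm.FormD);

        // Keep everything except the combining (non-spacing) marks.
        char[] filtered = decomposed
            .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            .ToArray();

        return new string(filtered);
    }

    public static void Main()
    {
        // Returns "SDCZ sdccz" for the sample input used above.
        Console.WriteLine(RemoveDiacritics("ŠĐĆŽ šđčćž"));
    }
}
```

For a URL you would still need to lowercase the result and replace the space with a hyphen.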
+11

Jon Skeet mentioned the following code in a newsgroup ...

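    // Requires: using System.Text;
    static string RemoveAccents(string input)
    {
        // Compatibility decomposition splits accented letters into base letter + marks.
        string normalized = input.Normalize(NormalizationForm.FormKD);
        // ASCII encoding that silently drops any character it cannot represent.
        Encoding removal = Encoding.GetEncoding(
            Encoding.ASCII.CodePage,
            new EncoderReplacementFallback(""),
            new DecoderReplacementFallback(""));
        byte[] bytes = removal.GetBytes(normalized);
        return Encoding.ASCII.GetString(bytes);
    }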

EDIT

I may be crazy, but I just ran the following ...

    Dim Input As String = "ŠĐĆŽ-šđčćž"
    Dim Builder As New StringBuilder()
    For Each Chr As Char In Input
        Builder.Append(Chr)
    Next
    Console.Write(Builder.ToString())

And the result was SDCZ-sdccz

+11

A dictionary would be the logical solution for this ...

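    // Requires: using System.Collections.Generic; using System.Text;
    // The method name is illustrative; the original snippet showed only the body.
    static string ReplaceAccents(string inputString)
    {
        Dictionary<char, char> accentEquivalents = new Dictionary<char, char>();
        accentEquivalents.Add('Š', 's');
        // ...add other equivalents

        StringBuilder fixedString = new StringBuilder(inputString);
        for (int i = 0; i < fixedString.Length; i++)
        {
            // Overwrite the character in place when a replacement is defined.
            if (accentEquivalents.ContainsKey(fixedString[i]))
                fixedString[i] = accentEquivalents[fixedString[i]];
        }
        return fixedString.ToString();
    }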

Use a StringBuilder for this kind of character-by-character editing: strings in C# are immutable, so every single-character change would allocate a new string object, whereas a StringBuilder is mutable and can be modified in place.
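
To show how the dictionary idea might produce the URL form the question asks for, here is a minimal sketch. It assumes the ten Croatian letters from the question's arrays are the only ones that need mapping, and that a space should become a hyphen. SlugDemo and ToSlug are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SlugDemo
{
    // The same ten letters as in the question's arrays, mapped to lowercase ASCII.
    private static readonly Dictionary<char, char> Replacements = new Dictionary<char, char>
    {
        { 'Š', 's' }, { 'Đ', 'd' }, { 'Č', 'c' }, { 'Ć', 'c' }, { 'Ž', 'z' },
        { 'š', 's' }, { 'đ', 'd' }, { 'č', 'c' }, { 'ć', 'c' }, { 'ž', 'z' }
    };

    // Hypothetical helper: builds a lowercase, hyphen-separated slug.
    public static string ToSlug(string input)
    {
        var builder = new StringBuilder(input.Length);
        foreach (char original in input)
        {
            if (Replacements.TryGetValue(original, out char replacement))
                builder.Append(replacement);   // mapped Croatian letter
            else if (original == ' ')
                builder.Append('-');           // assumption: spaces become hyphens
            else
                builder.Append(char.ToLowerInvariant(original));
        }
        return builder.ToString();
    }

    public static void Main()
    {
        // Returns "sdccz-sdccz".
        Console.WriteLine(ToSlug("ŠĐČĆŽ šđčćž"));
    }
}
```

Appending to a fresh StringBuilder rather than editing one in place makes it easy to later drop or expand characters (for example, skipping punctuation), at the cost of one extra buffer.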

0

Source: https://habr.com/ru/post/1305846/

