Is there a function that returns the root letter for special characters?

Does .NET have a function that returns the root letter (the letter without special attributes such as a cedilla), something like this:

Select Case c
  Case "á", "à", "ã", "â", "ä", "ª" : x = "a"
  Case "é", "è", "ê", "ë" : x = "e"
  Case "í", "ì", "î", "ï" : x = "i"
  Case "ó", "ò", "õ", "ô", "ö", "º" : x = "o"
  Case "ú", "ù", "û", "ü" : x = "u"

  Case "Á", "À", "Ã", "Â", "Ä" : x = "A"
  Case "É", "È", "Ê", "Ë" : x = "E"
  Case "Í", "Ì", "Î", "Ï" : x = "I"
  Case "Ó", "Ò", "Õ", "Ô", "Ö" : x = "O"
  Case "Ú", "Ù", "Û", "Ü" : x = "U"

  Case "ç" : x = "c"
  Case "Ç" : x = "C"

  Case Else
       x = c
End Select

This code skips a few letters, but this is just an example :)

4 answers

By the way (completely unrelated to the question), your code works with strings. This is not only less efficient, it also doesn't really make sense, since you are interested in individual characters, not strings, and these are distinct data types in .NET.

To get a single character instead of a string, append c to the literal:

Select Case c
  Case "á"c, "à"c, "ã"c, "â"c, "ä"c, "ª"c : x = "a"c
  ' … and so on. '
End Select

Based on Chetan Sastry's answer, here is the VB.NET and C# code, copied from his BIG answer :)

VB:

Imports System.Text
Imports System.Globalization

''' <summary>
''' Removes the special attributes of the letters passed in the word
''' </summary>
''' <param name="word">Word to be normalized</param>
Function RemoveDiacritics(ByVal word As String) As String
    Dim normalizedString As String = word.Normalize(NormalizationForm.FormD)
    Dim r As StringBuilder = New StringBuilder()
    Dim i As Integer
    Dim c As Char

    ' VB's For loop takes an inclusive upper bound, not a C-style condition
    For i = 0 To normalizedString.Length - 1
        c = normalizedString(i)
        If (CharUnicodeInfo.GetUnicodeCategory(c) <> UnicodeCategory.NonSpacingMark) Then
            r.Append(c)
        End If
    Next

    Return r.ToString()
End Function

C#:

using System.Text;
using System.Globalization;

/// <summary>
/// Removes the special attributes of the letters passed in the word
/// </summary>
/// <param name="word">Word to be normalized</param>
public String RemoveDiacritics(String word)
{
  String normalizedString = word.Normalize(NormalizationForm.FormD);
  StringBuilder stringBuilder = new StringBuilder();
  int i;
  Char c;

  for (i = 0; i < normalizedString.Length; i++)
  {
    c = normalizedString[i];
    if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
      stringBuilder.Append(c);
  }

  return stringBuilder.ToString();
} 

I hope this helps someone like me :)
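
For anyone who wants to try the decomposition approach quickly, here is a minimal self-contained sketch. The class name DiacriticsDemo and the method name StripMarks are made up for this example, and the final recomposition to FormC is an extra step that is not part of the code above. Note that characters without a canonical decomposition, such as "ª", "º" and "ø", are left untouched, so this does not cover everything the Select Case in the question does.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DiacriticsDemo
{
    // Hypothetical helper name; same technique as RemoveDiacritics.
    public static string StripMarks(string text)
    {
        // FormD splits each accented letter into a base letter plus combining marks.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Drop only the combining marks (accents, cedillas, and so on).
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        // Assumption: recompose to FormC so the result is in the usual representation.
        return builder.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        Console.WriteLine(StripMarks("ação"));     // acao
        Console.WriteLine(StripMarks("Ünïcödé"));  // Unicode
        Console.WriteLine(StripMarks("ª º ø"));    // unchanged: no canonical decomposition
    }
}
```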


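In .NET, there is a simple way to normalize a string for comparison. Note that the regex removes every character other than ASCII letters, digits and spaces, not just the accents.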

// Requires: using System.Text.RegularExpressions;
public static string NormalizeString(string value)
{
    string nameFormatted = value.Normalize(System.Text.NormalizationForm.FormKD);
    Regex reg = new Regex("[^a-zA-Z0-9 ]");
    return reg.Replace(nameFormatted, "");
}
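
To make the trade-off of this approach concrete, here is a small sketch that runs the same idea on a few inputs. The class name NormalizeDemo is made up for this example. FormKD also applies compatibility decompositions, so "ª" and "º" become "a" and "o", but letters that have no decomposition at all (such as "Æ" or "ø") are simply deleted by the regex, along with punctuation.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class NormalizeDemo
{
    public static string NormalizeString(string value)
    {
        // FormKD: compatibility decomposition, e.g. "è" becomes "e" + combining grave.
        string decomposed = value.Normalize(NormalizationForm.FormKD);

        // Remove everything that is not an ASCII letter, digit or space.
        // This strips the combining marks, but also punctuation and
        // any letter that did not decompose to an ASCII base letter.
        return Regex.Replace(decomposed, "[^a-zA-Z0-9 ]", "");
    }

    public static void Main()
    {
        Console.WriteLine(NormalizeString("Crème brûlée!")); // Creme brulee
        Console.WriteLine(NormalizeString("ª º"));            // a o
        Console.WriteLine(NormalizeString("Ærøskøbing"));     // rskbing
    }
}
```

If losing letters like "ø" is not acceptable, filtering on UnicodeCategory.NonSpacingMark instead of using the regex keeps them in place.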

Source: https://habr.com/ru/post/1702931/

