Soundex algorithm implementation in C++

Simply put, the Soundex algorithm converts a sequence of characters into a code. Words that produce the same Soundex code are said to sound alike.

  • The code is 4 characters wide.
  • The first character of the code is always the first character of the word.

Each letter of the alphabet belongs to a particular group (at least in this example; the code below sticks to this rule):

  • b, p, v, f = 1
  • c, g, j, k, q, s, x, z = 2
  • d, t = 3
  • l = 4
  • m, n = 5
  • r = 6
  • Every other letter of the alphabet belongs to group 0.

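Other notable rules:

  • All letters belonging to group 0 are ignored, UNLESS you have run out of letters in the provided word, in which case the rest of the code is filled with 0.
  • The same number cannot appear twice in a row, so the repeated character is ignored. The only exception is the padding with 0 described above.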

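For example, the word "Ray" produces the Soundex code R000 (R is the first character of the word, a belongs to group 0 and is ignored, y belongs to group 0 and is ignored, and no characters are left, so the remaining 3 positions are filled with 0).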

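I wrote a function that is passed 1) a 128-character array holding the word to convert to a Soundex code and 2) an empty 5-character array that receives the Soundex code (arrays are passed by reference, so the caller sees the result).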

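My problem is with the conversion itself. The logic described above does not work in my code, and I do not know why.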

// CREATE A SOUNDEX CODE
// * Parameters: the string of characters to convert to a code, and the array that receives the code, in that order.
void SoundsAlike(const char input[], char scode[])
{
    scode[0] = toupper(input[0]); // First character of the string is added to the code

    int matchCount = 1;
    int codeCount = 1;
    while((matchCount < strlen(input)) && (codeCount < 4))
    {
        if(((input[matchCount] == 'b') || (input[matchCount] == 'p') || (input[matchCount] == 'v') || (input[matchCount] == 'f')) && (scode[codeCount-1] != 1))
        {
            scode[codeCount] = 1;
            codeCount++;
        }
        else if(((input[matchCount] == 'c') || (input[matchCount] == 'g') || (input[matchCount] == 'j') || (input[matchCount] == 'k') || (input[matchCount] == 'q') || (input[matchCount] == 's') || (input[matchCount] == 'x') || (input[matchCount] == 'z')) && (scode[codeCount-1] != 2))
        {
            scode[codeCount] = 2;
            codeCount++;
        }
        else if(((input[matchCount] == 'd') || (input[matchCount] == 't')) && (scode[codeCount-1] != 3))
        {
            scode[codeCount] = 3;
            codeCount++;
        }
        else if((input[matchCount] == 'l') && (scode[codeCount-1] != 4))
        {
            scode[codeCount] = 4;
            codeCount++;
        }
        else if(((input[matchCount] == 'm') || (input[matchCount] == 'n')) && (scode[codeCount-1] != 5))
        {
            scode[codeCount] = 5;
            codeCount++;
        }
        else if((input[matchCount] == 'r') && (scode[codeCount-1] != 6))
        {
            scode[codeCount] = 6;
            codeCount++;
        }
        matchCount++;
    }

    while(codeCount < 4)
    {
        scode[codeCount] = 0;
        codeCount++;
    }
    scode[4] = '\0';

    cout << scode << endl;
}

As a side note, strlen is called on every pass through the while loop (and the if chain repeats the same lookup many times).


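You have:

scode[codeCount] = 1;

It should be:

scode[codeCount] = '1';

The first stores a char with the ASCII value 1, the second stores the character "1". The same applies to the other digits, to the comparisons against scode[codeCount-1], and to the 0 used for padding.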
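The difference between the number 1 and the character '1' can be checked directly. A minimal sketch, where groupDigit and padCode are hypothetical helpers and not names from the question:

```cpp
#include <string>

// Hypothetical helper: converts a Soundex group number (0..6) into the
// printable digit character that belongs in the code.
char groupDigit(int group) {
    // Digit characters are contiguous, so '0' + 1 is '1', '0' + 2 is '2', ...
    return static_cast<char>('0' + group);
}

// Hypothetical helper: pads a partial code with the character '0'
// (not the number 0) until it is 4 characters wide.
std::string padCode(std::string code) {
    while (code.size() < 4) {
        code += '0';
    }
    return code;
}
```

Here groupDigit(1) yields '1', which is a different value from the number 1, and padCode("R") yields "R000".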


In C++ you do not need to juggle char arrays for this. Use std::string. Something like:

void Soundex( const string & input, string & output ) {
   for ( string::size_type i = 0; i < input.length(); i++ ) {
       char c = input[i];        // get character from input
       if ( c == .... ) {        // if some decision
            output += 'X';       // add some character to output
       }
       else if ( ..... )  {       // more tests
       }
   }
}
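One way the skeleton above might be filled in is sketched below. It follows the simplified rules stated in the question (which differ from standard American Soundex), returns the code instead of using an output parameter, and uses the hypothetical names groupOf and soundex. The lookup table assumes the letters 'a' to 'z' are contiguous, as they are in ASCII.

```cpp
#include <cctype>
#include <string>

// Maps a letter to its group digit as listed in the question.
// Any other character is treated as group 0.
char groupOf(char ch) {
    // Index 0 is 'a', index 25 is 'z' (assumes contiguous letters, e.g. ASCII).
    static const std::string groups = "01230120022455012623010202";
    const char lower =
        static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    if (lower < 'a' || lower > 'z') {
        return '0';
    }
    return groups[lower - 'a'];
}

// Builds the 4-character code; returns an empty string for empty input.
std::string soundex(const std::string& input) {
    std::string code;
    if (input.empty()) {
        return code;
    }
    // The first character of the word is copied as-is (upper-cased).
    code += static_cast<char>(
        std::toupper(static_cast<unsigned char>(input[0])));
    for (std::string::size_type i = 1;
         i < input.size() && code.size() < 4; ++i) {
        const char digit = groupOf(input[i]);
        // Skip group 0 and skip a digit equal to the last one emitted.
        if (digit != '0' && digit != code.back()) {
            code += digit;
        }
    }
    code.append(4 - code.size(), '0');  // pad with the character '0'
    return code;
}
```

With these rules soundex("Ray") gives "R000" and soundex("Robert") gives "R163".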

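strlen() works on a null-terminated char array. Calling it in the loop condition means strlen() rescans the input on every pass. Also, the padding loop fills "scode" with 0, which is the same value as "\0", so the string ends at the first padded position.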
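How strlen(), "scode" and "\0" interact can be checked with a small sketch. The two function names are hypothetical and the buffers only mimic what the question's padding loop would produce for a word like "Ray":

```cpp
#include <cstddef>
#include <cstring>

// Length of the code as strlen (and cout) would see it when the unused
// positions are filled with the number 0.
std::size_t lengthWithNumericPadding() {
    char scode[5] = {'R', 0, 0, 0, '\0'};  // 0 is the same value as '\0'
    return std::strlen(scode);             // stops at the first 0
}

// The same buffer padded with the character '0' instead.
std::size_t lengthWithCharPadding() {
    char scode[5] = {'R', '0', '0', '0', '\0'};
    return std::strlen(scode);             // counts all four characters
}
```

The first function returns 1 and the second returns 4, which is why only the leading letter would be printed with numeric padding.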


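This is C, not C++. That aside, what exactly is the problem? Note that strlen is re-evaluated on every pass through the loop.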

Beyond that, a few suggestions:

  • . .
  • Define a variable, set it to input[matchCount], and use that. This makes the code more readable.
  • I would recommend replacing the if-else chain with a single switch.
  • Handle the default case (when none of the if-else or case branches match).
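Applied to the function from the question, these suggestions might look roughly like the sketch below. It keeps the char-array interface and the question's simplified rules. The name soundsAlikeSwitch is hypothetical, and lower-casing the input and hoisting strlen out of the loop are additions that the list above does not ask for.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// scode must have room for 5 chars: 4 code characters plus '\0'.
void soundsAlikeSwitch(const char input[], char scode[]) {
    const std::size_t length = std::strlen(input);  // computed once
    std::size_t codeCount = 0;
    if (length > 0) {
        scode[codeCount++] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(input[0])));
    }
    for (std::size_t i = 1; i < length && codeCount < 4; ++i) {
        // Local variable instead of repeating input[matchCount] everywhere.
        const char current = static_cast<char>(
            std::tolower(static_cast<unsigned char>(input[i])));
        char digit;
        switch (current) {
            case 'b': case 'p': case 'v': case 'f':
                digit = '1'; break;
            case 'c': case 'g': case 'j': case 'k':
            case 'q': case 's': case 'x': case 'z':
                digit = '2'; break;
            case 'd': case 't':
                digit = '3'; break;
            case 'l':
                digit = '4'; break;
            case 'm': case 'n':
                digit = '5'; break;
            case 'r':
                digit = '6'; break;
            default:
                digit = '0'; break;  // group 0: ignored below
        }
        // Skip group 0 and a digit equal to the last code character.
        if (digit != '0' && digit != scode[codeCount - 1]) {
            scode[codeCount++] = digit;
        }
    }
    while (codeCount < 4) {
        scode[codeCount++] = '0';  // pad with the character '0'
    }
    scode[4] = '\0';
}
```

Given char code[5], calling soundsAlikeSwitch("Robert", code) leaves "R163" in code, and "Ray" gives "R000".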

Source: https://habr.com/ru/post/1706883/

