How to split a string with special characters into an NSMutableArray

I am trying to split a string with Danish characters into an NSMutableArray, but something is not working. :(

My code is:

    NSString *danishString = @"æøå";
    NSMutableArray *characters = [[NSMutableArray alloc] initWithCapacity:[danishString length]];
    for (int i = 0; i < [danishString length]; i++) {
        NSString *ichar = [NSString stringWithFormat:@"%c", [danishString characterAtIndex:i]];
        [characters addObject:ichar];
    }

If I NSLog danishString, it works (prints æøå).

But if I NSLog characters (the array), I get very strange characters. What is wrong?

/ Morten

+4
4 answers

First of all, your code is incorrect. characterAtIndex: returns a unichar, so you should use @"%C" (uppercase) as the format specifier.
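As a minimal sketch of the format specifier point: the helper name StringFromUnichar below is hypothetical and only illustrates passing a unichar to @"%C".

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): builds a
// one-unichar NSString using the uppercase %C specifier.
static NSString *StringFromUnichar(unichar c) {
    return [NSString stringWithFormat:@"%C", c];
}

int main(void) {
    @autoreleasepool {
        NSString *danishString = @"æøå";
        // characterAtIndex: returns a unichar (a UTF-16 code unit).
        unichar first = [danishString characterAtIndex:0];
        NSLog(@"%@", StringFromUnichar(first));
    }
    return 0;
}
```

This only holds for characters that fit in a single unichar, which is the limitation the next paragraph describes.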

Even with the correct format specifier, your code is unsafe and, strictly speaking, still incorrect, because not every Unicode character can be represented by a single unichar. You should always process Unicode strings substring by substring:

It is common to think of a string as a sequence of characters, but when working with NSString objects, or with Unicode strings in general, in most cases it is better to deal with substrings rather than with individual characters. The reason is that what the user perceives as a single character in text may in many cases be represented by several characters in the string.
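To make that concrete, here is a small sketch comparing the number of unichars with the number of user-perceived characters. The helper ComposedCharacterCount and the two sample strings are assumptions chosen for illustration; they do not come from the original answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts user-perceived characters by enumerating
// composed character sequences instead of individual unichars.
static NSUInteger ComposedCharacterCount(NSString *string) {
    __block NSUInteger count = 0;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        count++;
    }];
    return count;
}

int main(void) {
    @autoreleasepool {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT: two unichars, one visible character.
        NSString *decomposed = @"e\u0301";
        // A code point outside the Basic Multilingual Plane: stored as a surrogate pair.
        NSString *emoji = @"\U0001F600";

        for (NSString *s in @[decomposed, emoji]) {
            NSLog(@"length = %lu, composed characters = %lu",
                  (unsigned long)s.length,
                  (unsigned long)ComposedCharacterCount(s));
        }
    }
    return 0;
}
```

In both cases length reports 2 while the composed count is 1, which is why indexing by unichar can cut a visible character in half.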

You should read the String Programming Guide.

Finally, the correct code for you:

    NSString *danishString = @"æøå";
    NSMutableArray *characters = [[NSMutableArray alloc] initWithCapacity:[danishString length]];
    [danishString enumerateSubstringsInRange:NSMakeRange(0, danishString.length)
                                     options:NSStringEnumerationByComposedCharacterSequences
                                  usingBlock:^(NSString *substring, NSRange substringRange,
                                               NSRange enclosingRange, BOOL *stop) {
        [characters addObject:substring];
    }];

If NSLog(@"%@", characters); shows "strange characters" of the form "\Uxxxx", that is expected. It is the default behavior of NSArray's description method, which escapes non-ASCII characters. You can print the strings one by one if you want to see the "regular characters":

 for (NSString *c in characters) { NSLog(@"%@", c); } 
+2

In your example, characterAtIndex: returns a unichar, not an NSString. If you want an NSString, take a substring instead:

    NSString *danishString = @"æøå";
    NSMutableArray *characters = [[NSMutableArray alloc] initWithCapacity:[danishString length]];
    for (int i = 0; i < [danishString length]; i++) {
        NSRange r = NSMakeRange(i, 1);
        NSString *ichar = [danishString substringWithRange:r];
        [characters addObject:ichar];
    }
0

You can do something like the following, which should be fine with Danish characters, but it will break if the string contains decomposed characters. I suggest reading the String Programming Guide for more information.

    NSString *danishString = @"æøå";
    NSMutableArray *characters = [NSMutableArray array];
    for (int i = 0; i < [danishString length]; i++) {
        NSString *subchar = [danishString substringWithRange:NSMakeRange(i, 1)];
        if (subchar) [characters addObject:subchar];
    }

This splits the string into an array of individual characters, assuming that every character is a single precomposed code point that fits in one unichar.
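If you want to keep an index-based loop but avoid that assumption, one possible variant is to step by composed character sequence rather than by a fixed length of 1. This is only a sketch: the function name SplitIntoComposedCharacters is hypothetical, while rangeOfComposedCharacterSequenceAtIndex: is a standard NSString method.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits a string into user-perceived characters by
// asking NSString for the full composed sequence at each index.
static NSArray<NSString *> *SplitIntoComposedCharacters(NSString *string) {
    NSMutableArray<NSString *> *characters = [NSMutableArray array];
    NSUInteger index = 0;
    while (index < string.length) {
        // The returned range covers every unichar belonging to the character at index.
        NSRange range = [string rangeOfComposedCharacterSequenceAtIndex:index];
        [characters addObject:[string substringWithRange:range]];
        index = NSMaxRange(range); // jump past the whole sequence
    }
    return characters;
}

int main(void) {
    @autoreleasepool {
        for (NSString *c in SplitIntoComposedCharacters(@"æøå")) {
            NSLog(@"%@", c);
        }
    }
    return 0;
}
```

For a string such as @"e\u0301x" this yields two elements instead of three, because the combining accent stays attached to its base letter.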

0

It prints the Unicode escapes for the characters. In any case, you can use Unicode escapes (with \u) anywhere.

-1

Source: https://habr.com/ru/post/1389425/

