Check strings for identical characters in Objective-C

I have an array of strings from which I would like to extract only those that have unique character sets. (For example, "asdf" and "fdsa" are considered redundant). This is the method I am currently using:

    NSMutableArray *uniqueCharSets = [[NSMutableArray alloc] init];
    NSMutableArray *uniqueStrings = [[NSMutableArray alloc] init];

    for (NSString *_string in unique) {
        NSCharacterSet *_charSet = [NSCharacterSet characterSetWithCharactersInString:_string];
        // Linear scan of every character set seen so far
        if (![uniqueCharSets containsObject:_charSet]) {
            [uniqueStrings addObject:_string];
            [uniqueCharSets addObject:_charSet];
        }
    }

This seems to work, but it is very slow and resource intensive. Can anyone think of a better way to do this?

3 answers

I just put together a quick example of how I would approach this, and it turns out to be weirder than you might expect. First, NSCharacterSet does not seem to compare contents when testing for equality; it appears to compare only pointer values. If so, your example will NOT work properly.

My approach is to use NSSet to handle hashing this data for us.

    #import <Foundation/Foundation.h>

    // Pairs a string with the bitmap of its character set.
    @interface StringWrapper : NSObject
    @property (nonatomic, copy) NSString *string;
    @property (nonatomic, copy) NSData *charSetBitmap;
    - (id)initWithString:(NSString *)aString;
    @end

    @implementation StringWrapper
    @synthesize string, charSetBitmap;

    - (id)initWithString:(NSString *)aString
    {
        if ((self = [super init])) {
            self.string = aString;
        }
        return self;
    }

    // Custom setter: cache the bitmap whenever the string changes.
    - (void)setString:(NSString *)aString
    {
        string = [aString copy];
        self.charSetBitmap = [[NSCharacterSet characterSetWithCharactersInString:aString] bitmapRepresentation];
    }

    // Two wrappers are equal when their character-set bitmaps are equal.
    - (BOOL)isEqual:(id)object
    {
        return [self.charSetBitmap isEqual:[object charSetBitmap]];
    }

    - (NSUInteger)hash
    {
        return [self.charSetBitmap hash];
    }
    @end

    int main (int argc, const char * argv[])
    {
        @autoreleasepool {
            NSMutableSet *stringWrappers = [[NSMutableSet alloc] init];
            NSArray *strings = [NSArray arrayWithObjects:@"abc", @"aaabcccc", @"awea", @"awer",
                                                         @"abcde", @"ehra", @"QWEQ", @"werawe", nil];

            // The set drops wrappers whose bitmaps are already present.
            for (NSString *str in strings)
                [stringWrappers addObject:[[StringWrapper alloc] initWithString:str]];

            // -valueForKey: on an NSSet returns an NSSet, so convert it to an array.
            NSArray *uniqueStrings = [[stringWrappers valueForKey:@"string"] allObjects];
            NSLog(@"%@", uniqueStrings);
        }
        return 0;
    }

The code is pretty simple. We create a container object that caches the bitmap representation of each string's character set. We use the bitmap representation because it is an NSData, and NSData implements isEqual: (and hash) by comparing contents.
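If the wrapper class feels heavy, the same idea can probably be expressed by storing the bitmaps directly in a set. The following is only a sketch: the function name StringsWithUniqueCharacterSets is made up for illustration, and it assumes that equal character sets always produce byte-identical bitmapRepresentation data.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): keeps the first
// string seen for each distinct character set, preserving input order.
static NSArray *StringsWithUniqueCharacterSets(NSArray *strings)
{
    NSMutableSet *seenBitmaps = [NSMutableSet set];   // NSData bitmaps seen so far
    NSMutableArray *result = [NSMutableArray array];

    for (NSString *string in strings) {
        NSCharacterSet *charSet = [NSCharacterSet characterSetWithCharactersInString:string];
        // Assumption: equal sets yield equal NSData, which hashes by content.
        NSData *bitmap = [charSet bitmapRepresentation];
        if (![seenBitmaps containsObject:bitmap]) {
            [seenBitmaps addObject:bitmap];
            [result addObject:string];
        }
    }
    return result;
}
```

With the input "asdf", "fdsa", "abc", the second string should be dropped because its bitmap matches the first. Unlike the NSSet of wrappers, this variant keeps the original order of the strings.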

  • Using an NSDictionary, map the lexicographically sorted form of each string to an NSArray of the input strings that share it (for example, adfs => [afsd, asdf, ...])
  • Go through the dictionary and print the keys (or their values) whose value is a single-element array
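
The two steps above could be sketched roughly as follows. All function names here are hypothetical, and the code splits strings by UTF-16 code unit, which is only safe for simple text such as ASCII. Note that a sorted-character key treats "abc" and "aaabcccc" as different, unlike the character-set comparison in the question; dropping repeated characters from the key would make the two agree.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the characters of a string in sorted order,
// e.g. "fdsa" becomes "adfs".
static NSString *SortedCharacters(NSString *string)
{
    NSUInteger length = [string length];
    NSMutableArray *characters = [NSMutableArray arrayWithCapacity:length];
    for (NSUInteger i = 0; i < length; i++) {
        [characters addObject:[string substringWithRange:NSMakeRange(i, 1)]];
    }
    [characters sortUsingSelector:@selector(compare:)];
    return [characters componentsJoinedByString:@""];
}

// Step 1: map each sorted key to the array of input strings that share it.
static NSDictionary *GroupBySortedCharacters(NSArray *strings)
{
    NSMutableDictionary *groups = [NSMutableDictionary dictionary];
    for (NSString *string in strings) {
        NSString *key = SortedCharacters(string);
        NSMutableArray *group = [groups objectForKey:key];
        if (group == nil) {
            group = [NSMutableArray array];
            [groups setObject:group forKey:key];
        }
        [group addObject:string];
    }
    return groups;
}

// Step 2: keep only the strings whose group has exactly one member.
static NSArray *StringsWithSingletonGroups(NSArray *strings)
{
    NSDictionary *groups = GroupBySortedCharacters(strings);
    NSMutableArray *result = [NSMutableArray array];
    for (NSString *string in strings) {   // walk the input to preserve its order
        NSArray *group = [groups objectForKey:SortedCharacters(string)];
        if ([group count] == 1) {
            [result addObject:string];
        }
    }
    return result;
}
```

Building the dictionary takes one pass over the input, and each lookup is a hash lookup, so the linear scan done by containsObject: on an array is avoided.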

The only thing that comes to mind is to avoid containsObject:. Since an NSMutableArray is not sorted (in general), we can assume that containsObject: simply iterates through the array from the beginning until it finds the object. That makes each lookup O(n) (n comparisons in the worst case).

A better solution might be to maintain a sorted array and look elements up with a binary search, which gives O(log n) lookups.
Of course, you must keep the array sorted as you go (inserting in place is much more efficient than appending and re-sorting), so use the insertObject:atIndex: method to put each element at its sorted position.
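
A rough sketch of this idea is below. It assumes each character set can be reduced to a comparable string key (its distinct characters, sorted); the helpers CharacterSetKey and UniqueStringsUsingSortedKeys are hypothetical names. The binary search uses NSArray's indexOfObject:inSortedRange:options:usingComparator:, which can also report the index at which a missing element should be inserted.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: canonical key for a string's character set,
// i.e. its distinct characters in sorted order ("aaabcccc" becomes "abc").
// Splits by UTF-16 code unit, so it is only safe for simple text.
static NSString *CharacterSetKey(NSString *string)
{
    NSMutableSet *unique = [NSMutableSet set];
    for (NSUInteger i = 0; i < [string length]; i++) {
        [unique addObject:[string substringWithRange:NSMakeRange(i, 1)]];
    }
    NSArray *sorted = [[unique allObjects] sortedArrayUsingSelector:@selector(compare:)];
    return [sorted componentsJoinedByString:@""];
}

// Keeps the first string seen for each distinct key, using a sorted
// array of keys and a binary search instead of containsObject:.
static NSArray *UniqueStringsUsingSortedKeys(NSArray *strings)
{
    NSMutableArray *sortedKeys = [NSMutableArray array];   // always in ascending order
    NSMutableArray *result = [NSMutableArray array];
    NSComparator comparator = ^NSComparisonResult(id a, id b) {
        return [(NSString *)a compare:(NSString *)b];
    };

    for (NSString *string in strings) {
        NSString *key = CharacterSetKey(string);
        // Returns the index of the equal key if present, otherwise the
        // index where the key must be inserted to keep the array sorted.
        NSUInteger index = [sortedKeys indexOfObject:key
                                       inSortedRange:NSMakeRange(0, [sortedKeys count])
                                             options:(NSBinarySearchingInsertionIndex |
                                                      NSBinarySearchingFirstEqual)
                                     usingComparator:comparator];
        BOOL alreadySeen = index < [sortedKeys count] &&
            [[sortedKeys objectAtIndex:index] compare:key] == NSOrderedSame;
        if (!alreadySeen) {
            [sortedKeys insertObject:key atIndex:index];
            [result addObject:string];
        }
    }
    return result;
}
```

The search itself takes O(log n) comparisons, but inserting into the middle of an array may still move elements, so a hash-based container such as NSMutableSet is likely to be the simpler choice when ordering is not needed.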


Source: https://habr.com/ru/post/904855/

