If you just need the number of times that objects arise, then Kurt's answer is pretty good. If you really need a snippet, this should work:
NSArray *original = @[@1, @3, @5, @7, @9, @8, @5, @3, @2, @4, @3, @6]; NSMutableArray *chunked = [NSMutableArray array]; NSNumber *current = nil; for (NSNumber *number in [original sortedArrayUsingSelector:@selector(compare:)]) { if (![number isEqual:current]) { [chunked addObject:[NSMutableArray arrayWithObject:number]]; current = number; } else { [[chunked lastObject] addObject:number]; } } NSLog(@"%@", chunked);
If I missed something, it is not computationally difficult, but should be a little more efficient than Tim's original method (no dictionaries, sets or hashing needed). There is one view involved (with a quick enumeration, the container - the part after in
- is evaluated only once), and you iterate over the sorted array once. NSMutableArray
O(1)
insertion at both ends, so the worst case should be O(n)
due to iteration.
Actually: in a further review, the following code works much faster for large sets of numbers. It is a bit more confusing, but works more efficiently.
NSArray *original = @[@1, @3, @5, @7, @9, @8, @5, @3, @2, @4, @3, @6]; NSMutableArray *chunked = [NSMutableArray array]; NSCountedSet *countedSet = [[NSCountedSet alloc] initWithArray:original]; for (NSNumber *number in countedSet) { NSMutableArray *chunk = [NSMutableArray array]; NSUInteger count = [set countForObject:number]; for (NSUInteger i = 0; i < count; i++) { [chunk addObject:number]; } [chunked addObject:chunk]; } [chunked sortUsingComparator:^(NSArray *a1, NSArray *a2) { return [a1[0] compare:a2[0]]; }]; NSLog(@"%@", chunked);
With 10000000
random numbers, the first implementation takes about 12.27
seconds, and the second takes 0.92
seconds. Hover over your mouse.
The second method has the disadvantage that the pieces that it creates are all duplicates of the same object; if this presents problems for you (in the general case, it may be problematic for memory management, or if your objects can be considered "equal" in a sense, even if all their properties are not quite right), use the first method. Otherwise, it will be better for you.
Additional clarification: upon further thought, I knew that during the difference between the two methods there was something suspicious, and I was right. If you have many variations in your dataset (with very few duplicate numbers), method 2 will work far, much slower; changing the number does not greatly affect Method 1. For many repeated numbers, Method 2 will be pretty fast, but if your data set is completely random, you would be better off using Method 1.
Here is the code I use to test these two: http://pastebin.com/9syEyiyM