Is this a bug I should report to Apple, or is this expected behavior?

Question

When using CoreData, the following multi-column predicate is very slow: it takes almost 2 seconds for 26,000 records.

Note that both columns are indexed, and I am purposely writing the query with >= and < instead of BEGINSWITH to make it fast:

    NSPredicate *predicate = [NSPredicate predicateWithFormat:
        @"airportNameUppercase >= %@ AND airportNameUppercase < %@ "
        @"OR cityUppercase >= %@ AND cityUppercase < %@",
        upperText, upperTextIncremented,
        upperText, upperTextIncremented];

However, if I run two separate fetch requests, one for each column, and then merge the results, each fetch request takes only 1-2 hundredths of a second, and merging the (sorted) lists takes about 1/10th of a second.

Is this a bug in how CoreData handles multiple indexes, or is this expected behavior? Below is my complete, optimized code, which runs very fast:

    NSFetchRequest *fetchRequest = [[[NSFetchRequest alloc] init] autorelease];
    [fetchRequest setFetchBatchSize:15];

    // looking up a list of Airports
    NSEntityDescription *entity = [NSEntityDescription entityForName:@"Airport"
                                              inManagedObjectContext:context];
    [fetchRequest setEntity:entity];

    // sort by uppercase name
    NSSortDescriptor *nameSortDescriptor = [[[NSSortDescriptor alloc]
        initWithKey:@"airportNameUppercase" ascending:YES selector:@selector(compare:)] autorelease];
    NSArray *sortDescriptors = [[[NSArray alloc] initWithObjects:nameSortDescriptor, nil] autorelease];
    [fetchRequest setSortDescriptors:sortDescriptors];

    // use >= and < to do a prefix search that ignores locale and unicode,
    // because it is very fast (assumes text is non-empty)
    NSString *upperText = [text uppercaseString];
    NSUInteger lastIndex = [upperText length] - 1;
    unichar c = [upperText characterAtIndex:lastIndex];
    c++;
    NSString *modName = [[upperText substringToIndex:lastIndex]
        stringByAppendingString:[NSString stringWithCharacters:&c length:1]];

    // for the first fetch, we look up names and codes
    // we'll merge these results with the next fetch for city name
    // because looking up by name and city at the same time is slow
    NSPredicate *predicate = [NSPredicate predicateWithFormat:
        @"airportNameUppercase >= %@ AND airportNameUppercase < %@ "
        @"OR iata == %@ "
        @"OR icao == %@",
        upperText, modName, upperText, upperText];
    [fetchRequest setPredicate:predicate];
    NSArray *nameArray = [context executeFetchRequest:fetchRequest error:nil];

    // now that we looked up all airports with names beginning with the prefix
    // look up airports with cities beginning with the prefix, so we can merge the lists
    predicate = [NSPredicate predicateWithFormat:
        @"cityUppercase >= %@ AND cityUppercase < %@", upperText, modName];
    [fetchRequest setPredicate:predicate];
    NSArray *cityArray = [context executeFetchRequest:fetchRequest error:nil];

    // now we merge the arrays (both are sorted by airportNameUppercase)
    NSMutableArray *combinedArray = [NSMutableArray arrayWithCapacity:[cityArray count] + [nameArray count]];
    NSUInteger cityIndex = 0;
    NSUInteger nameIndex = 0;
    while (cityIndex < [cityArray count] || nameIndex < [nameArray count]) {
        if (cityIndex >= [cityArray count]) {
            [combinedArray addObject:[nameArray objectAtIndex:nameIndex]];
            nameIndex++;
        } else if (nameIndex >= [nameArray count]) {
            [combinedArray addObject:[cityArray objectAtIndex:cityIndex]];
            cityIndex++;
        } else {
            // compare the strings themselves, not the object pointers
            NSString *cityKey = [[cityArray objectAtIndex:cityIndex] airportNameUppercase];
            NSString *nameKey = [[nameArray objectAtIndex:nameIndex] airportNameUppercase];
            NSComparisonResult order = [cityKey compare:nameKey];
            if (order == NSOrderedSame) {
                // same airport in both lists: add it once
                [combinedArray addObject:[cityArray objectAtIndex:cityIndex]];
                cityIndex++;
                nameIndex++;
            } else if (order == NSOrderedAscending) {
                [combinedArray addObject:[cityArray objectAtIndex:cityIndex]];
                cityIndex++;
            } else {
                [combinedArray addObject:[nameArray objectAtIndex:nameIndex]];
                nameIndex++;
            }
        }
    }
    self.airportList = combinedArray;

objective-c cocoa-touch core-data

Andrew Johnson Feb 24 '11 at 23:29

2 answers

Ryan · Answer 1 · 2011-02-24T23:49:20+0000

CoreData has no affordance for creating or using multi-column indexes. This means that when it executes the query corresponding to your multi-property predicate, CoreData can only use one index to make the selection. It uses the index for one of the property tests, but SQLite then cannot use an index to gather matches for the second property, and therefore has to do all of that work in memory instead of using its on-disk index structure.

That second phase of the selection ends up being slow because it has to load all of the candidate results from disk into memory, then compare them and drop the non-matching ones in memory. So you end up doing potentially more I/O than you would with a multi-column index.

That is why, if each column of your predicate disqualifies many potential results, you will see much faster results by doing what you are doing now (two separate fetches merged in memory) than you would with a single fetch.

To answer your question: this behavior is not unexpected by Apple; it is simply the effect of a design decision not to support multi-column indexes in CoreData. But you should file a bug at http://radar.apple.com requesting support for multi-column indexes if you would like to see that feature in the future.

In the meantime, if you really want to get maximum database performance on iOS, you can use SQLite directly instead of CoreData.

Ben · Answer 2 · 2011-02-25T05:30:39+0000

If in doubt, you should file a bug.

There is currently no API for instructing Core Data to create a compound index. If a compound index existed, it would be used without issue.

Non-indexed columns are not processed entirely in memory. They result in a table scan, which is not the same thing as loading the entire file (well, unless your file has only one table). Scanning a table row by row tends to be very slow.

SQLite itself is limited in the number of indexes it will use per query. Basically just one, give or take some circumstances.

You should use the [n] flag for this query to do a binary search against normalized text. There is a sample project on ADC called DerivedProperty. It shows how to normalize text so that you can use binary collations instead of the default ICU integration, which performs full localized, Unicode-aware text comparisons.

There's a longer discussion of quick string searches in Core Data at https://devforums.apple.com/message/363871
