If you often want to compare sequences, I would suggest you define a type that encapsulates an immutable sequence and implements ICollection together with IList<T> or ICollection<T> (you can define two types: one of which wraps IList<T> and implements ICollection and IList<T> , and one of them wraps IEnumerable<T> and implements ICollection and ICollection<T> ). This type should override Equals() and GetHashCode() and should have fields for cached accounts along with Int64 pairs and an Int32 field for common hash codes and, possibly, an Int64 sequence number field.
If the client code calls GetHashCode or if you need to list the elements to determine the number of elements in the wrapped collection, your code should list through the collection, calculate the hash values ββfor each element and use these calculations for a pair of 64-bit hash values ββfor the collection as a whole, and finally translate them into a 32-bit value suitable for using GetHashCode . Despite the fact that GetHashCode() requires only one 32-bit value, I would suggest calculating and saving more than for the reasons described below.
When performing an equality test, start with both objects wrapping the same collection. If so, they are equal. Otherwise, check if the collections contain the same number of elements and that the common hash codes match. If no condition applies, they are not equal. Otherwise, check the individual elements for each other. Please note that if hash codes have not yet been computed, it may be appropriate or not worth calculating (and checking) them before performing the equality test; some benchmarking may reveal if it is useful or harmful. If the collection ultimately gets hashed, it is better to do it sooner rather than later. On the other hand, if checking for equality on a collection of a million elements will consistently say βnot equalβ, just by looking at the first element, and nothing else will need a hash value, calculating this would be a waste.
If two objects are found equal, it may be appropriate to replace the new shell wrapped in the object with a collection wrapped in an older object and make the new serial number of the object coincide with the older object. Doing this increases the likelihood that if the wrappers are compared again, they can be considered equal without checking any items. Note that there are various other methods that can be used to facilitate future equality exams that involve various memory compromises; unfortunately, the approach that will have the best typical behavior has very bad worst behavior. Also note that although any shell that caches hash values ββfails if wrapped collections are modified externally, tracking the causes of such failures can be difficult if the above referenced replacements are made.
If you compare many unequal collections, the possibility of early exit using hash codes can be a serious victory in performance. In computing hash codes, I suggest you use a couple of "independent" methods for computing 64-bit hash codes. The reason for this is that depending on how the hash codes of the individual elements are computed, the probability of a system hash collision using a single hashing method may be unacceptably high. The cost of computing your own hash values ββcan be small compared to the cost of getting the hash values ββof your components, so calculating two or three independent hash functions will be a cheap way to protect against system hash collisions.