TEqualityComparer <T> may not work for records due to alignment

We recently had a problem with an instance of TDictionary<T> , which could not correctly find elements that were already included in the dictionary. The problem arose only in 64Bit builds. I was able to break the problem down to this code:

 var i1, i2: TPair<Int64,Integer>; begin FillMemory(@i1, sizeof(i1), $00); FillMemory(@i2, sizeof(i1), $01); i1.Key := 2; i1.Value := -1; i2.Key := i1.Key; i2.Value := i1.Value; Assert(TEqualityComparer<TPair<Int64,Integer>>.Default.Equals(i1, i2)); Assert(TEqualityComparer<TPair<Int64,Integer>>.Default.GetHashCode(i1) = TEqualityComparer<TPair<Int64,Integer>>.Default.GetHashCode(i2)); end; 

Claims do not work in Win64 builds. The problem seems to be related to write alignment: the size of this TPair is 16 bytes, but only 12 bytes are filled with data. However, TEqualityComparer takes into account all 16 bytes. Thus, 2 record values ​​can be considered as not equal, although all members are equal, only because of other previous memory contents

Can this be considered a design error or behavior? In any case, this is a trap. What is the best solution for such situations?

You can use NativeInt instead of Integer as a workaround, but this type of Integer not under our control.

+6
source share
2 answers

I do not think this is a mistake. Design Behavior. Without checking, or perhaps some compile-time support for understanding these types, it is difficult to write a universal resolver for arbitrary structured types.

The default comparison tool can only be safely used for non-populated types that contain only some simple value types that can be compared using a naive binary comparison. For example, there are no floating point types because their comparison operators are more complex. Think of NaNs, negative zero, etc.

I think the only reliable way to handle this is to write your own comparative equality coefficient. Others suggested that all instances of records be initialized by default, but this imposes a significant burden on consumers of these types and risks unclear and difficult to identify flaws if some code forgets to initialize by default.

I would use TEqualityComparer<T>.Construct to create suitable equality mappings. This requires the least amount of template. You provide two anonymous methods: an equality function and a hash function, and Construct returns a newly minted comparator to you.

You can wrap this in a general class as follows:

 uses System.Generics.Defaults, System.Generics.Collections; {$IFOPT Q+} {$DEFINE OverflowChecksEnabled} {$Q-} {$ENDIF} function CombinedHash(const Values: array of Integer): Integer; var Value: Integer; begin Result := 17; for Value in Values do begin Result := Result*37 + Value; end; end; {$IFDEF OverflowChecksEnabled} {$Q+} {$ENDIF} type TPairComparer = class abstract public class function Construct<TKey, TValue>( const EqualityComparison: TEqualityComparison<TPair<TKey, TValue>>; const Hasher: THasher<TPair<TKey, TValue>> ): IEqualityComparer<TPair<TKey, TValue>>; overload; static; class function Construct<TKey, TValue>: IEqualityComparer<TPair<TKey, TValue>>; overload; static; end; class function TPairComparer.Construct<TKey, TValue>( const EqualityComparison: TEqualityComparison<TPair<TKey, TValue>>; const Hasher: THasher<TPair<TKey, TValue>> ): IEqualityComparer<TPair<TKey, TValue>>; begin Result := TEqualityComparer<TPair<TKey, TValue>>.Construct( EqualityComparison, Hasher ); end; class function TPairComparer.Construct<TKey, TValue>: IEqualityComparer<TPair<TKey, TValue>>; begin Result := Construct<TKey, TValue>( function(const Left, Right: TPair<TKey, TValue>): Boolean begin Result := TEqualityComparer<TKey>.Default.Equals(Left.Key, Right.Key) and TEqualityComparer<TValue>.Default.Equals(Left.Value, Right.Value); end, function(const Value: TPair<TKey, TValue>): Integer begin Result := CombinedHash([ TEqualityComparer<TKey>.Default.GetHashCode(Value.Key), TEqualityComparer<TValue>.Default.GetHashCode(Value.Value) ]); end ) end; 

I provided two overloads. If the default mappers for your two types are sufficient, you can use parameterless overloading. Otherwise, you can provide two anonymous methods for types.

For your type you will get a comparative example:

 TPairComparer.Construct<Int64, Integer> 

Both of these simple types have comparative comparisons by default that you can use. Consequently, you can use the carefree Construct overload.

+5
source

The default comparison for records only works for records of pure value types without padding. Based on this, as a rule, this is not a good idea. For any entries that require accurate comparisons of hashing and equality, you really need to write your own comparisons.

As already noted, initializing all your records with Default() also an option, but this approach is tedious and error prone - it is easy to forget to initialize a record, and it is difficult to trace such an omission when this happens. This approach is also effective only for eliminating padding errors, while a custom resolver can also handle reference types, etc.

This, for example, demonstrates a working solution to the problem:

 program Project1; uses SysUtils, Windows, StrUtils, Generics.Collections, Generics.Defaults, System.Hash; type TPairComparer<TKey, TValue> = class(TEqualityComparer<TPair<TKey, TValue>>) public function Equals(const Left, Right: TPair<TKey, TValue>): Boolean; override; function GetHashCode(const Value: TPair<TKey, TValue>): Integer; override; end; TInt64IntDict<TValue> = class(TDictionary<TPair<Int64, Integer>, TValue>) public constructor Create; end; function TPairComparer<TKey, TValue>.Equals(const Left: TPair<TKey, TValue>; const Right: TPair<TKey, TValue>) : boolean; begin result := TEqualityComparer<TKey>.Default.Equals(Left.Key, Right.Key) and TEqualityComparer<TValue>.Default.Equals(Left.Value, Right.Value); end; {$IFOPT Q+} {$DEFINE OVERFLOW_ON} {$Q-} {$ELSE} {$UNDEF OVERFLOW_ON} {$ENDIF} function TPairComparer<TKey, TValue>.GetHashCode(const Value: TPair<TKey, TValue>) : integer; begin result := THashBobJenkins.GetHashValue(Value.Key, SizeOf(Value.Key), 23 * 31); result := THashBobJenkins.GetHashValue(Value.Value, SizeOf(Value.Value), result * 31); end; {$IFDEF OVERFLOW_ON} {$Q+} {$UNDEF OVERFLOW_ON} {$ENDIF} constructor TInt64IntDict<TValue>.Create; begin inherited Create(0, TPairComparer<Int64, Integer>.Create); end; var i1, i2: TPair<Int64, Integer>; LI64c : TPairComparer<Int64, Integer>; LDict : TInt64IntDict<double>; begin FillMemory(@i1, SizeOf(i1), $00); FillMemory(@i2, SizeOf(i1), $01); i1.Key := 2; i1.Value := -1; i2.Key := i1.Key; i2.Value := i1.Value; WriteLn(Format('i1 key = %d, i1 value = %d', [i1.Key, i1.Value])); WriteLn(Format('i2 key = %d, i2 value = %d', [i2.Key, i2.Value])); WriteLn; WriteLn('Using Default comparer'); if TEqualityComparer<TPair<Int64, Integer>>.Default.Equals(i1, i2) then WriteLn('i1 equals i2') else WriteLn('i1 not equals i2'); if TEqualityComparer<TPair<Int64, Integer>>.Default.GetHashCode(i1) = TEqualityComparer<TPair<Int64, Integer>>.Default.GetHashCode(i2) then WriteLn('i1, i2 - hashes match') else WriteLn('i1, i2 - hashes do not match'); WriteLn; WriteLn('Using custom comparer'); LI64c := TPairComparer<Int64, Integer>.Create; if LI64c.Equals(i1, i2) then WriteLn('i1 equals i2') else WriteLn('i1 not equals i2'); if LI64c.GetHashCode(i1) = LI64c.GetHashCode(i2) then WriteLn('i1, i2 - hashes match') else WriteLn('i1, i2 - hashes do not match'); WriteLn; LDict := TInt64IntDict<double>.Create; LDict.Add(i1, 1.23); if LDict.ContainsKey(i2) then WriteLn('Dictionary already contains key') else WriteLn('Dictionary does not contain key'); ReadLn; end. 

It leads to exit

i1 key = 2, i1 value = -1
i2 key = 2, i2 value = -1

Using default comparison - i1 is not equal to i2
i1, i2 - hashes do not match

Using custom comparator i1 equals i2
i1, i2 - hashes correspond

Dictionary already contains key

However, as David's answer shows, using a delegated comparator will result in less overhead and should be preferred in practice.

+3
source

Source: https://habr.com/ru/post/1266356/


All Articles