Unique entries in a HashSet<List<T>>, where the list can have null entries
I have a List<MyElement> (created as new ArrayList<MyElement>()), where:

    class MyElement {
        private Object[] values;
        //...
    }

I need to find all the unique entries in this list. I would use a HashSet, but the problem is that values can contain null, and null should be treated as equal to any other value. For example,

    Object[] o1 = new Object[]{1, null, "s2"};
    Object[] o2 = new Object[]{1, 2, "s2"};

should be considered the same record (i.e. not unique), and only one of them should be stored in the HashSet. Is there a way to override the right methods so that a HashSet behaves this way?
Do you really need O(1) time for add() and contains()? I don't see a good way to write a hashCode() function for your MyElement class that matches your requirements.
A Comparator (or making MyElement Comparable), however, can do the trick, and then you can use a TreeSet to find the unique elements of your list.
Here is a first attempt (you should not use it as is; it probably will not work in every case).
    import java.util.Comparator;

    class MyElementComparator implements Comparator<MyElement> {
        @Override
        public int compare(MyElement e, MyElement f) {
            // Arrays of different lengths are never equal.
            int sizeCmp = Integer.compare(e.values.length, f.values.length);
            if (sizeCmp != 0)
                return sizeCmp;
            // Start comparing element by element.
            for (int i = 0; i < e.values.length; i++) {
                Object eo = e.values[i];
                Object fo = f.values[i];
                // Null is a wildcard.
                if (eo == null || fo == null)
                    continue;
                // If the objects are the same, then continue too.
                if (eo == fo || eo.equals(fo))
                    continue;
                // Otherwise, order by hash code (or any other valid means).
                // Integer.compare avoids the overflow that subtraction can cause.
                // Caveat: unequal objects with the same hash code compare as equal here.
                return Integer.compare(eo.hashCode(), fo.hashCode());
            }
            // All elements were equal or skipped, so the objects are equal.
            return 0;
        }
    }

Quick tests seem to indicate that it works:
    MyElement a = new MyElement(1, null, "s2");
    MyElement b = new MyElement(1, 2, "s2");
    MyElement c = new MyElement(null, "s", 3);
    TreeSet<MyElement> set = new TreeSet<MyElement>(new MyElementComparator());
    set.add(a);
    set.add(b);
    set.add(c);
    System.out.println(set.size()); // 2

But this breaks down if you add an element that is equal to two other elements which are not equal to each other. For example, {1} and {2} are different, but if you add {null}, then the set should be reduced to {null}, and this will not happen.
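To make the {1}, {2}, {null} problem concrete, here is a small sketch. It is not the code from the answer above: WildcardOrderDemo, WILDCARD and sizeAfterAdding are hypothetical names, and the comparator works on raw Object[] rows instead of MyElement to keep the example self-contained. It suggests that the result depends on insertion order, which is a sign that the comparator does not define a consistent ordering.

```java
import java.util.Comparator;
import java.util.TreeSet;

class WildcardOrderDemo {
    // Hypothetical stand-in for MyElementComparator, working on raw arrays:
    // null matches anything, other values are ordered by hash code.
    static final Comparator<Object[]> WILDCARD = (e, f) -> {
        if (e.length != f.length) return Integer.compare(e.length, f.length);
        for (int i = 0; i < e.length; i++) {
            if (e[i] == null || f[i] == null || e[i].equals(f[i])) continue;
            return Integer.compare(e[i].hashCode(), f[i].hashCode());
        }
        return 0;
    };

    // Adds the rows in the given order and reports how many the set kept.
    static int sizeAfterAdding(Object[]... rows) {
        TreeSet<Object[]> set = new TreeSet<>(WILDCARD);
        for (Object[] row : rows) set.add(row);
        return set.size();
    }

    public static void main(String[] args) {
        Object[] one = {1}, two = {2}, any = {null};
        System.out.println(sizeAfterAdding(one, two, any)); // 2
        System.out.println(sizeAfterAdding(any, one, two)); // 1
    }
}
```

In the first call {null} is rejected because it matches {1}, but {1} and {2} both stay; in the second call {null} goes in first and blocks both of the others.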
No Comparator will achieve this; you will need a different data structure, perhaps a disjoint set (union-find): http://en.wikipedia.org/wiki/Disjoint-set_data_structure
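One way the disjoint-set idea might look is sketched below. It assumes that rows which match only transitively (such as {1} and {2} through {null}) should end up in the same group, which may or may not be what the asker wants. The names WildcardGroups, matches, find and representatives are made up for this illustration, and the pairwise loop is O(n^2), so treat it as a starting point only.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardGroups {
    // True when the rows have the same length and agree wherever both are non-null.
    static boolean matches(Object[] a, Object[] b) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != null && b[i] != null && !a[i].equals(b[i])) return false;
        }
        return true;
    }

    // Finds the representative index of x, shortening the path on the way.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // Returns one row per group of (transitively) matching rows.
    static List<Object[]> representatives(List<Object[]> rows) {
        int n = rows.size();
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        // Union every pair of rows that match under the wildcard rule.
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (matches(rows.get(i), rows.get(j))) {
                    parent[find(parent, j)] = find(parent, i);
                }
            }
        }
        // Keep the row sitting at the root of each group.
        List<Object[]> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (find(parent, i) == i) result.add(rows.get(i));
        }
        return result;
    }
}
```

Note that the row kept for each group is whichever one ends up as the root, not necessarily the one with the most nulls; picking a particular representative would need an extra pass.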
Your problem is that a null reference should not be equal to anything, since the equals contract states:
For any non-null reference value x, x.equals(null) should return false.
So, if your values field is meaningful for your equals implementation, then you cannot implement what you describe without breaking the contract.
I would replace the Object[] field with a List, and implement equals in the MyElement class. This, in turn, gives you a meaningful equals from the List, as stated in its contract. Of course, if you override equals, you must also override hashCode to keep the two consistent.
I would leave the good old HashSet untouched; keep in mind that writing correct collections is not a trivial task, no matter how it might seem at first glance. Therefore, override equals and hashCode in MyElement to fit your needs without violating either contract.
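A minimal sketch of that suggestion could look like the following. The varargs constructor and the defensive copy are assumptions added for illustration. Note that with this approach null is equal only to null, so it keeps the contracts intact but does not give the wildcard behaviour the question asks for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class MyElement {
    private final List<Object> values;

    // Hypothetical varargs constructor; the original class does not show one.
    MyElement(Object... values) {
        // ArrayList tolerates null entries; the copy keeps later changes
        // to the caller's array from leaking into this element.
        this.values = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(values)));
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof MyElement)) return false; // also covers other == null
        // List.equals compares element by element and treats null as equal only to null.
        return values.equals(((MyElement) other).values);
    }

    @Override
    public int hashCode() {
        // List.hashCode is consistent with List.equals.
        return values.hashCode();
    }
}
```

With this in place, a plain HashSet<MyElement> collapses exact duplicates such as two copies of {1, null, "s2"}, while {1, null, "s2"} and {1, 2, "s2"} remain distinct entries.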