Tuesday, July 06, 2004 8:59 PM
pgolde
IComparer<T>
Another interface in Beta 1 that I have some questions about is IComparer<T>. In Beta 1, this interface looks like:
interface IComparer<T> {
    int GetHashCode(T t);
    bool Equals(T t1, T t2);
    int Compare(T t1, T t2);
}
The problem with this interface is that it combines two very different ways of comparing items into a single interface. Firstly, implementations of this interface offer the ability to compare two T's to see which is “less than” or “before” the other, as offered by the Compare method. Secondly, they also offer the ability to get a hash code from a T and compare two T's just for equality, as offered by the GetHashCode and Equals methods.
Both of these are very useful abilities. However, the two abilities should be offered separately, and placed into two different interfaces. Almost always (I can't think of a counter-example), users of this interface want either the “ordering” capability, or the “hash/equals” capability, but not both. For example, a hash table collection only uses the “hash/equals” capability, and never cares about which object is less than another object. On the other hand, a sorting algorithm or a binary tree collection only cares about ordering, and never wants to see a hash code. When both kinds of collection take the same interface, it is unclear to the user whether a particular collection cares about ordering or hashing, and knowing which is typically very important in order to use a collection correctly.
From the point of view of people implementing the interface, it is also important to be able to implement the two capabilities separately. There are many data structures, for example, where hashing and equality are trivial and obvious operations, but there may be no natural ordering (consider complex numbers, say). Conversely, there are cases where an ordering is very important, but calculating a hash is burdensome and unneeded. Of course, one can always implement some methods and throw a “NotImplementedException” from the others, but that is confusing and error-prone.
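To make the complex number case concrete, here is a rough sketch of what an implementer is pushed into. ICombinedComparer<T> is a hypothetical stand-in that mirrors the Beta 1 shape (renamed so it does not clash with the real interface), and Complex and ComplexComparer are made-up types for illustration only:

```csharp
using System;

// Hypothetical stand-in mirroring the Beta 1 shape of IComparer<T>.
interface ICombinedComparer<T>
{
    int GetHashCode(T t);
    bool Equals(T t1, T t2);
    int Compare(T t1, T t2);
}

// Hypothetical minimal complex number type, for illustration only.
struct Complex
{
    public readonly double Re;
    public readonly double Im;

    public Complex(double re, double im)
    {
        Re = re;
        Im = im;
    }
}

class ComplexComparer : ICombinedComparer<Complex>
{
    // Hashing is easy: mix the hashes of the two components.
    public int GetHashCode(Complex c)
    {
        unchecked
        {
            return (c.Re.GetHashCode() * 397) ^ c.Im.GetHashCode();
        }
    }

    // Equality is easy too: compare component by component.
    public bool Equals(Complex a, Complex b)
    {
        return a.Re == b.Re && a.Im == b.Im;
    }

    // There is no natural ordering, so the only honest option is to throw.
    public int Compare(Complex a, Complex b)
    {
        throw new NotImplementedException("Complex numbers have no natural ordering.");
    }
}
```

Nothing in the type of ComplexComparer warns a caller that handing it to a sort routine will fail; the failure only shows up at run time.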
It would be far preferable to have two interfaces:
interface IComparer<T> {
    int Compare(T t1, T t2);
}
interface IKeyComparer<T> {
    int GetHashCode(T t);
    bool Equals(T t1, T t2);
}
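With the split, a method signature alone tells the caller which capability is needed. The following is only a sketch under my own assumptions: the ComparerSketch namespace, LengthOrder, IgnoreCaseKey, and the Algorithms helpers are hypothetical names, not anything in the BCL.

```csharp
using System;

namespace ComparerSketch
{
    // The two proposed interfaces, declared locally for the sketch.
    interface IComparer<T>
    {
        int Compare(T t1, T t2);
    }

    interface IKeyComparer<T>
    {
        int GetHashCode(T t);
        bool Equals(T t1, T t2);
    }

    // Ordering only: shorter strings come first.
    class LengthOrder : IComparer<string>
    {
        public int Compare(string a, string b)
        {
            return a.Length.CompareTo(b.Length);
        }
    }

    // Hash/equals only: strings are keys that ignore case.
    class IgnoreCaseKey : IKeyComparer<string>
    {
        public int GetHashCode(string s)
        {
            return s.ToUpperInvariant().GetHashCode();
        }

        public bool Equals(string a, string b)
        {
            return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
        }
    }

    static class Algorithms
    {
        // A sort asks for an ordering and never touches a hash code.
        public static void InsertionSort<T>(T[] items, IComparer<T> order)
        {
            for (int i = 1; i < items.Length; i++)
            {
                T current = items[i];
                int j = i - 1;
                while (j >= 0 && order.Compare(items[j], current) > 0)
                {
                    items[j + 1] = items[j];
                    j--;
                }
                items[j + 1] = current;
            }
        }

        // Index of the bucket an item hashes into.
        public static int BucketIndex<T>(int bucketCount, T item, IKeyComparer<T> keys)
        {
            return (keys.GetHashCode(item) & 0x7FFFFFFF) % bucketCount;
        }

        // A hash lookup asks for hash/equals and never asks which item is "less".
        // Assumes every bucket is a non-null array.
        public static bool Contains<T>(T[][] buckets, T item, IKeyComparer<T> keys)
        {
            foreach (T candidate in buckets[BucketIndex(buckets.Length, item, keys)])
            {
                if (keys.Equals(candidate, item)) return true;
            }
            return false;
        }
    }
}
```

Passing a LengthOrder to Contains, or an IgnoreCaseKey to InsertionSort, is now a compile-time error rather than a surprise at run time.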
Most maddeningly, the NON-generic collection classes (finally) get this right in .NET 2.0, as the non-generic System.Collections.IKeyComparer was introduced as a replacement for the (incomplete, broken) System.Collections.IHashCodeProvider.
I suspect that in the generic collections case, someone decided to combine the two interfaces in the name of “simplicity”, without considering the implications very closely. I would urge the BCL team to reconsider!