.NET Dictionaries and Overriding GetHashCode()

This post discusses why should we override GetHashCode method on our custom objects (i.e. objects of a class that we’ve written) when using them with .NET dictionaries and what will happen if we don’t. GetHashCode returns an integer hash code. Two of the properties of a hash code are relevant to our discussion here.

  • If two objects have different hash codes, those objects are definitely not equal.
  • If two objects have same hash codes, those two objects may or may not be equal.

Now here’s the central idea of this post: When using a custom object as key in a Dictionary<TKey,TValue>, if the object’s GetHashCode method returns a different value every time, regardless of its contents, then the dictionary’s indexer will always throw a KeyNotFound exception. To find out why, we need to know how Dictionary<TKey,TValue> is implemented.

Mental Model of Dictionary<TKey,TValue>’s Implementation

This description isn’t necessarily how .NET implements Dictionary<TKey,TValue> object but it is a good way to mentally model it so we can understand its behaviour. So think of a .NET dictionary as an array of LinkedList<T>s. The reason we say array of LinkedLists<T> will be clear if we consider the following scenario. Say you want to add an an object myVal1 with a key myKey1 to a Dictionary<TKey,TValue>, where myKey1 is a custom object. Dictionary<TKey,TValue> will call GetHashCode on myKey1 to get the hash code which will be an int. It will then modulo that hash code with the length of its array. That will return another int which will be within the range 0 to length of array – 1 (i.e. a valid array index). Dictionary<TKey,TValue> will then take the LinkedList<T> at that index of its array and add myVal1 to that LinkedList<T> if myKey1 doesn’t already exist or update the value with myVal1 if myKey1 already exists. This equality check is done using Equals method of myKey1.

Next when you want to access your store value using the key with a call like myDictionary[myKey1], the dictionary will again call GetHashCode on myKey1, modulo it by length of the array to arrive at the linked list. Now, the dictionary will iterate through the linked list, using Equals method to check if there is an element where myKey1 matches. If a match is found, then the dictionary returns myVal1. Otherwise a KeyNotFound exception is thrown.

Significance of GetHashCode

Now suppose the myKey1 object’s class implements GetHashCode in such a way that it always returns the same value, then technically, it can be used as a key in Dictionary<TKey,TValue>. However, it will come at a huge performance cost and negate all the benefits of using a dictionary rather than a linked list or a binary search tree for example.

On the other hand, if myKey1 returns a different value every time GetHashCode is called, the Dictionary<TKey,TValue> object will not be able to match two hash codes in the first place and will most likely throw a KeyNotFound exception.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s