Hashing


Hashing

We were able to improve our search algorithms in previous sections by using information about where objects are stored in the collection in relation to one another. We could, for example, use a binary search in logarithmic time if we knew a list was ordered. We'll take it a step further in this part by creating a data structure that can be searched.

O(1)

It takes O(1) time. Hashing is the term for this notion.

To do so, we'll need to know even more about where the artifacts might be located in the collection when we look for them. If every item is where it should be, the search can rely on a single comparison to determine whether or not an item exists. However, as we will see, this is not always the case.

A hash table is a collection of items that are organized in such a way that they may be easily found afterward. Each slot in the hash table, also known as a position, can retain an item designated by an integer value beginning at 0. We'll have a slot named 0, a slot named 1, a slot named 2, and so on, for example. Because there are no items in the hash table at first, every slot is empty. A hash table can be created by utilizing a list with each element set to the special Python value None. Figure 4 depicts a hash table with the dimensions m=11 m=11. In other words, the table has m slots, numbered 0 through 10.

The hash function is the mapping between an item and the place in the hash table where that item resides. The hash function will return an integer in the range of slot names, between 0 and m-1, for any item in the collection. Assume we have 54, 26, 93, 17, 77, and 31 as a set of integers. Our first hash function, sometimes known as the "remainder technique," simply divides an item by the table size and returns the remainder as its hash value (h(item)=item percent 11).