Design HashSet

Tags:

Introduction

Design a HashSet without using any built-in hash table libraries.

Implement MyHashSet class:

Example 1
Input: operations = ["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"]
values = [[], [1], [2], [1], [3], [2], [2], [2], [2]]
Output: [null, null, null, true, false, null, true, null, false]
Explanation:
MyHashSet myHashSet = new MyHashSet();
myHashSet.add(1); // set = [1]
myHashSet.add(2); // set = [1, 2]
myHashSet.contains(1); // return True
myHashSet.contains(3); // return False, (not found)
myHashSet.add(2); // set = [1, 2]
myHashSet.contains(2); // return True
myHashSet.remove(2); // set = [1]
myHashSet.contains(2); // return False, (already removed)
Constraints:

Contents

Solution 1 - Using an array of LinkedLists

In this approach, we will use an array of LinkedLists to store elements, we will initialize the array with some initial size, in below implementation we are initializing array with 10000, and the reason we are initializing array with this size is just to have more distribution of elements instead of skewing many element under same index.

In order to find the index where to insert a given key, we will do a modular operation on the key value with array length, this will ensure that we will never got out of bounds of array.

int hashIndex = key % array.length;

Implementation steps:
  • Constructor: we will initialize array elements with empty LinkedList, so that we don't need to do null checks later.
  • add(int key): we will get the index using key % array.length, then check if the key already in the LinkedList at that index or not, if it does not contain the key, then add it to the linked list.
  • remove(int key): we will get the index using key % array.length, then remove the entry with the key from the LinkedList at that index.
  • contains(int key): we will get the index using key % array.length, then check if the element is present in the LinkedList at that index. If it present return true, false otherwise.

import java.util.Arrays;
import java.util.LinkedList;

public class MyHashSet {

    LinkedList<Integer>[] set = new LinkedList[10000]; // This can be changed as needed to spread numbers across

    public MyHashSet() { // to avoid NPEs later, initializing with empty LinkedLists
      for(int i=0;i< set.length; i++) {
          set[i] = new LinkedList<>();
      }
    }

    public void add(int key) {
        if(!contains(key)) {
            int hashKey = key % set.length;
            set[hashKey].add(key);
        }
    }

    public void remove(int key) {
        int hashKey = key % set.length;
        set[hashKey].remove(Integer.valueOf(key));
    }

    public boolean contains(int key) {
        int hashKey = key % set.length;
        return set[hashKey].contains(key);
    }


    public static void main(String[] args) {
        MyHashSet myHashSet = new MyHashSet();
        myHashSet.add(1);
        myHashSet.add(2);
        System.out.println(Arrays.toString(myHashSet.set));
        System.out.println(myHashSet.contains(1));
        System.out.println(myHashSet.contains(3));
        myHashSet.add(2);
        System.out.println(Arrays.toString(myHashSet.set));
        System.out.println(myHashSet.contains(2));
        myHashSet.remove(2);
        System.out.println(Arrays.toString(myHashSet.set));
        System.out.println(myHashSet.contains(2));
    }
}
Complexity Analysis:

Time complexity: All opeeration runs in O(1) time, add operation runs in constant time, since we are looking up element using an index and then adding element to the tail of a LinkedList, both remove and contains also runs in constant average time, when you look at the maximum key as per the problem will be 106 and our array size is 104, so even if some of the numbers are skewed at one index, it is still going to be a constant number.
Space complexity: O(1) since we are storing elements into an array of linked lists.

Above implementations source code can be found at GitHub link for Java code