Post

How Hashmaps Really Work Under the Hood

How Hashmaps Really Work Under the Hood

Have you ever wondered how hashmaps really work under the hood? It might seem very complex, but once you understand it, it’s actually very simple.

Hashmaps Are Just Arrays With Smart Indexing

Hashmaps are basically arrays. They use generated indexes from the input.

How Hashing Creates the Index

So how do we generate those indexes? It’s right there in the name, by hashing. The exact algorithm doesn’t really matter that much, because security is not important here.

Every time you get an item from a hashmap, the input gets hashed. Then the result is used as an index to find the value.

Handling Collisions With Linked Lists

Now, how do we prevent collisions? The truth is, collisions will happen and cannot be completely prevented, but they can be handled. The simplest way is to make every single hashmap item a linked list that points to another item with the same index.

A good way to know which linked list item you are accessing is to store the original key and compare it with the input key.

Dynamic Scaling and Resizing

How do we handle dynamic scaling? To scale, you have to reallocate the memory, rehash every single item in the hashmap, and copy them into a new bigger array. That’s why it’s a good idea to start with more space and use better scaling like multiplying the size by 4 or more when it’s like 70-80% full.

The more memory you allocate, the fewer collisions there will be, but also more memory will be consumed.

Simple HashMap Structure in C

1
2
3
4
5
6
7
8
9
10
11
12
13
14
struct HashMap {
    struct HashMapItem* items;
    uint32_t size;
    uint32_t capacity;
};

typedef struct HashMapItem HashMapItem;

struct HashMapItem {
    uint8_t* original_content;
    uint32_t original_content_size;
    struct HashMapItem* next;
    void* value;
};
This post is licensed under CC BY 4.0 by the author.