Skip to content

Latest commit

 

History

History
45 lines (29 loc) · 2.28 KB

401-class30.md

File metadata and controls

45 lines (29 loc) · 2.28 KB

Hash Table Overview

A hash table, also known as a hash map, is a data structure that allows for the efficient storage and retrieval of key-value pairs. It utilizes a hash function to compute an index into an array of slots or buckets, from which the desired value can be found.

Key Components of a Hash Table

  • Keys: Unique identifiers used to store and retrieve values.
  • Values: Data associated with a particular key.
  • Hash Function: A function that takes a key as input and returns an index into the array of buckets. Ideally, it distributes the keys uniformly across the array.
  • Buckets/Slots: Storage spaces in the array where the key-value pairs are stored.
  • Collision: Occurs when two different keys produce the same index using the hash function. Common techniques to handle collisions include separate chaining (where each bucket contains a list of key-value pairs) and open addressing (like linear probing, where we search for the next available slot).

Hash Table Cheat Sheet

Operations and their average time complexities

  • Insert: O(1) on average, O(n) worst-case due to collisions.
  • Delete: O(1) on average, O(n) worst-case.
  • Lookup: O(1) on average, O(n) worst-case.

Handling Collisions

  • Separate Chaining: Each bucket contains a list or another data structure. If two keys hash to the same index, their key-value pairs are stored in the same list.
  • Open Addressing: If a bucket is already occupied, search for the next available slot. Variants include linear probing, quadratic probing, and double hashing.

Resizing

When the hash table becomes too full (often determined by a load factor), it may need to be resized to maintain efficiency. This involves creating a larger table and rehashing all existing key-value pairs into it.

Choosing a Good Hash Function

  • Distributes keys uniformly across the bucket array.
  • Is consistent, i.e., it returns the same output for the same input.
  • Is efficient and computes the hash in constant time.

Common Uses

  • Databases for quick data retrieval.
  • Caching to store and retrieve data rapidly.
  • Associative arrays in programming languages like Python's dictionaries.

Potential Issues

  • Collisions can degrade performance.
  • Poorly chosen hash functions can lead to clustering of keys.