C++ 哈希表封装：模拟实现 unordered_map 与 unordered_set

在深入标准库之前，理解底层数据结构的设计思路至关重要。unordered_map 和 unordered_set 作为 C++11 引入的无序关联容器，其核心依赖于哈希表（Hash Table）。本文将基于自定义哈希表框架，模拟实现这两个容器，重点解析迭代器逻辑、键值提取及扩容机制。

源码背景与结构分析

SGI-STL30 版本（C++11 之前）中并未包含标准的 unordered_map/set，当时使用的是非标准的 hash_map/hash_set。这两个容器是 C++11 标准后才正式纳入 STL 的。虽然名字不同，但核心实现思想一脉相承。

从源码架构来看，hash_map 和 hash_set 复用了同一个 hashtable 类。区别在于传递给 hashtable 的数据类型：

hash_set：传入的是两个 key（Value 本身即 Key），通过仿函数提取。
hash_map：传入的是 pair<const Key, Value>，需要专门的仿函数来提取 Key 部分。

这种复用设计体现了泛型编程的优势，但也带来了命名风格上的差异。例如 hash_set 使用 Value 命名模板参数，而 hash_map 则区分 Key 和 T，实际开发中我们应统一规范以提升可读性。

模拟实现核心思路

1. 复用哈希表框架

我们的目标是构建一个通用的 HashTable，然后在其之上封装 unordered_set 和 unordered_map。为了适配不同的容器需求，我们需要解决一个关键问题：HashTable 内部不知道传入的 T 是单个 Key 还是 pair<Key, Value>。

解决方案是使用仿函数（Functor）作为 KeyOfT 参数。在 insert 时，通过 KeyOfT 将 T 转换为 K 对象，再进行取模和比较。这样，unordered_set 可以传入 SetKeyOfT，直接返回 Key；unordered_map 传入 MapKeyOfT，提取 pair 中的 first 成员。

2. 迭代器的实现难点

哈希表的迭代器属于单向迭代器。其核心难点在于 operator++ 的实现：

如果当前桶内还有下一个节点，直接指向该节点。
如果当前桶已遍历完，需要计算找到下一个不为空的桶。

为了实现这一点，迭代器结构中除了保存当前节点指针外，还必须持有哈希表对象的指针。这样当当前桶耗尽时，可以通过哈希表对象遍历后续桶，直到找到空桶或结束位置。begin() 返回第一个非空桶的首节点，end() 则用空指针表示。

对于 unordered_set，由于不支持修改元素，迭代器对应的第二个模板参数应为 const K。而对于 unordered_map，key 不可变但 value 可变，因此 pair 的第一个参数设为 const K。

3. operator[] 的支持

unordered_map 支持下标访问运算符 []，这要求 insert 操作能返回插入位置的迭代器。修改 HashTable 的 Insert 接口，使其返回 pair<Iterator, bool>，即可轻松实现 [] 的重载：若 key 存在则返回对应 value 引用，否则插入默认构造的 value 并返回引用。

代码实现细节

以下是完整的头文件实现，包含了哈希表基础结构、迭代器逻辑以及容器的封装。

HashTable.h

#pragma once
#include <vector>
#include <algorithm>

static const int __stl_num_primes = ;
    __stl_prime_list[__stl_num_primes] = {
    , , , , , , , , , ,
    , , , , , , ,
    , , , , , ,
    , , , , 
};

   __stl_next_prime(  n) {
      * first = __stl_prime_list;
      * last = __stl_prime_list + __stl_num_primes;
      * pos = std::(first, last, n);
     pos == last ? *(last - ) : *pos;
}

< >
  {
    {
         ()key;
    }
};

<>
 <std::string> {
    {
         hash = ;
         ( ch : key) {
            hash += ch;
            hash *= ;
        }
         hash;
    }
};

 Hash_bucket {
    < >
      {
        T _data;
        HashNode<T>* _next;
        ( T& data) :_data(data), _next() {}
    };

    < ,  ,  ,  ,  ,  >
      {
         HashNode<T> Node;
          HashTable<K, T, KeyOfT, Hash> HT;
         HTIterator<K, T, Ref, Ptr, KeyOfT, Hash> Self;

        Node* _node;
         HT* _pht;

        (Node* node,  HT* pht) :_node(node), _pht(pht) {}

        Ref *() {  _node->_data; }
        Ptr ->() {  &_node->_data; }

        Self& ++() {
             (_node->_next) {
                _node = _node->_next;
            }  {
                KeyOfT kot;
                Hash hs;
                 hashi = ((_node->_data)) % _pht->_tables.();
                ++hashi;
                 (hashi < _pht->_tables.()) {
                     (_pht->_tables[hashi]) {
                        _node = _pht->_tables[hashi];
                        ;
                    }  {
                        ++hashi;
                    }
                }
                 (hashi == _pht->_tables.()) {
                    _node = ;
                }
            }
             *;
        }

         !=( Self & s)  {  _node != s._node; }
         ==( Self & s)  {  _node == s._node; }
    };

    < ,  ,  ,  >
      {
        < ,  ,  ,  ,  ,  >
          ;

         HashNode<T> Node;
    :
         HTIterator<K, T, T&, T*, KeyOfT, Hash> Iterator;
         HTIterator<K, T,  T&,  T*, KeyOfT, Hash> ConstIterator;

        {
             (_n == )  ();
             ( i = ; i < _tables.(); i++) {
                 (_tables[i])  (_tables[i], );
            }
             ();
        }

        {  (, ); }

        {
             (_n == )  ();
             ( i = ; i < _tables.(); i++) {
                 (_tables[i])  (_tables[i], );
            }
             ();
        }

        {  (, ); }

        () :_tables(__stl_next_prime(), ), _n() {}

        ~() {
             ( i = ; i < _tables.(); i++) {
                Node* cur = _tables[i];
                 (cur) {
                    Node* next = cur->_next;
                     cur;
                    cur = next;
                }
                _tables[i] = ;
            }
            _n = ;
        }

        {
            KeyOfT kot;
             ( it = ((data)); it != ())  { it,  };

            Hash hs;
            
             (_n >= _tables.()) {
                ;
                 ( i = ; i < _tables.(); i++) {
                    Node* cur = _tables[i];
                     (cur) {
                        Node* next = cur->_next;
                         hashi = ((cur->_data)) % newtables.();
                        cur->_next = newtables[hashi];
                        newtables[hashi] = cur;
                        cur = next;
                    }
                    _tables[i] = ;
                }
                _tables.(newtables);
            }

             hashi = ((data)) % _tables.();
            Node* newnode =  (data);
            newnode->_next = _tables[hashi];
            _tables[hashi] = newnode;
            ++_n;
             { (newnode, ),  };
        }

        {
            KeyOfT kot;
            Hash hs;
             hashi = (key) % _tables.();
            Node* cur = _tables[hashi];
             (cur) {
                 ((cur->_data) == key)  { cur,  };
                cur = cur->_next;
            }
             { ,  };
        }

        {
            KeyOfT kot;
            Hash hs;
             hashi = (key) % _tables.();
            Node* prev = ;
            Node* cur = _tables[hashi];
             (cur) {
                 ((cur->_data) == key) {
                     (prev == ) {
                        _tables[hashi] = cur->_next;
                    }  {
                        prev->_next = cur->_next;
                    }
                    --_n;
                     cur;
                     ;
                }
                prev = cur;
                cur = cur->_next;
            }
             ;
        }

    :
        std::vector<Node*> _tables;
         _n;
    };
}

C++ 哈希表封装：模拟实现 unordered_map 与 unordered_set