C++ Hash Tables: Principles and an unordered_map/unordered_set Wrapper Implementation

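A hash table achieves fast lookup by mapping keys to storage positions; its core is the hash function and the collision-resolution strategy. Common strategies are closed hashing (open addressing with linear or quadratic probing) and open hashing (separate chaining). This article analyzes how the load factor affects performance and how rehashing works when the table grows. It then implements the underlying hash table with C++ templates and wraps it as unordered_set and unordered_map, handling iterator traversal, const correctness, and hashing of non-integer keys.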

Ne0 · Published 2026/3/16 · Updated 2026/7/6 · 29 views

1. The unordered family of associative containers

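Let's turn to hashing, starting with the associative containers unordered_map and unordered_set. They are built on a hash table and are used the same way as map and set. Their iterators are forward iterators, which is why there is no rbegin or rend. In other words, the red-black-tree and hash-table versions of map and set have almost the same interface; the differences are: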

The unordered family provides forward iterators only.
The unordered family does not traverse its elements in sorted order.

They deduplicate keys but do not sort them, and multi versions (unordered_multimap, unordered_multiset) exist as well.
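As a small illustration of the points above, here is a sketch using the standard std::unordered_set and std::unordered_map. The helper names count_distinct and count_words are made up for this example and are not part of the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Returns the number of distinct values; unordered_set drops duplicates
// but makes no promise about iteration order.
std::size_t count_distinct(const std::vector<int>& values) {
    std::unordered_set<int> seen(values.begin(), values.end());
    return seen.size();
}

// Counts word occurrences; operator[] inserts a zero count on first use,
// exactly as it does for std::map.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];
    }
    return counts;
}
```

Swapping std::unordered_map for std::map in count_words would compile unchanged; only the traversal order of the result would differ.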

2. Hashing

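What is hashing? The search techniques we have met so far fall into a few kinds. The first is brute-force search, scanning an entire array, which is very slow. Binary search over a sorted array improves on that, but it requires sorting first, and insertion and deletion are awkward because keeping the array sorted means moving a lot of data. Balanced search trees were introduced to solve that. Hashing is yet another search technique, built on a different idea.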

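Its essence is to establish a mapping between a stored value and its storage position. Take counting sort as an example: with a minimum value of 15 and a maximum of 30, we allocate 16 slots and store a count in each. 15 maps to the first slot, 16 to the second, and so on; to look up a value later, we go straight to its slot. That is a form of hashing.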
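The counting example can be sketched as follows. The function name count_by_direct_address is hypothetical, and the sketch assumes every input lies between the given minimum and maximum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct addressing: slot (value - min_value) holds the count of `value`.
// Assumes every input lies in [min_value, max_value].
std::vector<int> count_by_direct_address(const std::vector<int>& values,
                                         int min_value, int max_value) {
    assert(min_value <= max_value);
    std::vector<int> counts(static_cast<std::size_t>(max_value - min_value + 1), 0);
    for (int v : values) {
        assert(v >= min_value && v <= max_value);
        ++counts[static_cast<std::size_t>(v - min_value)];
    }
    return counts;
}
```

With min_value 15 and max_value 30 the returned vector has 16 slots, and the count for 15 sits in slot 0.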

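If the data is widely spread, however, this wastes a great deal of space. The approach above is what hash tables call direct addressing: it derives the position directly from the value (relative to the minimum) and suits values concentrated in a narrow range. For scattered data we instead allocate a fixed number of slots, say 20, and use the division (remainder) method: for a key, compute i = key % 20 (the number of slots), store the value in slot i, and think about growing the table only when it gets too full.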

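The problem is that this causes hash collisions: different values can map to the same position, so values and positions are in a many-to-one relationship. One remedy is to find another position by some rule. This is closed hashing, also called open addressing, with two common variants: 1. linear probing 2. quadratic probing. If the computed slot is taken, the value occupies some other slot.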
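A minimal sketch of the division method and of what a collision means. The names slot_of and collides are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Division (remainder) method: map a key to a slot in a table of `slots` entries.
std::size_t slot_of(std::size_t key, std::size_t slots) {
    assert(slots > 0);
    return key % slots;
}

// Two different keys collide when they land in the same slot.
bool collides(std::size_t a, std::size_t b, std::size_t slots) {
    return a != b && slot_of(a, slots) == slot_of(b, slots);
}
```

With 20 slots, the keys 5 and 25 both map to slot 5, so they collide, while 5 and 6 do not.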

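The other approach is separate chaining, also known as hash buckets. Each slot stores a pointer, and the values that map to the same slot are linked together in a bucket hanging off it. The many-to-one relationship stays inside one slot, so a collision does not spill into other slots.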
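To make the idea concrete, here is a rough sketch of separate chaining for int keys. It is not the article's implementation: the class name ChainedSet is hypothetical, and std::forward_list stands in for hand-written node pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <vector>

// Separate chaining sketch: each slot owns a singly linked list of keys.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t slots = 10) : buckets_(slots) {
        assert(slots > 0);
    }

    // Returns false if the key is already stored.
    bool insert(int key) {
        auto& bucket = buckets_[index_of(key)];
        if (std::find(bucket.begin(), bucket.end(), key) != bucket.end()) {
            return false;
        }
        bucket.push_front(key);  // head insertion is O(1)
        return true;
    }

    bool contains(int key) const {
        const auto& bucket = buckets_[index_of(key)];
        return std::find(bucket.begin(), bucket.end(), key) != bucket.end();
    }

    // Number of keys chained in one slot.
    std::size_t bucket_length(std::size_t slot) const {
        const auto& bucket = buckets_.at(slot);
        return static_cast<std::size_t>(std::distance(bucket.begin(), bucket.end()));
    }

private:
    std::size_t index_of(int key) const {
        return static_cast<std::size_t>(key) % buckets_.size();
    }

    std::vector<std::forward_list<int>> buckets_;
};
```

Inserting 1 and 111 into a 10-slot ChainedSet puts both in slot 1, and no other slot is affected.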

3. Closed hashing: open addressing

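Let's look at linear probing, the simplest form of open addressing. The table has 10 slots and a batch of values has been mapped into it. Now 111 arrives and collides with 1, so it goes into the next slot. 44 collides with 4 and keeps probing forward past occupied slots until it finds slot 8 free and is stored there.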

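If probing reaches the end of the table without finding a free slot, do we have to grow the table immediately? Suppose slot 9 holds 19 and 29 arrives: 29 % 10 is also 9, and stepping forward would run past the end. There is no need to grow yet; wrap around to the start of the table and look for a free slot there.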
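The insertion walk-through can be sketched like this. The function name linear_probe_insert and the sentinel kFree are assumptions of the sketch, as are the filler keys 5, 6 and 7 used to occupy the slots between 4 and 8; the sketch also assumes non-negative keys and at least one free slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kFree = -1;  // sentinel for an unused slot; keys are assumed non-negative

// Inserts `key` by linear probing and returns the slot that was used.
// Assumes the table still has at least one free slot (no growth here).
std::size_t linear_probe_insert(std::vector<int>& table, int key) {
    assert(!table.empty() && key >= 0);
    std::size_t i = static_cast<std::size_t>(key) % table.size();
    while (table[i] != kFree) {
        i = (i + 1) % table.size();  // wrap to slot 0 after the last slot
    }
    table[i] = key;
    return i;
}
```

Starting from std::vector<int>(10, kFree) and inserting 1, 4, 5, 6, 7, 111, 44, 19 in that order leaves 111 in slot 2, 44 in slot 8 and 19 in slot 9; inserting 29 afterwards wraps around and lands in slot 0.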

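Next, how do we look up 44? 44 % 10 gives slot 4, but slot 4 does not hold 44. We cannot stop there, because 44 may be in a later slot, so we keep probing and find it in slot 8. A lookup for 54 also starts at slot 4; none of the probed slots holds 54, and the search can stop at the first empty slot. This also shows that when the table is nearly full, looking up values that are absent inevitably slows down and approaches a brute-force scan.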

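Another question: to delete 6, do we just find its slot and remove it? That raises problems: 1. After removing 6, should the slot be overwritten with 0 or with some arbitrary value? 2. What happens when we then look up 44? The search hits the now-empty slot and stops, so the deletion has broken the lookup. The fix is a state marker per slot, with strictly three states: 1. EXIST 2. EMPTY 3. DELETE. To delete 6 we do not physically remove it; we set its state to DELETE. A lookup for 44 then keeps going past DELETE and stops only at EMPTY.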
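The three-state scheme can be sketched for int keys as below. The names State, Slot, ProbingSet and kNotFound are made up for this illustration, the sketch assumes the table never becomes completely full, and it bounds the probe loop by the table size so a lookup always terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class State { Empty, Exist, Deleted };

struct Slot {
    int key = 0;
    State state = State::Empty;
};

const std::size_t kNotFound = static_cast<std::size_t>(-1);

class ProbingSet {
public:
    explicit ProbingSet(std::size_t slots = 10) : table_(slots) {
        assert(slots > 0);
    }

    // Returns the slot holding `key`, or kNotFound.
    // Deleted slots are skipped; the search stops at the first Empty slot.
    std::size_t find(int key) const {
        std::size_t i = home(key);
        for (std::size_t steps = 0; steps < table_.size(); ++steps) {
            const Slot& s = table_[i];
            if (s.state == State::Empty) return kNotFound;
            if (s.state == State::Exist && s.key == key) return i;
            i = (i + 1) % table_.size();
        }
        return kNotFound;
    }

    // Empty and Deleted slots can both be reused by an insertion.
    bool insert(int key) {
        if (find(key) != kNotFound) return false;
        std::size_t i = home(key);
        while (table_[i].state == State::Exist) {
            i = (i + 1) % table_.size();
        }
        table_[i].key = key;
        table_[i].state = State::Exist;
        return true;
    }

    // Marks the slot as Deleted instead of clearing it.
    bool erase(int key) {
        std::size_t i = find(key);
        if (i == kNotFound) return false;
        table_[i].state = State::Deleted;
        return true;
    }

private:
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % table_.size();
    }

    std::vector<Slot> table_;
};
```

After inserting 1, 4, 5, 6, 7, 111 and 44, erasing 6 leaves a Deleted marker in slot 6, so find(44) still walks past it and reaches slot 8.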

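Now let's write it. We again use a KV structure. Writing everything by hand would mean managing an array together with its size and capacity, but a vector does that for us. The element type cannot be a bare pair, because each slot needs a state besides the key and value; so we define an enum for the state and then a HashData struct holding a pair and a STATE. The hash table holds a vector of HashData plus a counter n. Why a separate n when vector already has size()? Unlike a sequential list, the data is scattered across the array: physically it is an array, logically a hash table, and n records how many valid elements are stored.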

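The vector needs no explicit initialization because it has a default constructor; n can be given a default value. Next comes insert, which takes a pair (ignore growth for now and settle the position calculation first). The starting position is first modulo the table size. But which size, size() or capacity()? Suppose we take the modulus by _table.capacity(); if the resulting slot happens to be free, we would store the value there.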

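Suppose the vector's capacity is larger than its size and the computed position is 18, which lies beyond size. Slot 18 cannot be accessed: operator[] is only valid for indices 0 to size-1, and debug implementations assert on this. So we must take the modulus by size(), not capacity(). It is also best to keep size equal to capacity so that no space is wasted (the growth logic later takes care of this).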
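The difference between size and capacity can be checked with a few lines. The helper names below are hypothetical; the sketch uses at(), which is specified to throw std::out_of_range for an index outside [0, size), instead of relying on what operator[] does in a particular build.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// True if `index` can be read through at(), i.e. index < size().
bool is_accessible(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);  // throws std::out_of_range when index >= size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// reserve() only grows the capacity; size() stays 0.
std::vector<int> reserved_only(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    return v;
}

// resize() creates n value-initialized elements, so size() becomes n.
std::vector<int> resized(std::size_t n) {
    std::vector<int> v;
    v.resize(n);
    return v;
}
```

For a table built with reserved_only(20), slot 18 is not accessible even though memory for it exists; for resized(20) it is. This is why the table is sized with resize and indexed modulo size().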

To continue: if the slot is occupied, keep probing. An empty slot can take the value, and so can a slot marked as deleted.

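Another issue: the table must not get too full. With only one or two free slots left, almost any insertion collides, and inserting into a completely full table loops forever. Hash tables therefore introduce the load factor, the ratio of stored elements to slots. The larger the load factor, the higher the collision probability and the better the space utilization; the smaller the load factor, the lower the collision probability and the worse the utilization. So a hash table should not wait until it is full; it grows once the load factor reaches a threshold. The growth condition is: grow if the table size is 0 or the number of stored elements divided by the table size is at least 0.7. Integer division would truncate that ratio, so convert an operand to double explicitly (or stay in integers and test n * 10 / size >= 7). Because the table would otherwise start at size 0, write a constructor that calls resize so that size and capacity both start at 10.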
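The growth condition can be written in two equivalent ways, sketched below. The function names load_factor and needs_growth are invented for this example; the 0.7 threshold is the one used in the text.

```cpp
#include <cassert>
#include <cstddef>

// Load factor = stored elements / slots.
// The casts keep the division from truncating to an integer.
double load_factor(std::size_t count, std::size_t slots) {
    assert(slots > 0);
    return static_cast<double>(count) / static_cast<double>(slots);
}

// Integer-only form of "table is empty or load factor >= 0.7":
// count / slots >= 7 / 10 is the same as count * 10 >= slots * 7.
bool needs_growth(std::size_t count, std::size_t slots) {
    return slots == 0 || count * 10 >= slots * 7;
}
```

Without the cast, 6 / 10 evaluates to 0 in integer arithmetic, so a naive comparison against 0.7 would never trigger until the table is completely full.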

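When the condition is met we double the table. Can we simply call resize to get a larger array with the data copied across? Suppose 111 was stored next to slot 1 in the 10-slot table; after growing, a lookup computes 111 % 20 == 11 and searches from slot 11, where the value is not. Growing the array is not the point; remapping the values is. Every value's mapping changes with the table size, so resize alone is not enough and the values must be rehashed: create a new vector of the larger size and re-insert the old values into it.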
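The difference between resizing in place and rehashing can be demonstrated with a small sketch. All names here (kEmptySlot, place, lookup, rehash) are hypothetical, keys are assumed non-negative, and the table is assumed never to be completely full.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // sentinel; keys are assumed non-negative

// Linear-probing insert into a table that still has a free slot.
void place(std::vector<int>& table, int key) {
    assert(!table.empty() && key >= 0);
    std::size_t i = static_cast<std::size_t>(key) % table.size();
    while (table[i] != kEmptySlot) {
        i = (i + 1) % table.size();
    }
    table[i] = key;
}

// Linear-probing lookup; stops at the first empty slot.
bool lookup(const std::vector<int>& table, int key) {
    std::size_t i = static_cast<std::size_t>(key) % table.size();
    for (std::size_t steps = 0; steps < table.size(); ++steps) {
        if (table[i] == kEmptySlot) return false;
        if (table[i] == key) return true;
        i = (i + 1) % table.size();
    }
    return false;
}

// Rehash: build a table of the new size and re-insert every stored key,
// so each key is placed according to the new modulus.
std::vector<int> rehash(const std::vector<int>& old_table, std::size_t new_size) {
    std::vector<int> fresh(new_size, kEmptySlot);
    for (int key : old_table) {
        if (key != kEmptySlot) {
            place(fresh, key);
        }
    }
    return fresh;
}
```

If a 10-slot table holding 1 and 111 is merely resized to 20 slots, 111 stays in slot 2 and lookup starts at slot 11 and fails; after rehash(table, 20) the key sits in slot 11 and is found.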


5. Complete code

// HashTable.h
#pragma once
#include <stdio.h>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Default hash functor: for keys that can be cast to size_t, such as integers.
template<class K>
struct DefaultHashFunc {
    size_t operator()(const K& key) const {
        return (size_t)key;
    }
};

// Specialization for string keys: fold the characters together, multiplying by 131.
template<>
struct DefaultHashFunc<std::string> {
    size_t operator()(const std::string& str) const {
        size_t hash = 0;
        for (auto ch : str) {
            hash *= 131;
            hash += ch;
        }
        return hash;
    }
};

namespace open_address {
    enum STATE { EXIST, EMPTY, DELETE };

    template<class K, class V>
    struct HashData {
        std::pair<K, V> _kv;
        STATE _state = EMPTY;
    };

    template<class K, class V, class HashFunc = DefaultHashFunc<K>>
    class HashTable {
    public:
        HashTable() {
            _table.resize(10);
        }

        bool Insert(const std::pair<K, V>& kv) {
            if (Find(kv.first)) {
                return false;
            }
            // Grow once the load factor reaches 0.7
            if (_n * 10 / _table.size() >= 7) {
                size_t newSize = _table.size() * 2;
                // Walk the old table and remap every element into a new table
                HashTable<K, V, HashFunc> newHT;
                newHT._table.resize(newSize);
                for (size_t i = 0; i < _table.size(); i++) {
                    if (_table[i]._state == EXIST) {
                        newHT.Insert(_table[i]._kv);
                    }
                }
                _table.swap(newHT._table);
            }
            // Linear probing: EMPTY and DELETE slots can both be reused
            HashFunc hf;
            size_t hashi = hf(kv.first) % _table.size();
            while (_table[hashi]._state == EXIST) {
                ++hashi;
                hashi %= _table.size();
            }
            _table[hashi]._kv = kv;
            _table[hashi]._state = EXIST;
            ++_n;
            return true;
        }

        // Returns the slot holding key, or nullptr. Callers must not modify the key.
        HashData<K, V>* Find(const K& key) {
            HashFunc hf;
            size_t hashi = hf(key) % _table.size();
            // Probe at most size() slots: a table with no EMPTY slot left
            // (only EXIST and DELETE) would otherwise loop forever.
            for (size_t i = 0; i < _table.size() && _table[hashi]._state != EMPTY; i++) {
                if (_table[hashi]._state == EXIST && _table[hashi]._kv.first == key) {
                    return &_table[hashi];
                }
                ++hashi;
                hashi %= _table.size();
            }
            return nullptr;
        }

        bool Erase(const K& key) {
            HashData<K, V>* ret = Find(key);
            if (ret) {
                ret->_state = DELETE;  // mark only; the slot is not cleared
                --_n;
                return true;
            }
            return false;
        }

    private:
        std::vector<HashData<K, V>> _table;
        size_t _n = 0;  // number of valid elements stored
    };
}

namespace hash_bucket {
    template<class K, class V>
    struct HashNode {
        std::pair<K, V> _kv;
        HashNode<K, V>* _next;

        HashNode(const std::pair<K, V>& kv) : _kv(kv), _next(nullptr) {}
    };

    template<class K, class V, class HashFunc = DefaultHashFunc<K>>
    class HashTable {
        typedef HashNode<K, V> Node;
    public:
        HashTable() {
            _table.resize(10, nullptr);
        }

        // The table owns raw nodes, so copying is disabled to avoid double deletes.
        HashTable(const HashTable&) = delete;
        HashTable& operator=(const HashTable&) = delete;

        ~HashTable() {
            for (size_t i = 0; i < _table.size(); i++) {
                Node* cur = _table[i];
                while (cur) {
                    Node* next = cur->_next;
                    delete cur;
                    cur = next;
                }
                _table[i] = nullptr;
            }
        }

        bool Insert(const std::pair<K, V>& kv) {
            if (Find(kv.first)) {
                return false;
            }
            HashFunc hf;
            // Grow once the load factor reaches 1
            if (_n == _table.size()) {
                size_t newSize = _table.size() * 2;
                std::vector<Node*> newTable;
                newTable.resize(newSize, nullptr);
                // Walk the old table and move the existing nodes into the new one
                for (size_t i = 0; i < _table.size(); i++) {
                    Node* cur = _table[i];
                    while (cur) {
                        Node* next = cur->_next;
                        // Head-insert into the new table
                        size_t hashi = hf(cur->_kv.first) % newSize;
                        cur->_next = newTable[hashi];
                        newTable[hashi] = cur;
                        cur = next;
                    }
                    _table[i] = nullptr;
                }
                _table.swap(newTable);
            }
            size_t hashi = hf(kv.first) % _table.size();
            // Head insertion
            Node* newnode = new Node(kv);
            newnode->_next = _table[hashi];
            _table[hashi] = newnode;
            ++_n;
            return true;
        }

        Node* Find(const K& key) {
            HashFunc hf;
            size_t hashi = hf(key) % _table.size();
            Node* cur = _table[hashi];
            while (cur) {
                if (cur->_kv.first == key) {
                    return cur;
                }
                cur = cur->_next;
            }
            return nullptr;
        }

        bool Erase(const K& key) {
            HashFunc hf;
            size_t hashi = hf(key) % _table.size();
            Node* prev = nullptr;
            Node* cur = _table[hashi];
            while (cur) {
                if (cur->_kv.first == key) {
                    if (prev == nullptr) {
                        _table[hashi] = cur->_next;
                    } else {
                        prev->_next = cur->_next;
                    }
                    delete cur;
                    --_n;  // keep the element count in sync
                    return true;
                }
                prev = cur;
                cur = cur->_next;
            }
            return false;
        }

        void Print() {
            for (size_t i = 0; i < _table.size(); i++) {
                std::cout << "[" << i << "]->";
                Node* cur = _table[i];
                while (cur) {
                    std::cout << cur->_kv.first << "->";
                    cur = cur->_next;
                }
                std::cout << "NULL\n";
            }
        }

    private:
        std::vector<Node*> _table;  // array of bucket head pointers
        size_t _n = 0;              // number of valid elements stored
    };
}

7. Complete code

// unordered_map.h
#pragma once
#include "HashTable.h"

namespace yxx {
    template<class K, class V>
    class unordered_map {
        // Extracts the key from a stored pair
        struct MapKeyOfT {
            const K& operator()(const std::pair<const K, V>& kv) const {
                return kv.first;
            }
        };
    public:
        typedef typename hash_bucket::HashTable<K, std::pair<const K, V>, MapKeyOfT>::iterator iterator;
        typedef typename hash_bucket::HashTable<K, std::pair<const K, V>, MapKeyOfT>::const_iterator const_iterator;

        iterator begin() { return _ht.begin(); }
        iterator end() { return _ht.end(); }
        const_iterator begin() const { return _ht.begin(); }
        const_iterator end() const { return _ht.end(); }

        std::pair<iterator, bool> insert(const std::pair<K, V>& kv) {
            return _ht.Insert(kv);
        }

        // Inserts a default-constructed value if the key is absent
        V& operator[](const K& key) {
            std::pair<iterator, bool> ret = _ht.Insert(std::make_pair(key, V()));
            return ret.first->second;
        }

    private:
        hash_bucket::HashTable<K, std::pair<const K, V>, MapKeyOfT> _ht;
    };
}

// unordered_set.h
#pragma once
#include "HashTable.h"

namespace yxx {
    template<class K>
    class unordered_set {
        struct SetKeyOfT {
            const K& operator()(const K& key) const {
                return key;
            }
        };
    public:
        // Both iterator types are const: keys in a set must not be modified
        typedef typename hash_bucket::HashTable<K, K, SetKeyOfT>::const_iterator iterator;
        typedef typename hash_bucket::HashTable<K, K, SetKeyOfT>::const_iterator const_iterator;

        iterator begin() const { return _ht.begin(); }
        iterator end() const { return _ht.end(); }

        std::pair<iterator, bool> insert(const K& key) {
            // Insert returns a non-const iterator; convert it to const_iterator
            std::pair<typename hash_bucket::HashTable<K, K, SetKeyOfT>::iterator, bool> ret = _ht.Insert(key);
            return std::pair<const_iterator, bool>(ret.first, ret.second);
        }

    private:
        hash_bucket::HashTable<K, K, SetKeyOfT> _ht;
    };
}

// HashTable.h
#pragma once
#include <stdio.h>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

template<class K>
struct DefaultHashFunc {
    size_t operator()(const K& key) const {
        return (size_t)key;
    }
};

template<>
struct DefaultHashFunc<std::string> {
    size_t operator()(const std::string& str) const {
        size_t hash = 0;
        for (auto ch : str) {
            hash *= 131;
            hash += ch;
        }
        return hash;
    }
};

namespace hash_bucket {
    template<class T>
    struct HashNode {
        T _data;
        HashNode<T>* _next;

        HashNode(const T& data) : _data(data), _next(nullptr) {}
    };

    // Forward declaration
    template<class K, class T, class KeyOfT, class HashFunc>
    class HashTable;

    template<class K, class T, class Ptr, class Ref, class KeyOfT, class HashFunc>
    struct HTIterator {
        typedef HashNode<T> Node;
        typedef HTIterator<K, T, Ptr, Ref, KeyOfT, HashFunc> Self;
        typedef HTIterator<K, T, T*, T&, KeyOfT, HashFunc> Iterator;

        Node* _node;
        const HashTable<K, T, KeyOfT, HashFunc>* _pht;

        HTIterator(Node* node, const HashTable<K, T, KeyOfT, HashFunc>* pht)
            : _node(node), _pht(pht) {}

        // Copy constructor for iterator; iterator -> const_iterator conversion otherwise
        HTIterator(const Iterator& it) : _node(it._node), _pht(it._pht) {}

        Ref operator*() { return _node->_data; }
        Ptr operator->() { return &_node->_data; }

        Self& operator++() {
            if (_node->_next) {
                _node = _node->_next;
            } else {
                KeyOfT kot;
                HashFunc hf;
                size_t hashi = hf(kot(_node->_data)) % _pht->_table.size();
                // Search from the next slot for the next non-empty bucket
                ++hashi;
                while (hashi < _pht->_table.size()) {
                    if (_pht->_table[hashi]) {
                        _node = _pht->_table[hashi];
                        return *this;
                    } else {
                        ++hashi;
                    }
                }
                _node = nullptr;
            }
            return *this;
        }

        bool operator!=(const Self& s) const { return _node != s._node; }
        bool operator==(const Self& s) const { return _node == s._node; }
    };

    template<class K, class T, class KeyOfT, class HashFunc = DefaultHashFunc<K>>
    class HashTable {
        typedef HashNode<T> Node;

        // Friend declaration; the parameter names must differ from the enclosing template's
        template<class K1, class T1, class Ptr1, class Ref1, class KeyOfT1, class HashFunc1>
        friend struct HTIterator;
    public:
        typedef HTIterator<K, T, T*, T&, KeyOfT, HashFunc> iterator;
        typedef HTIterator<K, T, const T*, const T&, KeyOfT, HashFunc> const_iterator;

        iterator begin() {
            // Find the first non-empty bucket
            for (size_t i = 0; i < _table.size(); i++) {
                Node* cur = _table[i];
                if (cur) {
                    return iterator(cur, this);
                }
            }
            return iterator(nullptr, this);
        }

        iterator end() { return iterator(nullptr, this); }

        const_iterator begin() const {
            // Find the first non-empty bucket
            for (size_t i = 0; i < _table.size(); i++) {
                Node* cur = _table[i];
                if (cur) {
                    return const_iterator(cur, this);
                }
            }
            return const_iterator(nullptr, this);
        }

        const_iterator end() const { return const_iterator(nullptr, this); }

        HashTable() {
            _table.resize(10, nullptr);
        }

        // The table owns raw nodes, so copying is disabled to avoid double deletes.
        HashTable(const HashTable&) = delete;
        HashTable& operator=(const HashTable&) = delete;

        ~HashTable() {
            for (size_t i = 0; i < _table.size(); i++) {
                Node* cur = _table[i];
                while (cur) {
                    Node* next = cur->_next;
                    delete cur;
                    cur = next;
                }
                _table[i] = nullptr;
            }
        }

        std::pair<iterator, bool> Insert(const T& data) {
            KeyOfT kot;
            iterator it = Find(kot(data));
            if (it != end()) {
                return std::make_pair(it, false);
            }
            HashFunc hf;
            // Grow once the load factor reaches 1
            if (_n == _table.size()) {
                size_t newSize = _table.size() * 2;
                std::vector<Node*> newTable;
                newTable.resize(newSize, nullptr);
                // Walk the old table and move the existing nodes into the new one
                for (size_t i = 0; i < _table.size(); i++) {
                    Node* cur = _table[i];
                    while (cur) {
                        Node* next = cur->_next;
                        // Head-insert into the new table
                        size_t hashi = hf(kot(cur->_data)) % newSize;
                        cur->_next = newTable[hashi];
                        newTable[hashi] = cur;
                        cur = next;
                    }
                    _table[i] = nullptr;
                }
                _table.swap(newTable);
            }
            size_t hashi = hf(kot(data)) % _table.size();
            // Head insertion
            Node* newnode = new Node(data);
            newnode->_next = _table[hashi];
            _table[hashi] = newnode;
            ++_n;
            return std::make_pair(iterator(newnode, this), true);
        }

        iterator Find(const K& key) {
            HashFunc hf;
            KeyOfT kot;
            size_t hashi = hf(key) % _table.size();
            Node* cur = _table[hashi];
            while (cur) {
                if (kot(cur->_data) == key) {
                    return iterator(cur, this);
                }
                cur = cur->_next;
            }
            return end();
        }

        bool Erase(const K& key) {
            HashFunc hf;
            KeyOfT kot;
            size_t hashi = hf(key) % _table.size();
            Node* prev = nullptr;
            Node* cur = _table[hashi];
            while (cur) {
                if (kot(cur->_data) == key) {
                    if (prev == nullptr) {
                        _table[hashi] = cur->_next;
                    } else {
                        prev->_next = cur->_next;
                    }
                    delete cur;
                    --_n;  // decrement only when a node was actually removed
                    return true;
                }
                prev = cur;
                cur = cur->_next;
            }
            return false;
        }

        // Prints the key of every node, bucket by bucket
        void Print() {
            KeyOfT kot;
            for (size_t i = 0; i < _table.size(); i++) {
                std::cout << "[" << i << "]->";
                Node* cur = _table[i];
                while (cur) {
                    std::cout << kot(cur->_data) << "->";
                    cur = cur->_next;
                }
                std::cout << "NULL\n";
            }
        }

    private:
        std::vector<Node*> _table;  // array of bucket head pointers
        size_t _n = 0;              // number of valid elements stored
    };
}
