哈希表原理与 C++ 实现详解

综述由AI生成哈希表通过映射关系实现 O(1) 平均查找。本文讲解开放定址法（线性探测）与拉链法两种冲突解决策略，涵盖负载因子控制、扩容重哈希及迭代器封装细节。结合 C++ 模板编程，演示了从基础 HashTable 到 unordered_map/set 的完整实现过程，重点解析了删除标记、状态管理及 const 迭代器处理等关键难点。

咸鱼开飞机发布于 2026/3/24更新于 2026/5/87 浏览

关联式容器中的无序系列

我们先快速浏览一下 unordered_map 和 unordered_set。它们底层基于哈希表，用法与 map、set 类似，但有几个关键区别：

迭代器是单向的，不支持反向遍历（没有 rbegin/rend）。
遍历时数据是无序的。

红黑树实现的 map/set 与哈希表实现的 unordered_map/set 在接口上几乎一致，主要差异在于性能特征和顺序性。下面演示一下基本用法：

文章配图

它主要用于去重，不保证排序，当然也有 multi 版本。再看 unordered_map 的演示：

文章配图

深入理解哈希

什么是哈希？回顾之前的搜索方式：暴力查找效率太低；有序数组二分查找虽然快，但增删改涉及大量数据移动；平衡搜索树解决了部分问题，而哈希（散列）则提供了一种新的思路。

哈希的本质是在存储的值与存储位置之间建立映射关系。举个计数排序的例子：

文章配图

假设最小值 15，最大值 30，开了 16 个空间。15 映射到第一个位置，16 到第二个……以后查值直接去对应位置找。这属于直接定址法，适用于值分布集中的场景。但如果数据太分散，空间消耗会很大。

针对数据分散的情况，我们通常使用除留余数法。设定固定空间大小（比如 20），通过 i = key % 20 确定位置。但这会引发哈希冲突，即不同的值映射到同一个位置（多对一）。例如 27 和 47 模 20 都是 7：

文章配图

解决冲突主要有两种策略：闭散列（开放定址法）和拉链法（哈希桶）。

闭散列——开放定址法

闭散列的核心思想是当当前位置被占用时，按某种规则寻找下一个空位。常见的有线性探测和二次探测。

以线性探测为例，空间大小为 10：

文章配图

如果来了 111 和 1 冲突了，就放到下一个位置。44 和 4 冲突了，不断向后探测直到找到空位：

文章配图

如果走完了没找到空位怎么办？不需要立刻扩容，可以绕回开头继续找：

文章配图

查找逻辑也类似，算出初始位置后，如果不是目标值且不是空位，就继续往后找。如果遇到空位还没找到，说明不存在。这也反映出如果表太满，查找效率会下降，接近暴力查找。

删除操作有个坑：不能简单抹成 0，否则会影响后续查找（遇到空位就停止）。我们需要引入状态标记，严格分为三种：

EXIST：存在
EMPTY：空
DELETE：已删除

删除时将状态设为 DELETE，查找时遇到 DELETE 不停止，只有遇到 EMPTY 才停止。

代码实现细节

定义 KV 结构体，配合状态枚举。这里用 vector 存储数据，并维护一个 _n 记录有效数据个数（区别于物理容量）。

插入时先计算起始位置 hashi = key % size。注意这里模的是 size 而不是 capacity，避免越界断言。为了减少浪费，通常保持 size 和 capacity 相等。

如果该位置已被占用，继续探测。遇到 DELETE 状态也可以填入新值。这里引入了负载因子的概念：

文章配图

负载因子越大，冲突概率越高，空间利用率越高。一般控制在 0.7 左右触发扩容。扩容逻辑是开一个新表（通常是原大小的 2 倍），将旧数据重新映射插入到新表中，而不是直接拷贝，因为扩容后哈希位置会变。

//HashTable.h #pragma once #include <vector> #include <string> #include <stdio.h> template<class K> struct DefaultHashFunc { size_t operator()(const K& key) { return (size_t)key; } }; //struct StringHashFunc //{ // size_t operator()(const string& str) // { // return str[0]; // } //}; template<> struct DefaultHashFunc<string> { size_t operator()(const string& str) { size_t hash = 0; for (auto ch : str) { hash *= 131; hash += ch; } return hash; } }; namespace open_address { enum STATE { EXIST, EMPTY, DELETE }; template<class K, class V> struct HashData { pair<K, V> _kv; STATE _state = EMPTY; }; template<class K, class V, class HashFunc = DefaultHashFunc<K>> class HashTable { public: HashTable() { _table.resize(10); } bool Insert(const pair<K, V>& kv) { if (Find(kv.first)) { return false; } //扩容 if (_n * 10 / _table.size() >= 7) { size_t newSize = _table.size() * 2; //遍历旧表，重新映射到新表 HashTable<K, V, HashFunc> newHT; newHT._table.resize(newSize); //遍历旧表的数据插入到新表即可 for (size_t i = 0; i < _table.size(); i++) { if (_table[i]._state == EXIST) { newHT.Insert(_table[i]._kv); } } _table.swap(newHT._table); } ////扩容 ////if((double)_n / (double)_table.size() >= 0.7) //if (_n*10 / _table.size() >= 7) //{ // size_t newSize = _table.size() * 2; // vector<HashData> newtable; // newtable.resize(newSize); // //遍历链表，重新映射到新表 //} //线性探测 HashFunc hf; size_t hashi = hf(kv.first) % _table.size(); while (_table[hashi]._state == EXIST) { ++hashi; hashi %= _table.size(); } _table[hashi]._kv = kv; _table[hashi]._state = EXIST; ++_n; return true; } HashData<const K, V>* Find(const K& key) { //线性探测 HashFunc hf; size_t hashi = hf(key) % _table.size(); while (_table[hashi]._state != EMPTY) { if (_table[hashi]._state == EXIST && _table[hashi]._kv.first == key) { return (HashData<const K, V>*) & _table[hashi]; } ++hashi; hashi %= _table.size(); } return nullptr; } bool Erase(const K& key) { HashData<const K, V>* ret = Find(key); if (ret) { ret->_state = DELETE; --_n; return true; } return false; } private: vector<HashData<K, V>> _table; size_t _n = 0; //存储有效数据的个数 }; }

仿函数与模板特化

Key 可能不是整型（如 string），无法直接取模。我们需要两层映射：先将 Key 转为整型哈希值，再映射到位置。这可以通过仿函数解决。

默认仿函数处理整型，针对 string 进行特化，使用更复杂的算法（如 BKDR 哈希）减少冲突：

template<> struct DefaultHashFunc<string> {
    size_t operator()(const string& str) {
        size_t hash = 0;
        for (auto ch : str) {
            hash *= 131;
            hash += ch;
        }
        return hash;
    }
};

这样既支持默认类型，也能灵活扩展自定义类型。

二次探测及拉链法

线性探测容易导致'聚集'现象，即冲突值扎堆。二次探测尝试分散冲突，每次探测步长增加平方值。不过更常用的方案是拉链法（哈希桶）。

拉链法将数组元素改为指针，每个位置挂一个链表。冲突时直接在链表头插或尾插，互不影响。

哈希桶实现

基础结构是 vector<Node*>，同样需要维护有效数据个数 _n。插入时计算位置，头插节点。

扩容时，为了避免频繁创建销毁节点，可以直接迁移节点到新表。遍历旧表链表，将节点摘下头插到新表的对应位置，最后交换新旧表指针。

namespace hash_bucket {
    template<class T> struct HashNode {
        T _data;
        HashNode<T>* _next;
        HashNode(const T& data) :_data(data), _next(nullptr) {}
    };

    template<class K, class T, class KeyOfT, class HashFunc = DefaultHashFunc<K>>
    class HashTable {
        typedef HashNode<T> Node;
        //友元声明
        template<class K, class T, class Ptr, class Ref, class KeyOfT, class HashFunc>
        friend struct HTIterator;
    public:
        typedef HTIterator<K, T, T*, T&, KeyOfT, HashFunc> iterator;
        typedef HTIterator<K, T, const T*, const T&, KeyOfT, HashFunc> const_iterator;

        iterator begin() {
            for (size_t i = 0; i < _table.size(); i++) {
                Node* cur = _table[i];
                if (cur) { return iterator(cur, this); }
            }
            return iterator(nullptr, this);
        }
        iterator end() { return iterator(nullptr, this); }
        const_iterator begin() const {
            for (size_t i = 0; i < _table.size(); i++) {
                Node* cur = _table[i];
                if (cur) { return const_iterator(cur, this); }
            }
            return const_iterator(nullptr, this);
        }
        const_iterator end() const { return const_iterator(nullptr, this); }

        HashTable() { _table.resize(10, nullptr); }
        ~HashTable() {
            for (size_t i = 0; i < _table.size(); i++) {
                Node* cur = _table[i];
                while (cur) {
                    Node* next = cur->_next;
                    delete cur;
                    cur = next;
                }
                _table[i] = nullptr;
            }
        }

        pair<iterator, bool> Insert(const T& data) {
            KeyOfT kot;
            iterator it = Find(kot(data));
            if (it != end()) { return make_pair(it, false); }
            HashFunc hf;
            //负载因子到 1 就扩容
            if (_n == _table.size()) {
                size_t newSize = _table.size() * 2;
                vector<Node*> newTable;
                newTable.resize(newSize, nullptr);
                //遍历旧表，顺手牵羊，把结点迁下来移到新表
                for (size_t i = 0; i < _table.size(); i++) {
                    Node* cur = _table[i];
                    while (cur) {
                        Node* next = cur->_next;
                        //头插到新表
                        size_t hashi = hf(kot(cur->_data)) % newSize;
                        cur->_next = newTable[hashi];
                        newTable[hashi] = cur;
                        cur = next;
                    }
                    _table[i] = nullptr;
                }
                _table.swap(newTable);
            }
            size_t hashi = hf(kot(data)) % _table.size();
            //头插
            Node* newnode = new Node(data);
            newnode->_next = _table[hashi];
            _table[hashi] = newnode;
            ++_n;
            return make_pair(iterator(newnode, this), true);
        }

        iterator Find(const K& key) {
            HashFunc hf;
            KeyOfT kot;
            size_t hashi = hf(key) % _table.size();
            Node* cur = _table[hashi];
            while (cur) {
                if (kot(cur->_data) == key) { return iterator(cur, this); }
                cur = cur->_next;
            }
            return end();
        }

        bool Erase(const K& key) {
            HashFunc hf;
            KeyOfT kot;
            size_t hashi = hf(key) % _table.size();
            Node* prev = nullptr;
            Node* cur = _table[hashi];
            while (cur) {
                if (kot(cur->_data) == key) {
                    if (prev == nullptr) {
                        _table[hashi] = cur->_next;
                    } else {
                        prev->_next = cur->_next;
                    }
                    delete cur;
                    return true;
                }
                prev = cur;
                cur = cur->_next;
            }
            --_n;
            return false;
        }

    private:
        vector<Node*> _table; //指针数组
        size_t _n; //存储了多少个有效数据
    };
}

封装 unordered_map 与 set

有了基础的哈希表，我们可以进一步封装 unordered_map 和 unordered_set。核心在于处理模板参数和迭代器。

对于 set，内部存储的是 Key；对于 map，存储的是 pair<Key, Value>。我们需要一个 KeyOfT 仿函数来提取 Key 用于哈希计算。

迭代器的实现是关键难点。需要支持普通迭代器和常量迭代器，处理 operator++ 时不仅要遍历当前链表的下一个节点，还要能跳到下一个非空的桶。

//unordered_map.h #pragma once #include "HashTable.h"
namespace yxx {
    template<class K, class V>
    class unordered_map {
        struct MapKeyOfT {
            const K& operator()(const pair<const K, V>& kv) { return kv.first; }
        };
    public:
        typedef typename hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT>::iterator iterator;
        typedef typename hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT>::iterator const_iterator;

        iterator begin() { return _ht.begin(); }
        iterator end() { return _ht.end(); }
        const_iterator begin() const { return _ht.begin(); }
        const_iterator end() const { return _ht.end(); }

        pair<iterator, bool> insert(const pair<K, V>& kv) { return _ht.Insert(kv); }
        V& operator[](const K& key) {
            pair<iterator, bool> ret = _ht.Insert(make_pair(key, V()));
            return ret.first->second;
        }
    private:
        hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT> _ht;
    };
}

//unordered_set.h #pragma once #include "HashTable.h"
namespace yxx {
    template<class K>
    class unordered_set {
        struct SetKeyOfT {
            const K& operator()(const K& key) { return key; }
        };
    public:
        typedef typename hash_bucket::HashTable<K, K, SetKeyOfT>::const_iterator iterator;
        typedef typename hash_bucket::HashTable<K, K, SetKeyOfT>::const_iterator const_iterator;

        iterator begin() const { return _ht.begin(); }
        iterator end() const { return _ht.end(); }

        pair<iterator, bool> insert(const K& key) {
            pair<typename hash_bucket::HashTable<K, K, SetKeyOfT>::iterator, bool> ret = _ht.Insert(key);
            return pair<const_iterator, bool>(ret.first, ret.second);
        }
    private:
        hash_bucket::HashTable<K, K, SetKeyOfT> _ht;
    };
}

通过这种方式，我们不仅实现了高效的哈希表，还完成了标准库风格的容器封装，包括迭代器、异常安全以及 const 正确性处理。

哈希表原理与 C++ 实现详解

咸鱼开飞机发布于 2026/3/24更新于 2026/5/87 浏览

//HashTable.h #pragma once #include <vector> #include <string> #include <stdio.h> template<class K> struct DefaultHashFunc { size_t operator()(const K& key) { return (size_t)key; } }; //struct StringHashFunc //{ // size_t operator()(const string& str) // { // return str[0]; // } //}; template<> struct DefaultHashFunc<string> { size_t operator()(const string& str) { size_t hash = 0; for (auto ch : str) { hash *= 131; hash += ch; } return hash; } }; namespace open_address { enum STATE { EXIST, EMPTY, DELETE }; template<class K, class V> struct HashData { pair<K, V> _kv; STATE _state = EMPTY; }; template<class K, class V, class HashFunc = DefaultHashFunc<K>> class HashTable { public: HashTable() { _table.resize(10); } bool Insert(const pair<K, V>& kv) { if (Find(kv.first)) { return false; } //扩容 if (_n * 10 / _table.size() >= 7) { size_t newSize = _table.size() * 2; //遍历旧表，重新映射到新表 HashTable<K, V, HashFunc> newHT; newHT._table.resize(newSize); //遍历旧表的数据插入到新表即可 for (size_t i = 0; i < _table.size(); i++) { if (_table[i]._state == EXIST) { newHT.Insert(_table[i]._kv); } } _table.swap(newHT._table); } ////扩容 ////if((double)_n / (double)_table.size() >= 0.7) //if (_n*10 / _table.size() >= 7) //{ // size_t newSize = _table.size() * 2; // vector<HashData> newtable; // newtable.resize(newSize); // //遍历链表，重新映射到新表 //} //线性探测 HashFunc hf; size_t hashi = hf(kv.first) % _table.size(); while (_table[hashi]._state == EXIST) { ++hashi; hashi %= _table.size(); } _table[hashi]._kv = kv; _table[hashi]._state = EXIST; ++_n; return true; } HashData<const K, V>* Find(const K& key) { //线性探测 HashFunc hf; size_t hashi = hf(key) % _table.size(); while (_table[hashi]._state != EMPTY) { if (_table[hashi]._state == EXIST && _table[hashi]._kv.first == key) { return (HashData<const K, V>*) & _table[hashi]; } ++hashi; hashi %= _table.size(); } return nullptr; } bool Erase(const K& key) { HashData<const K, V>* ret = Find(key); if (ret) { ret->_state = DELETE; --_n; return true; } return false; } private: vector<HashData<K, V>> _table; size_t _n = 0; //存储有效数据的个数 }; }

template<> struct DefaultHashFunc<string> { size_t operator()(const string& str) { size_t hash = 0; for (auto ch : str) { hash *= 131; hash += ch; } return hash; } };

namespace hash_bucket { template<class T> struct HashNode { T _data; HashNode<T>* _next; HashNode(const T& data) :_data(data), _next(nullptr) {} }; template<class K, class T, class KeyOfT, class HashFunc = DefaultHashFunc<K>> class HashTable { typedef HashNode<T> Node; //友元声明 template<class K, class T, class Ptr, class Ref, class KeyOfT, class HashFunc> friend struct HTIterator; public: typedef HTIterator<K, T, T*, T&, KeyOfT, HashFunc> iterator; typedef HTIterator<K, T, const T*, const T&, KeyOfT, HashFunc> const_iterator; iterator begin() { for (size_t i = 0; i < _table.size(); i++) { Node* cur = _table[i]; if (cur) { return iterator(cur, this); } } return iterator(nullptr, this); } iterator end() { return iterator(nullptr, this); } const_iterator begin() const { for (size_t i = 0; i < _table.size(); i++) { Node* cur = _table[i]; if (cur) { return const_iterator(cur, this); } } return const_iterator(nullptr, this); } const_iterator end() const { return const_iterator(nullptr, this); } HashTable() { _table.resize(10, nullptr); } ~HashTable() { for (size_t i = 0; i < _table.size(); i++) { Node* cur = _table[i]; while (cur) { Node* next = cur->_next; delete cur; cur = next; } _table[i] = nullptr; } } pair<iterator, bool> Insert(const T& data) { KeyOfT kot; iterator it = Find(kot(data)); if (it != end()) { return make_pair(it, false); } HashFunc hf; //负载因子到 1 就扩容 if (_n == _table.size()) { size_t newSize = _table.size() * 2; vector<Node*> newTable; newTable.resize(newSize, nullptr); //遍历旧表，顺手牵羊，把结点迁下来移到新表 for (size_t i = 0; i < _table.size(); i++) { Node* cur = _table[i]; while (cur) { Node* next = cur->_next; //头插到新表 size_t hashi = hf(kot(cur->_data)) % newSize; cur->_next = newTable[hashi]; newTable[hashi] = cur; cur = next; } _table[i] = nullptr; } _table.swap(newTable); } size_t hashi = hf(kot(data)) % _table.size(); //头插 Node* newnode = new Node(data); newnode->_next = _table[hashi]; _table[hashi] = newnode; ++_n; return make_pair(iterator(newnode, this), true); } iterator Find(const K& key) { HashFunc hf; KeyOfT kot; size_t hashi = hf(key) % _table.size(); Node* cur = _table[hashi]; while (cur) { if (kot(cur->_data) == key) { return iterator(cur, this); } cur = cur->_next; } return end(); } bool Erase(const K& key) { HashFunc hf; KeyOfT kot; size_t hashi = hf(key) % _table.size(); Node* prev = nullptr; Node* cur = _table[hashi]; while (cur) { if (kot(cur->_data) == key) { if (prev == nullptr) { _table[hashi] = cur->_next; } else { prev->_next = cur->_next; } delete cur; return true; } prev = cur; cur = cur->_next; } --_n; return false; } private: vector<Node*> _table; //指针数组 size_t _n; //存储了多少个有效数据 }; }

//unordered_map.h #pragma once #include "HashTable.h" namespace yxx { template<class K, class V> class unordered_map { struct MapKeyOfT { const K& operator()(const pair<const K, V>& kv) { return kv.first; } }; public: typedef typename hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT>::iterator iterator; typedef typename hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT>::iterator const_iterator; iterator begin() { return _ht.begin(); } iterator end() { return _ht.end(); } const_iterator begin() const { return _ht.begin(); } const_iterator end() const { return _ht.end(); } pair<iterator, bool> insert(const pair<K, V>& kv) { return _ht.Insert(kv); } V& operator[](const K& key) { pair<iterator, bool> ret = _ht.Insert(make_pair(key, V())); return ret.first->second; } private: hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT> _ht; }; }

//unordered_set.h #pragma once #include "HashTable.h" namespace yxx { template<class K> class unordered_set { struct SetKeyOfT { const K& operator()(const K& key) { return key; } }; public: typedef typename hash_bucket::HashTable<K, K, SetKeyOfT>::const_iterator iterator; typedef typename hash_bucket::HashTable<K, K, SetKeyOfT>::const_iterator const_iterator; iterator begin() const { return _ht.begin(); } iterator end() const { return _ht.end(); } pair<iterator, bool> insert(const K& key) { pair<typename hash_bucket::HashTable<K, K, SetKeyOfT>::iterator, bool> ret = _ht.Insert(key); return pair<const_iterator, bool>(ret.first, ret.second); } private: hash_bucket::HashTable<K, K, SetKeyOfT> _ht; }; }

哈希表原理与 C++ 实现详解

关联式容器中的无序系列

深入理解哈希

闭散列——开放定址法

代码实现细节

仿函数与模板特化

二次探测及拉链法

哈希桶实现

封装 unordered_map 与 set

哈希表原理与 C++ 实现详解

关联式容器中的无序系列

深入理解哈希

闭散列——开放定址法

代码实现细节

仿函数与模板特化

二次探测及拉链法

哈希桶实现

封装 unordered_map 与 set

更多推荐文章

相关免费在线工具

更多推荐文章

相关免费在线工具

哈希表原理与 C++ 实现详解

关联式容器中的无序系列

深入理解哈希

闭散列——开放定址法

代码实现细节

仿函数与模板特化

二次探测及拉链法

哈希桶实现

封装 unordered_map 与 set

哈希表原理与 C++ 实现详解

关联式容器中的无序系列

深入理解哈希

闭散列——开放定址法

代码实现细节

仿函数与模板特化

二次探测及拉链法

哈希桶实现

封装 unordered_map 与 set

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具