Advanced C++: A Deep Dive into unordered_set and unordered_map (with a Hash Table Implementation from Scratch) | 极客日志


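An in-depth look at how hash tables work, covering hash function design and collision resolution strategies (open addressing and hash buckets), followed by a complete C++ mock implementation of the unordered_set and unordered_map containers. The article covers load factor calculation, the resizing mechanism, iterator wrapping, and memory management details, and is aimed at developers who want to understand the STL implementation from the ground up.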

Posted by 猫巷少女 on 2026/3/16, updated 2026/7/124 views


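Before turning to the STL containers unordered_map and unordered_set, let's review the data structures we already know that support fast lookup. The first that comes to mind is the array: not an unsorted array, but one that is sorted first and then searched with binary search. Sorting only has to be done once, so its cost can be amortized over the lookups, and each binary search takes O(log N). Insertion and deletion are a different story: unless they happen at the end, a large number of elements must be shifted, which costs O(N) in the worst case, much slower than lookup.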
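To make the trade-off concrete, here is a minimal sketch using the standard library's binary search helpers. The function names `contains_sorted` and `insert_sorted` are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if `target` is present in `sorted`, which must already be
// in ascending order. std::lower_bound does the O(log N) binary search.
bool contains_sorted(const std::vector<int>& sorted, int target) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    auto it = std::lower_bound(sorted.begin(), sorted.end(), target);
    return it != sorted.end() && *it == target;
}

// Inserts `value` while keeping the vector sorted. Finding the position is
// O(log N), but vector::insert shifts every later element, so the whole
// operation is O(N) in the worst case.
void insert_sorted(std::vector<int>& sorted, int value) {
    auto pos = std::upper_bound(sorted.begin(), sorted.end(), value);
    sorted.insert(pos, value);
}
```

Lookup stays cheap no matter how large the vector grows, while every insertion in the middle pays for moving the tail.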

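Next come AVL trees and red-black trees, balanced refinements of the binary search tree. They keep the tree height at O(log N), so lookup and insertion are both O(log N). The structure with the best overall efficiency, however, is the hash table, whose lookups and insertions take O(1) time on average. The unordered_map and unordered_set containers covered in this article are built on a hash table, so to understand them we first need to understand how a hash table works.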

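How hash tables work

A hash table achieves fast lookup by establishing a mapping from each key to a storage position. Underneath, it is a dynamic array, and the storage position is an index into that array. The function that performs this mapping is called a hash function.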
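As a rough sketch of that mapping, the snippet below turns a string key into an array index by hashing it and taking the remainder modulo the array size. `bucket_index` is a hypothetical helper, and `std::hash` stands in for whatever hash function a real table would use.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Maps `key` to a slot index in a table that has `bucket_count` slots.
// The hash value can be any size_t; the modulo folds it into [0, bucket_count).
std::size_t bucket_index(const std::string& key, std::size_t bucket_count) {
    assert(bucket_count > 0);
    return std::hash<std::string>{}(key) % bucket_count;
}
```

The same key always yields the same index within one run of the program, which is what lets a later lookup go straight to the right slot.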

Hash functions

A hash function may be a simple arithmetic expression, or it may involve several more elaborate steps. For example:

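#include <cstdint>

// Multiplicative hash (Knuth): multiply by a constant close to 2^32 divided by the golden ratio.
uint32_t knuth_hash(uint32_t key) {
    // The u suffix keeps the arithmetic in unsigned 32 bits, so overflow wraps modulo 2^32.
    return key * 2654435761u;
}

// DJB2 string hash
uint32_t djb2_hash(const char* str) {
    uint32_t hash = 5381; // magic seed
    // Read each byte as unsigned char so the result does not depend on the signedness of char.
    while (unsigned char c = static_cast<unsigned char>(*str++)) {
        // hash * 33 + c
        hash = ((hash << 5) + hash) + c;
    }
    return hash;
}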

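The input domain of a hash function can be very large, even infinite, but its output usually falls within a fixed, finite range. This means that several different inputs may map to the same output, which is called a hash collision. Of course, if the output range is at least as large as the set of possible inputs, a one-to-one mapping is theoretically possible.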
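A tiny illustration of a collision, using a deliberately weak toy hash with only ten possible outputs. Both `mod10_hash` and `collides` are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Toy hash with only 10 possible outputs (hypothetical, for illustration only).
std::uint32_t mod10_hash(std::uint32_t key) {
    return key % 10;
}

// True when two distinct keys are mapped to the same output.
bool collides(std::uint32_t a, std::uint32_t b) {
    return a != b && mod10_hash(a) == mod10_hash(b);
}
```

With this function, 12 and 22 both hash to 2, so they collide, while 12 and 23 do not.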

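A hash function has the following properties:

Determinism: the same input always produces the same output.
Finiteness: the range of output values is fixed and finite.
Possible collisions: different inputs may produce the same output.
Uniformity: outputs should be spread evenly across the output range, without clustering.
Efficiency: the hash should be cheap to compute, close to O(1) for fixed-size keys.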
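One informal way to look at uniformity is to hash a batch of keys and count how many land in each bucket. The sketch below does that for the multiplicative hash shown earlier; `bucket_histogram` is a hypothetical helper, and sequential integer keys are just one assumed workload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same multiplicative hash as in the article, repeated so the block is self-contained.
std::uint32_t knuth_hash(std::uint32_t key) {
    return key * 2654435761u;
}

// Hashes the keys 0 .. key_count-1 and counts how many fall into each bucket.
// A roughly flat histogram suggests the hash spreads these keys evenly.
std::vector<std::size_t> bucket_histogram(std::uint32_t key_count, std::size_t bucket_count) {
    assert(bucket_count > 0);
    std::vector<std::size_t> counts(bucket_count, 0);
    for (std::uint32_t k = 0; k < key_count; ++k) {
        counts[knuth_hash(k) % bucket_count]++;
    }
    return counts;
}
```

Inspecting the returned counts for a realistic key set is a quick sanity check before committing to a hash function.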

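Designing a perfect hash function is expensive, so in practice we usually just reuse functions that others have already designed. Hash functions are also useful beyond hash tables: when counting a massive number of strings, for example, a hash function can be used to split the data across different files, which reduces memory pressure.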
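The partitioning idea can be sketched as follows. To keep the example small, in-memory vectors stand in for the files; `partition_by_hash` and the shard count are assumptions of this sketch, not something taken from the article.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Splits `words` into `shard_count` groups by hash value. Equal strings always
// hash to the same value, so every copy of a word ends up in the same group,
// and each group can then be counted independently.
std::vector<std::vector<std::string>> partition_by_hash(
        const std::vector<std::string>& words, std::size_t shard_count) {
    assert(shard_count > 0);
    std::vector<std::vector<std::string>> shards(shard_count);
    std::hash<std::string> hasher;
    for (const std::string& w : words) {
        shards[hasher(w) % shard_count].push_back(w);
    }
    return shards;
}
```

In a real pipeline each shard would be written to its own file and processed one at a time, so only a fraction of the data has to fit in memory at once.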

Open addressing + linear probing
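A minimal sketch of the general technique, not the author's implementation: in open addressing every element lives directly in the array, and on a collision linear probing steps to the next slot until a free one is found. The class name `LinearProbingSet`, the fixed capacity, and the tombstone-based deletion are all assumptions of this sketch; there is no resizing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class LinearProbingSet {
    enum class State { Empty, Occupied, Deleted };
    struct Slot {
        int key = 0;
        State state = State::Empty;
    };
    std::vector<Slot> slots_;
    std::size_t size_ = 0; // number of Occupied slots

    // Home position of a key: the slot where probing starts.
    std::size_t home(int key) const {
        return static_cast<std::size_t>(static_cast<unsigned>(key)) % slots_.size();
    }

public:
    explicit LinearProbingSet(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    bool contains(int key) const {
        std::size_t i = home(key);
        for (std::size_t step = 0; step < slots_.size(); ++step) {
            const Slot& s = slots_[i];
            if (s.state == State::Empty) return false; // an empty slot ends the probe sequence
            if (s.state == State::Occupied && s.key == key) return true;
            i = (i + 1) % slots_.size(); // linear probing: try the next slot, wrapping around
        }
        return false;
    }

    // Returns false if the key is already present or the table is full.
    bool insert(int key) {
        if (contains(key) || size_ == slots_.size()) return false;
        std::size_t i = home(key);
        while (slots_[i].state == State::Occupied) {
            i = (i + 1) % slots_.size();
        }
        slots_[i].key = key;
        slots_[i].state = State::Occupied;
        ++size_;
        return true;
    }

    // Marks the slot as Deleted instead of Empty so that keys stored further
    // along the same probe sequence can still be found.
    bool erase(int key) {
        std::size_t i = home(key);
        for (std::size_t step = 0; step < slots_.size(); ++step) {
            Slot& s = slots_[i];
            if (s.state == State::Empty) return false;
            if (s.state == State::Occupied && s.key == key) {
                s.state = State::Deleted;
                --size_;
                return true;
            }
            i = (i + 1) % slots_.size();
        }
        return false;
    }
};
```

With a capacity of 10, inserting 12 and then 22 puts them in slots 2 and 3. Erasing 12 leaves a Deleted marker in slot 2, which is why a later search for 22 still walks past it and succeeds.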


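Implementing unordered_set and unordered_map

Node definition

// Each bucket is a singly linked list; a node stores one element and a link to the next node.
template<typename T>
class HashNode {
public:
    T data;
    HashNode<T>* next;
    HashNode(const T& val) : data(val), next(nullptr) {}
};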

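Hash table core class

// key:     the key type
// T:       the stored element type (the key itself for a set, a key/value pair for a map)
// keyofT:  functor that extracts the key from a T
// HashFun: functor that hashes a key
template<typename key, typename T, typename keyofT, typename HashFun = _HashFun<key>>
class HashTable {
private:
    typedef HashNode<T> Node;
    std::vector<Node*> HT; // bucket array; each entry is the head of a linked list
    size_t _n;             // number of stored elements
    // ... other member functions
};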

Constructor

// Member of HashTable: start with 10 empty buckets.
HashTable() : _n(0) { HT.resize(10, nullptr); }

Insert function

std::pair<iterator, bool> insert(const T& _data) {
    HashFun returnHash;
    keyofT returnKey;
    // Reject duplicates: return the existing element and false.
    iterator existing = find(returnKey(_data));
    if (existing != end()) return std::make_pair(existing, false);

    // Grow when the load factor (_n / bucket count) reaches 1.
    if (_n == HT.size()) {
        // Resize: create a new array twice as large and remap the existing nodes into it.
        std::vector<Node*> newve;
        size_t newsize = 2 * HT.size();
        newve.resize(newsize);
        for (size_t i = 0; i < HT.size(); i++) {
            Node* cur = HT[i];
            while (cur) {
                Node* next = cur->next; // remember the rest of the old chain
                size_t num = returnHash(returnKey(cur->data)) % newve.size();
                cur->next = newve[num]; // head insertion into the new bucket
                newve[num] = cur;
                cur = next;
            }
        }
        HT.swap(newve); // nodes are reused, so nothing needs to be freed
    }

    // Head insertion of the new node into its bucket.
    size_t num = returnHash(returnKey(_data)) % HT.size();
    Node* newnode = new Node(_data);
    newnode->next = HT[num];
    HT[num] = newnode;
    _n++;
    return std::make_pair(iterator(newnode, this), true);
}
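The insert function above doubles the bucket array once the element count equals the bucket count, that is, once the load factor (elements divided by buckets) reaches 1. For comparison, the sketch below observes the same mechanism in the standard container through its public bucket interface. `count_rehashes` is a hypothetical helper; the exact growth policy of `std::unordered_set` is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Inserts 0 .. n-1 into a std::unordered_set and returns how many times the
// bucket count changed along the way, i.e. how often the table was rehashed.
std::size_t count_rehashes(std::size_t n) {
    std::unordered_set<std::size_t> s;
    std::size_t rehashes = 0;
    std::size_t buckets = s.bucket_count();
    for (std::size_t i = 0; i < n; ++i) {
        s.insert(i);
        if (s.bucket_count() != buckets) {
            ++rehashes;
            buckets = s.bucket_count();
        }
        // After every insert the container keeps size / bucket_count within the limit.
        assert(s.load_factor() <= s.max_load_factor());
    }
    return rehashes;
}
```

Because each rehash grows the bucket array substantially, the number of rehashes stays small relative to the number of insertions, which is what keeps insertion O(1) on average.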

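Iterator wrapper

// ptr and ref are T*/T& for iterator and const T*/const T& for const_iterator.
template<typename key, typename T, typename ptr, typename ref, typename keyofT, typename HashFun>
class Hash_iterator {
private:
    typedef HashNode<T> Node;
    Node* _Node;                                     // current node
    const HashTable<key, T, keyofT, HashFun>* HTptr; // owning table, needed to move to the next bucket
    // ... constructors and operator overloads
};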

Complete source

hashbucket.h

#pragma once
#include <cstddef>
#include <string>
#include <utility> // std::pair, std::make_pair
#include <vector>

namespace my_std {

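// Default hash: works for any type implicitly convertible to size_t (integers, chars, ...).
template<typename T>
class _HashFun {
public:
    size_t operator()(const T& val) const { return val; }
};

template<>
class _HashFun<int> {
public:
    size_t operator()(const int& val) const { return static_cast<size_t>(val); }
};

// String hash: polynomial accumulation with multiplier 131.
template<>
class _HashFun<std::string> {
public:
    size_t operator()(const std::string& val) const {
        size_t hash = 0;
        for (size_t i = 0; i < val.size(); i++) {
            // Cast through unsigned char so bytes >= 0x80 do not sign-extend.
            hash = hash * 131 + static_cast<unsigned char>(val[i]);
        }
        return hash;
    }
};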

template<typename T>
class HashNode {
public:
    T data;
    HashNode<T>* next;
    HashNode(const T& val) : data(val), next(nullptr) {}
};

template<typename key, typename T, typename keyofT, typename HashFun = _HashFun<key>>
class HashTable;

template<typename key, typename T, typename ptr, typename ref, typename keyofT, typename HashFun = _HashFun<key>>
class Hash_iterator {
private:
    typedef HashNode<T> Node;
    typedef Hash_iterator<key, T, ptr, ref, keyofT, HashFun> self;
    typedef Hash_iterator<key, T, T*, T&, keyofT, HashFun> iterator;
    Node* _Node;
    const HashTable<key, T, keyofT, HashFun>* HTptr;

public:
    Hash_iterator(Node* ptr, const HashTable<key, T, keyofT, HashFun>* _HTptr) : _Node(ptr), HTptr(_HTptr) {}
    Hash_iterator(const iterator& it) : _Node(it.getNode()), HTptr(it.getHTptr()) {}

    Node* getNode() const { return _Node; }
    const HashTable<key, T, keyofT, HashFun>* getHTptr() const { return HTptr; }

    self& operator++() {
        if (!_Node) return *this;
        if (_Node->next) {
            _Node = _Node->next;
            return *this;
        }
        keyofT returnKey;
        HashFun returnHash;
        size_t currentnum = returnHash(returnKey(_Node->data)) % HTptr->HT.size();
        size_t nextnum = currentnum + 1;
        while (nextnum < HTptr->HT.size() && !HTptr->HT[nextnum]) {
            nextnum++;
        }
        if (nextnum < HTptr->HT.size()) {
            _Node = HTptr->HT[nextnum];
        } else {
            _Node = nullptr;
        }
        return *this;
    }

    // Accessors and comparisons do not modify the iterator, so they are const.
    ref operator*() const { return _Node->data; }
    ptr operator->() const { return &_Node->data; }

    bool operator==(const self& p) const { return _Node == p._Node; }
    bool operator!=(const self& p) const { return _Node != p._Node; }
};

template<typename key, typename T, typename keyofT, typename HashFun>
class HashTable {
private:
    typedef HashNode<T> Node;
    std::vector<Node*> HT;
    size_t _n;

public:
    typedef Hash_iterator<key, T, T*, T&, keyofT, HashFun> iterator;
    typedef Hash_iterator<key, T, const T*, const T&, keyofT, HashFun> const_iterator;

    template<typename key_, typename T_, typename ptr_, typename ref_, typename keyofT_, typename HashFun_>
    friend class Hash_iterator;

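    HashTable() : _n(0) { HT.resize(10, nullptr); }

    // The table owns raw node pointers, so a memberwise copy would lead to a double delete.
    // Copying is disabled here rather than implemented.
    HashTable(const HashTable&) = delete;
    HashTable& operator=(const HashTable&) = delete;

    // Free every node in every bucket.
    ~HashTable() {
        for (size_t i = 0; i < HT.size(); i++) {
            Node* cur = HT[i];
            while (cur) {
                Node* next = cur->next;
                delete cur;
                cur = next;
            }
            HT[i] = nullptr;
        }
    }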

    iterator find(const key& k) {
        keyofT returnKey;
        HashFun returnHash;
        size_t num = returnHash(k) % HT.size();
        Node* cur = HT[num];
        while (cur) {
            if (returnKey(cur->data) == k) return iterator(cur, this);
            cur = cur->next;
        }
        return iterator(nullptr, this);
    }

    const_iterator find(const key& k) const {
        HashFun returnHash;
        keyofT returnKey;
        size_t num = returnHash(k) % HT.size();
        Node* cur = HT[num];
        while (cur) {
            if (returnKey(cur->data) == k) return const_iterator(cur, this);
            cur = cur->next;
        }
        return const_iterator(nullptr, this);
    }

    std::pair<iterator, bool> insert(const T& _data) {
        HashFun returnHash;
        keyofT returnKey;
        // Reject duplicates: return the existing element and false.
        iterator existing = find(returnKey(_data));
        if (existing != end()) return std::make_pair(existing, false);

        // Grow when the load factor (_n / bucket count) reaches 1.
        if (_n == HT.size()) {
            std::vector<Node*> newve;
            size_t newsize = 2 * HT.size();
            newve.resize(newsize);
            for (size_t i = 0; i < HT.size(); i++) {
                Node* cur = HT[i];
                while (cur) {
                    Node* next = cur->next; // remember the rest of the old chain
                    size_t num = returnHash(returnKey(cur->data)) % newve.size();
                    cur->next = newve[num]; // head insertion into the new bucket
                    newve[num] = cur;
                    cur = next;
                }
            }
            HT.swap(newve); // nodes are reused, so nothing needs to be freed
        }

        // Head insertion of the new node into its bucket.
        size_t num = returnHash(returnKey(_data)) % HT.size();
        Node* newnode = new Node(_data);
        newnode->next = HT[num];
        HT[num] = newnode;
        _n++;
        return std::make_pair(iterator(newnode, this), true);
    }

    void erase(const key& k) {
        HashFun returnHash;
        keyofT returnKey;
        size_t num = returnHash(k) % HT.size();
        Node* cur = HT[num];
        Node* prev = nullptr;
        while (cur) {
            if (returnKey(cur->data) == k) {
                if (prev) {
                    prev->next = cur->next;
                } else {
                    HT[num] = cur->next;
                }
                delete cur;
                _n--;
                return;
            }
            prev = cur;
            cur = cur->next;
        }
    }

    // begin() is the first node of the first non-empty bucket.
    iterator begin() {
        for (size_t i = 0; i < HT.size(); i++) {
            if (HT[i]) return iterator(HT[i], this);
        }
        return iterator(nullptr, this);
    }

    const_iterator begin() const {
        for (size_t i = 0; i < HT.size(); i++) {
            if (HT[i]) return const_iterator(HT[i], this);
        }
        return const_iterator(nullptr, this);
    }

    iterator end() { return iterator(nullptr, this); }
    const_iterator end() const { return const_iterator(nullptr, this); }
};

} // namespace my_std

myunordered_map.h & myunordered_set.h

// ---------- myunordered_map.h ----------
#pragma once
#include <utility>
#include "hashbucket.h"

namespace wz {

template<typename key, typename val>
class unordered_map {
private:
    // The key half of the stored pair is const so it cannot be changed through an iterator.
    // The table itself is keyed on plain `key` so the _HashFun specializations still apply.
    typedef std::pair<const key, val> value_type;

    // Extracts the key from a stored pair.
    class keyofMapT {
    public:
        const key& operator()(const value_type& _kv) const { return _kv.first; }
    };
    typedef my_std::HashTable<key, value_type, keyofMapT> Table;
    Table _HT;

public:
    typedef typename Table::iterator iterator;
    typedef typename Table::const_iterator const_iterator;

    std::pair<iterator, bool> insert(const value_type& data) { return _HT.insert(data); }

    iterator find(const key& k) { return _HT.find(k); }
    const_iterator find(const key& k) const { return _HT.find(k); }

    void erase(const key& k) { _HT.erase(k); }

    iterator begin() { return _HT.begin(); }
    const_iterator begin() const { return _HT.begin(); }
    iterator end() { return _HT.end(); }
    const_iterator end() const { return _HT.end(); }

    // Inserts a value-initialized val if k is absent, then returns a reference to the mapped value.
    val& operator[](const key& k) {
        std::pair<iterator, bool> _pair = insert(value_type(k, val()));
        return _pair.first->second;
    }
};

} // namespace wz

// ---------- myunordered_set.h ----------
#pragma once
#include <utility>
#include "hashbucket.h"

namespace wz {

template<typename key>
class unordered_set {
private:
    // For a set the stored element is the key itself.
    class keyofSetT {
    public:
        const key& operator()(const key& k) const { return k; }
    };
    typedef my_std::HashTable<key, key, keyofSetT> Table;
    Table _HT;

public:
    // Both iterator types are const_iterator so that keys cannot be modified in place.
    typedef typename Table::const_iterator iterator;
    typedef typename Table::const_iterator const_iterator;

    std::pair<iterator, bool> insert(const key& k) {
        // Table::insert returns the mutable iterator; convert it to the const one.
        std::pair<typename Table::iterator, bool> p = _HT.insert(k);
        return std::pair<iterator, bool>(p.first, p.second);
    }

    iterator find(const key& k) const { return _HT.find(k); }
    void erase(const key& k) { _HT.erase(k); }
    iterator begin() const { return _HT.begin(); }
    iterator end() const { return _HT.end(); }
};

} // namespace wz
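The `operator[]` above follows the same contract as the standard container: if the key is absent, a value-initialized element is inserted first, and a reference to the mapped value is returned either way. As a small usage sketch of that contract, shown with `std::unordered_map` so that the block compiles on its own (`count_words` is a hypothetical helper):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Counts how often each word occurs. The first counts[w] for a new word
// inserts it with the value 0, so ++ turns it into 1.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];
    }
    return counts;
}
```

This insert-then-return behavior is what makes `operator[]` convenient for counters, and also why it cannot be offered on a const map.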

main.cpp

#include <cassert>
#include <iostream>
#include <string>
#include "hashbucket.h"
#include "myunordered_map.h"
#include "myunordered_set.h"

// Names are qualified explicitly (wz::, my_std::) instead of pulling in whole namespaces,
// so wz::unordered_set can never be confused with std::unordered_set.
using std::cout;
using std::endl;

void test_HashFun() {
    cout << "Testing _HashFun..." << endl;
    my_std::_HashFun<int> intHash;
    assert(intHash(42) == 42);
    cout << "Int hash function test passed." << endl;
    my_std::_HashFun<std::string> stringHash;
    size_t hash1 = stringHash("hello");
    size_t hash2 = stringHash("world");
    assert(hash1 != hash2);
    cout << "String hash function test passed." << endl;
    cout << "All _HashFun tests passed!" << endl << endl;
}

void test_unordered_set() {
    cout << "Testing unordered_set..." << endl;
    wz::unordered_set<int> uset;
    auto result = uset.insert(42);
    assert(result.second == true);
    assert(*result.first == 42);
    cout << "Set insert test passed." << endl;
    auto result2 = uset.insert(42);
    assert(result2.second == false);
    cout << "Set duplicate insert test passed." << endl;
    auto found = uset.find(42);
    assert(found != uset.end());
    assert(*found == 42);
    cout << "Set find test passed." << endl;
    uset.erase(42);
    assert(uset.find(42) == uset.end());
    cout << "Set erase test passed." << endl;
    cout << "All unordered_set tests passed!" << endl << endl;
}

void test_unordered_map() {
    cout << "Testing unordered_map..." << endl;
    wz::unordered_map<int, std::string> umap;
    auto result = umap.insert({42, "forty-two"});
    assert(result.second == true);
    assert(result.first->first == 42);
    assert(result.first->second == "forty-two");
    cout << "Map insert test passed." << endl;
    // A duplicate key must not overwrite the existing value.
    auto result2 = umap.insert({42, "another value"});
    assert(result2.second == false);
    assert(result2.first->second == "forty-two");
    cout << "Map duplicate insert test passed." << endl;
    auto found = umap.find(42);
    assert(found != umap.end());
    assert(found->first == 42);
    assert(found->second == "forty-two");
    cout << "Map find test passed." << endl;
    umap.erase(42);
    assert(umap.find(42) == umap.end());
    cout << "Map erase test passed." << endl;
    cout << "All unordered_map tests passed!" << endl << endl;
}

int main() {
    cout << "Starting comprehensive tests for hash table implementation..." << endl << endl;
    test_HashFun();
    test_unordered_set();
    test_unordered_map();
    cout << "All tests passed! Hash table implementation is working correctly." << endl;
    return 0;
}
