STL 中 set 与 map 的实现原理及高频算法题实战

set 类的实现

set 的声明中，T 代表底层关键字的类型。默认情况下，set 要求支持 T 的比较操作；如果不支持或需要自定义排序规则，可以传入仿函数作为第二个模板参数。内存方面，set 默认从空间配置器申请，若需优化可替换为自定义内存池。

底层采用红黑树实现，增删查的时间复杂度均为 O(logN)。迭代器遍历时走的是搜索树的中序遍历，因此数据天然有序（升序）。

set 的构造和迭代器

插入结构数字时，set<int> s 会将 set 视为 T 类型的模板参数，自动完成排序和去重。

#include <iostream>
#include <set>
using namespace std;

int main() {
    set<int> s;
    // set<int, greater<int>> s; // 降序排列
    s.insert(4);
    s.insert(3);
    s.insert(8);
    s.insert(9);
    s.insert(2);
    s.insert(6);

    auto it = s.begin();
    while (it != s.end()) {
        cout << *it << " ";
        // *it = 1; // 报错：set 不支持修改元素值，否则会破坏红黑树结构
        ++it;
    }
    cout << endl; // 输出：2 3 4 6 8 9
    return 0;
}

若要降序排列，可使用 greater<int> 作为比较器，此时比根大的数传到左边，比根小的数传到右边。

此外，还支持通过 initializer_list 列表直接插入，重复值会被忽略。

s.insert({2, 3, 4, 5});
for (auto e : s) {
    cout << e << " ";
}

set: erase 和 find

删除最小值：由于 set 按升序存储，调用 s.erase(s.begin()) 即可删除最小的值。

s.erase(s.begin());
for (auto e : s) {
    cout << e << " "; // 3 4 6 8 9
}

删除指定元素：可以通过值查找并删除，也可以先找到迭代器再删除。

// 方法 1：直接删除值，返回删除的元素个数
int x;
cin >> x;
int num = s.erase(x);
if (num == 0) {
    cout << x << "不存在" << endl;
} else {
    cout << x << "删除成功" << endl;
}

// 方法 2：先查找再删除
int y;
cin >> y;
auto pos = s.find(y);
if (pos != s.end()) {
    s.erase(pos);
    cout << y << "删除成功" << endl;
} else {
    cout << y << "不存在" << endl;
}

注意迭代器失效问题：在方法 2 中，erase(pos) 后，pos 指向的节点已被移除，迭代器立即失效。如果后续尝试访问 *pos，会导致野指针错误。特别是当删除根节点时，虽然红黑树结构可能通过替代法维持，但原迭代器的语义已改变，务必避免使用。

查找效率对比：容器自身的 find 方法利用红黑树特性，复杂度为 O(logN)，优于算法库中的通用 find（O(N)）。

int x;
auto pos1 = find(s.begin(), s.end(), x); // O(N)
auto pos2 = s.find(x);                    // O(logN)

set: count

利用 count 进行间接查找。对于 set 而言，元素唯一，存在则返回 1，不存在返回 0。

cin >> x;
if (s.count(x)) {
    cout << x << "在集合中" << endl;
} else {
    cout << x << "不在集合中" << endl;
}

set: lower_bound 和 upper_bound

这两个接口用于查找区间。lower_bound 返回第一个大于等于目标值的迭代器，upper_bound 返回第一个大于目标值的迭代器。两者结合可实现左闭右开区间的操作。

#include <iostream>
#include <set>
using namespace std;

int main() {
    set<int> mset;
    for (int i = 1; i < 10; i++) {
        mset.insert(i * 10);
    }
    for (auto e : mset) {
        cout << e << " ";
    }
    cout << endl; // 10 20 30 40 50 60 70 80 90

    // 查找 [30, 50] 区间
    auto itlow = mset.lower_bound(30); // >= 30
    auto itup = mset.upper_bound(50);  // > 50
    mset.erase(itlow, itup);           // 删除这段区间的值

    for (auto e : mset) {
        cout << e << " ";
    }
    cout << endl; // 10 20 60 70 80 90
    return 0;
}

multiset 和 set

multiset 与 set 用法基本一致，核心区别在于 multiset 允许键值冗余，即只排序不去重。

int main() {
    multiset<int> s = {4, 2, 3, 1, 5, 6, 8, 6, 43, 2, 5};
    auto it = s.begin();
    while (it != s.end()) {
        cout << *it << " ";
        ++it;
    }
    cout << endl;

    int x;
    cin >> x;
    auto pos = s.find(x);
    // 遍历所有等于 x 的元素
    while (pos != s.end() && *pos == x) {
        cout << *pos << " ";
        ++pos;
    }
    cout << endl;
    return 0;
}

关于查找：在 multiset 中查找某个多次出现的数字，find 会返回中序遍历的第一个匹配项。配合迭代器自增，可以遍历出该值的所有实例。

map：insert

map 的 insert 返回值是 pair<iterator, bool>。展开来看，value_type 是 pair<const key_type, mapped_type>。

插入成功：返回 pair<新插入值的迭代器，true>。
插入失败：说明 key 已存在，返回 pair<已存在的 key 所在节点的迭代器，false>。

这意味着 insert 不仅负责插入，还兼具查找功能。如果 key 已存在，不会覆盖旧值，而是返回现有元素的迭代器。

map: operator[]

operator[] 的内部逻辑非常巧妙，它结合了插入和修改的功能。

mapped_type& operator[](const key_type& k) {
    pair<iterator, bool> ret = insert({k, mapped_type()});
    iterator it = ret.first;
    return it->second;
}

key 不存在：insert 会插入一个默认构造的 value（如 int 为 0），并返回新节点的迭代器。operator[] 返回该 value 的引用，允许直接修改。
key 存在：insert 失败，但返回已存在节点的迭代器。operator[] 同样返回该 value 的引用。

这种机制常用于词频统计等场景。

int main() {
    string myarray[] = {"秋", "冬", "夏", "春", "夏", "秋", "夏", "春", "夏", "秋", "秋", "冬"};
    map<string, int> countMap;
    for (const auto& e : myarray) {
        countMap[e]++; // 若不存在则初始化为 0 再 +1，若存在则直接 +1
    }
    for (const auto& it : countMap) {
        cout << it.first << ":" << it.second << endl;
    }
    return 0;
}

multimap 和 map 的差异

multimap 与 map 类似，主要区别在于支持 key 冗余（不去重）。

Insert/Find/Erase：围绕 key 冗余有所调整。find 时若有多个相同 key，返回中序的第一个。
Operator[]：multimap 不支持 operator[]，因为无法确定要修改哪一个 key 对应的 value，只能插入。

力扣题目实战

两个数组的交集

思路：利用 set 的去重和有序特性。将两个数组分别转为 set，然后使用双指针遍历。

初始化两个指针 it1, it2 分别指向两个 set 的起始位置。
若 *it1 < *it2，说明 it1 当前值较小，不可能是交集，it1++。
若 *it1 > *it2，同理 it2++。
若相等，则是交集元素，加入结果集，同时移动两个指针。
循环直到任一指针到达末尾。

class Solution {
public:
    vector<int> intersection(vector<int>& nums1, vector<int>& nums2) {
        vector<int> ret;
        set<int> s1(nums1.begin(), nums1.end());
        set<int> s2(nums2.begin(), nums2.end());

        auto it1 = s1.begin();
        auto it2 = s2.begin();
        while (it1 != s1.end() && it2 != s2.end()) {
            if (*it1 < *it2) {
                ++it1;
            } else if (*it1 > *it2) {
                ++it2;
            } else {
                ret.push_back(*it1);
                ++it1;
                ++it2;
            }
        }
        return ret;
    }
};

环形链表

思路：利用 set 的唯一性检测节点是否重复访问。

创建一个 set<ListNode*> 存储已访问过的节点指针。
遍历链表，检查当前节点 cur 是否在 set 中。
若存在，说明遇到了环的入口，返回 cur。
若不存在，将 cur 插入 set，继续向后遍历。
若走到空指针仍未发现重复，说明无环，返回 nullptr。

class Solution {
public:
    ListNode *detectCycle(ListNode *head) {
        set<ListNode*> s;
        ListNode* cur = head;
        while (cur) {
            if (s.count(cur)) {
                return cur;
            } else {
                s.insert(cur);
                cur = cur->next;
            }
        }
        return nullptr;
    }
};

随机链表的复制

思路：深拷贝需要处理 next 和 random 指针。利用 map 建立原节点与新节点的映射关系。

第一次遍历：创建新链表节点，同时用 map 记录 原节点 -> 新节点 的映射。
第二次遍历：根据 map 设置新节点的 random 指针。

class Solution {
public:
    Node* copyRandomList(Node* head) {
        map<Node*, Node*> randomMap;
        Node* copyhead = nullptr, *copytail = nullptr;
        Node* ptr = head;

        // 第一次遍历：构建 next 并建立映射
        while (ptr) {
            if (!copytail) {
                copytail = copyhead = new Node(ptr->val);
            } else {
                copytail->next = new Node(ptr->val);
                copytail = copytail->next;
            }
            randomMap[ptr] = copytail;
            ptr = ptr->next;
        }

        // 第二次遍历：设置 random
        Node* copy = copyhead;
        ptr = head;
        while (ptr) {
            if (ptr->random == nullptr) {
                copy->random = nullptr;
            } else {
                copy->random = randomMap[ptr->random];
            }
            copy = copy->next;
            ptr = ptr->next;
        }
        return copyhead;
    }
};

前 k 个高频单词

思路：统计频率后排序。map 天然按键字典序排序，stable_sort 保证频率相同时保持原有顺序。

使用 map<string, int> 统计单词频率。
将 map 转换为 vector<pair<string, int>> 以便排序。
使用 stable_sort 自定义比较器，优先按频率降序，频率相同则保持 map 中的字典序（升序）。
取前 k 个单词。

class Solution {
public:
    struct kvFunction {
        bool operator()(const pair<string, int>& w1, const pair<string, int>& w2) {
            return w1.second > w2.second; // 次数多的排在前面
        }
    };

    vector<string> topKFrequent(vector<string>& words, int k) {
        map<string, int> countMap;
        for (auto& it : words) {
            countMap[it]++;
        }

        vector<pair<string, int>> v(countMap.begin(), countMap.end());
        stable_sort(v.begin(), v.end(), kvFunction());

        vector<string> ret;
        for (int i = 0; i < k; i++) {
            ret.push_back(v[i].first);
        }
        return ret;
    }
};

set 和 map 的构造对比

set：支持正向和反向迭代，默认升序。迭代器不可修改数据，否则破坏树结构。
map：支持正向和反向迭代，默认按 key 升序。支持修改 value，但不支持修改 key。

掌握这些底层特性和 API 行为，能帮助你更准确地设计数据结构解决方案。

set 类的实现

底层采用红黑树实现，增删查的时间复杂度均为 O(logN)。迭代器遍历时走的是搜索树的中序遍历，因此数据天然有序（升序）。

set 的构造和迭代器

插入结构数字时，set<int> s 会将 set 视为 T 类型的模板参数，自动完成排序和去重。

#include <iostream>
#include <set>
using namespace std;

int main() {
    set<int> s;
    // set<int, greater<int>> s; // 降序排列
    s.insert(4);
    s.insert(3);
    s.insert(8);
    s.insert(9);
    s.insert(2);
    s.insert(6);

    auto it = s.begin();
    while (it != s.end()) {
        cout << *it << " ";
        // *it = 1; // 报错：set 不支持修改元素值，否则会破坏红黑树结构
        ++it;
    }
    cout << endl; // 输出：2 3 4 6 8 9
    return 0;
}

若要降序排列，可使用 greater<int> 作为比较器，此时比根大的数传到左边，比根小的数传到右边。

此外，还支持通过 initializer_list 列表直接插入，重复值会被忽略。

s.insert({2, 3, 4, 5});
for (auto e : s) {
    cout << e << " ";
}

set: erase 和 find

删除最小值：由于 set 按升序存储，调用 s.erase(s.begin()) 即可删除最小的值。

s.erase(s.begin());
for (auto e : s) {
    cout << e << " "; // 3 4 6 8 9
}

删除指定元素：可以通过值查找并删除，也可以先找到迭代器再删除。

// 方法 1：直接删除值，返回删除的元素个数
int x;
cin >> x;
int num = s.erase(x);
if (num == 0) {
    cout << x << "不存在" << endl;
} else {
    cout << x << "删除成功" << endl;
}

// 方法 2：先查找再删除
int y;
cin >> y;
auto pos = s.find(y);
if (pos != s.end()) {
    s.erase(pos);
    cout << y << "删除成功" << endl;
} else {
    cout << y << "不存在" << endl;
}

查找效率对比：容器自身的 find 方法利用红黑树特性，复杂度为 O(logN)，优于算法库中的通用 find（O(N)）。

int x;
auto pos1 = find(s.begin(), s.end(), x); // O(N)
auto pos2 = s.find(x);                    // O(logN)

set: count

利用 count 进行间接查找。对于 set 而言，元素唯一，存在则返回 1，不存在返回 0。

cin >> x;
if (s.count(x)) {
    cout << x << "在集合中" << endl;
} else {
    cout << x << "不在集合中" << endl;
}

set: lower_bound 和 upper_bound

#include <iostream>
#include <set>
using namespace std;

int main() {
    set<int> mset;
    for (int i = 1; i < 10; i++) {
        mset.insert(i * 10);
    }
    for (auto e : mset) {
        cout << e << " ";
    }
    cout << endl; // 10 20 30 40 50 60 70 80 90

    // 查找 [30, 50] 区间
    auto itlow = mset.lower_bound(30); // >= 30
    auto itup = mset.upper_bound(50);  // > 50
    mset.erase(itlow, itup);           // 删除这段区间的值

    for (auto e : mset) {
        cout << e << " ";
    }
    cout << endl; // 10 20 60 70 80 90
    return 0;
}

multiset 和 set

multiset 与 set 用法基本一致，核心区别在于 multiset 允许键值冗余，即只排序不去重。

int main() {
    multiset<int> s = {4, 2, 3, 1, 5, 6, 8, 6, 43, 2, 5};
    auto it = s.begin();
    while (it != s.end()) {
        cout << *it << " ";
        ++it;
    }
    cout << endl;

    int x;
    cin >> x;
    auto pos = s.find(x);
    // 遍历所有等于 x 的元素
    while (pos != s.end() && *pos == x) {
        cout << *pos << " ";
        ++pos;
    }
    cout << endl;
    return 0;
}

关于查找：在 multiset 中查找某个多次出现的数字，find 会返回中序遍历的第一个匹配项。配合迭代器自增，可以遍历出该值的所有实例。

map：insert

map 的 insert 返回值是 pair<iterator, bool>。展开来看，value_type 是 pair<const key_type, mapped_type>。

插入成功：返回 pair<新插入值的迭代器，true>。
插入失败：说明 key 已存在，返回 pair<已存在的 key 所在节点的迭代器，false>。

这意味着 insert 不仅负责插入，还兼具查找功能。如果 key 已存在，不会覆盖旧值，而是返回现有元素的迭代器。

map: operator[]

operator[] 的内部逻辑非常巧妙，它结合了插入和修改的功能。

mapped_type& operator[](const key_type& k) {
    pair<iterator, bool> ret = insert({k, mapped_type()});
    iterator it = ret.first;
    return it->second;
}

key 不存在：insert 会插入一个默认构造的 value（如 int 为 0），并返回新节点的迭代器。operator[] 返回该 value 的引用，允许直接修改。
key 存在：insert 失败，但返回已存在节点的迭代器。operator[] 同样返回该 value 的引用。

这种机制常用于词频统计等场景。

int main() {
    string myarray[] = {"秋", "冬", "夏", "春", "夏", "秋", "夏", "春", "夏", "秋", "秋", "冬"};
    map<string, int> countMap;
    for (const auto& e : myarray) {
        countMap[e]++; // 若不存在则初始化为 0 再 +1，若存在则直接 +1
    }
    for (const auto& it : countMap) {
        cout << it.first << ":" << it.second << endl;
    }
    return 0;
}

multimap 和 map 的差异

multimap 与 map 类似，主要区别在于支持 key 冗余（不去重）。

Insert/Find/Erase：围绕 key 冗余有所调整。find 时若有多个相同 key，返回中序的第一个。
Operator[]：multimap 不支持 operator[]，因为无法确定要修改哪一个 key 对应的 value，只能插入。

力扣题目实战

两个数组的交集

思路：利用 set 的去重和有序特性。将两个数组分别转为 set，然后使用双指针遍历。

初始化两个指针 it1, it2 分别指向两个 set 的起始位置。
若 *it1 < *it2，说明 it1 当前值较小，不可能是交集，it1++。
若 *it1 > *it2，同理 it2++。
若相等，则是交集元素，加入结果集，同时移动两个指针。
循环直到任一指针到达末尾。

class Solution {
public:
    vector<int> intersection(vector<int>& nums1, vector<int>& nums2) {
        vector<int> ret;
        set<int> s1(nums1.begin(), nums1.end());
        set<int> s2(nums2.begin(), nums2.end());

        auto it1 = s1.begin();
        auto it2 = s2.begin();
        while (it1 != s1.end() && it2 != s2.end()) {
            if (*it1 < *it2) {
                ++it1;
            } else if (*it1 > *it2) {
                ++it2;
            } else {
                ret.push_back(*it1);
                ++it1;
                ++it2;
            }
        }
        return ret;
    }
};

环形链表

思路：利用 set 的唯一性检测节点是否重复访问。

创建一个 set<ListNode*> 存储已访问过的节点指针。
遍历链表，检查当前节点 cur 是否在 set 中。
若存在，说明遇到了环的入口，返回 cur。
若不存在，将 cur 插入 set，继续向后遍历。
若走到空指针仍未发现重复，说明无环，返回 nullptr。

class Solution {
public:
    ListNode *detectCycle(ListNode *head) {
        set<ListNode*> s;
        ListNode* cur = head;
        while (cur) {
            if (s.count(cur)) {
                return cur;
            } else {
                s.insert(cur);
                cur = cur->next;
            }
        }
        return nullptr;
    }
};

随机链表的复制

思路：深拷贝需要处理 next 和 random 指针。利用 map 建立原节点与新节点的映射关系。

第一次遍历：创建新链表节点，同时用 map 记录 原节点 -> 新节点 的映射。
第二次遍历：根据 map 设置新节点的 random 指针。

class Solution {
public:
    Node* copyRandomList(Node* head) {
        map<Node*, Node*> randomMap;
        Node* copyhead = nullptr, *copytail = nullptr;
        Node* ptr = head;

        // 第一次遍历：构建 next 并建立映射
        while (ptr) {
            if (!copytail) {
                copytail = copyhead = new Node(ptr->val);
            } else {
                copytail->next = new Node(ptr->val);
                copytail = copytail->next;
            }
            randomMap[ptr] = copytail;
            ptr = ptr->next;
        }

        // 第二次遍历：设置 random
        Node* copy = copyhead;
        ptr = head;
        while (ptr) {
            if (ptr->random == nullptr) {
                copy->random = nullptr;
            } else {
                copy->random = randomMap[ptr->random];
            }
            copy = copy->next;
            ptr = ptr->next;
        }
        return copyhead;
    }
};

前 k 个高频单词

思路：统计频率后排序。map 天然按键字典序排序，stable_sort 保证频率相同时保持原有顺序。

使用 map<string, int> 统计单词频率。
将 map 转换为 vector<pair<string, int>> 以便排序。
使用 stable_sort 自定义比较器，优先按频率降序，频率相同则保持 map 中的字典序（升序）。
取前 k 个单词。

class Solution {
public:
    struct kvFunction {
        bool operator()(const pair<string, int>& w1, const pair<string, int>& w2) {
            return w1.second > w2.second; // 次数多的排在前面
        }
    };

    vector<string> topKFrequent(vector<string>& words, int k) {
        map<string, int> countMap;
        for (auto& it : words) {
            countMap[it]++;
        }

        vector<pair<string, int>> v(countMap.begin(), countMap.end());
        stable_sort(v.begin(), v.end(), kvFunction());

        vector<string> ret;
        for (int i = 0; i < k; i++) {
            ret.push_back(v[i].first);
        }
        return ret;
    }
};

set 和 map 的构造对比

set：支持正向和反向迭代，默认升序。迭代器不可修改数据，否则破坏树结构。
map：支持正向和反向迭代，默认按 key 升序。支持修改 value，但不支持修改 key。

掌握这些底层特性和 API 行为，能帮助你更准确地设计数据结构解决方案。

STL 中 set 与 map 的实现原理及高频算法题实战

set 类的实现

set 的构造和迭代器

set: erase 和 find

set: count

set: lower_bound 和 upper_bound

multiset 和 set

map：insert

map: operator[]

multimap 和 map 的差异

力扣题目实战

两个数组的交集

环形链表

随机链表的复制

前 k 个高频单词

set 和 map 的构造对比

STL 中 set 与 map 的实现原理及高频算法题实战

set 类的实现

set 的构造和迭代器

set: erase 和 find

set: count

set: lower_bound 和 upper_bound

multiset 和 set

map：insert

map: operator[]

multimap 和 map 的差异

力扣题目实战

两个数组的交集

环形链表

随机链表的复制

前 k 个高频单词

set 和 map 的构造对比

更多推荐文章

相关免费在线工具

更多推荐文章

相关免费在线工具

STL 中 set 与 map 的实现原理及高频算法题实战

set 类的实现

set 的构造和迭代器

set: erase 和 find

set: count

set: lower_bound 和 upper_bound

multiset 和 set

map：insert

map: operator[]

multimap 和 map 的差异

力扣题目实战

两个数组的交集

环形链表

随机链表的复制

前 k 个高频单词

set 和 map 的构造对比

STL 中 set 与 map 的实现原理及高频算法题实战

set 类的实现

set 的构造和迭代器

set: erase 和 find

set: count

set: lower_bound 和 upper_bound

multiset 和 set

map：insert

map: operator[]

multimap 和 map 的差异

力扣题目实战

两个数组的交集

环形链表

随机链表的复制

前 k 个高频单词

set 和 map 的构造对比

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具