[C++][Third-Party Libraries][Elasticsearch] A Detailed Guide

Ne0inhk

22 Mar 2026 — 10 min read

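1. Introduction

Elasticsearch (ES) is an open-source distributed search engine.
- Features: distributed, zero configuration, automatic node discovery, automatic index sharding, index replicas, a RESTful interface, multiple data sources, automatic search load balancing, and more
- It stores and retrieves data in near real time and scales well: a cluster can grow to hundreds of servers and handle petabytes of data
- ES is written in Java and uses Lucene at its core for all indexing and search, but its goal is to hide Lucene's complexity behind a simple RESTful API so that full-text search becomes easy
Elasticsearch is **document oriented**.
- This means it can store entire objects or documents
- It does more than store them: it also indexes the content of every document so that it can be searched
  - You index, search, sort, and filter documents rather than rows and columns of data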

2. Installation

1. ES

Add the repository key. The first form triggers an apt-key deprecation warning; use the second form to avoid it:

    # Option 1
    wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
    # Option 2
    curl -s https://artifacts.elastic.co/GPG-KEY-elasticsearch | \
      sudo gpg --no-default-keyring \
      --keyring gnupg-ring:/etc/apt/trusted.gpg.d/icsearch.gpg --import

Add the package repository:

    echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elasticsearch.list

Update the package list:

    sudo apt update

Install ES:

    sudo apt-get install elasticsearch=7.17.21

Start ES:

    sudo systemctl start elasticsearch

Install the ik analyzer plugin (plugins are loaded at startup, so restart ES afterwards):

    sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install https://get.infini.cloud/elasticsearch/analysis-ik/7.17.21

Check the status of the ES service:

    sudo systemctl status elasticsearch.service

Verify that ES was installed successfully:

    curl -X GET "http://localhost:9200/"

Enable external access. By default ES is reachable only from the local machine; after this change it can be opened in a browser at IP:PORT:

    sudo vim /etc/elasticsearch/elasticsearch.yml
    # Add the following settings
    network.host: 0.0.0.0
    http.port: 9200
    cluster.initial_master_nodes: ["node-1"]

If starting ES fails with this error:

    Job for elasticsearch.service failed because the control process exited with error code. See "systemctl status elasticsearch.service" and "journalctl -xeu elasticsearch.service" for details.

Fix:

    # The default vm.max_map_count of 65530 is too low for ES; raise it to at least 262144
    sudo sysctl -w vm.max_map_count=262144
    # Set the JVM heap size
    sudo vim /etc/elasticsearch/jvm.options
    # Add the following lines
    -Xms512m
    -Xmx512m

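2. Kibana

Install Kibana:

    sudo apt install kibana

Start Kibana:

    sudo systemctl start kibana

Enable start on boot (optional):

    sudo systemctl enable kibana

Configure Kibana (optional): adjust the configuration as needed. The configuration file is usually /etc/kibana/kibana.yml; settings you may need include the server address, the port, and the Elasticsearch URL.
Access Kibana: http://<ip>:5601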

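3. ES Core Concepts

1. Index

An index is a collection of documents that share broadly similar characteristics.
- For example:
  - one index for customer data, another for a product catalog, and another for order data
- An index is identified by a name, which must be entirely lowercase. That name is used whenever documents in the index are indexed, searched, updated, or deleted
A cluster can hold as many indexes as you like.
An index is similar to a database in a relational system.
- A database represents a set of data
- An ES index is a set of data with similar characteristics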
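
The naming rule above can be checked on the client side before a request is sent. The following is a minimal sketch, assuming a hypothetical helper `isValidIndexName` that is not part of Elasticsearch or elasticlient. It enforces only a subset of the server's rules, so Elasticsearch remains the authority and will still reject names this check lets through.

```cpp
#include <string>

// Hypothetical helper (not part of Elasticsearch or elasticlient).
// Checks a subset of the index naming rules: non-empty, at most 255 bytes,
// not "." or "..", no ASCII uppercase letters, none of \ / * ? " < > | , #
// or space, and no leading '-', '_' or '+'.
bool isValidIndexName(const std::string &name) {
    if (name.empty() || name.size() > 255) return false;
    if (name == "." || name == "..") return false;

    const char first = name.front();
    if (first == '-' || first == '_' || first == '+') return false;

    const std::string forbidden = "\\/*?\"<>| ,#";
    for (unsigned char c : name) {
        if (c >= 'A' && c <= 'Z') return false;  // must be lowercase
        if (forbidden.find(static_cast<char>(c)) != std::string::npos) return false;
    }
    return true;
}
```

With this sketch, `isValidIndexName("user")` is accepted, while `"User"`, `"_user"`, and `"user data"` are rejected.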

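2. Type

Within an index you can define one or more types.
A type is a logical category or partition of an index, and its meaning is entirely up to the user.
Typically a type is defined for documents that share a common set of fields.
- For example:
  - you run a blog platform and store all of its data in one index
  - in that index you could define one type for user data, another for blog posts, and another for comments
A type is similar to a table in a database: it subdivides the data inside an index one level further.
Types are now effectively obsolete: they were deprecated in ES 7.x, where every index uses the single type `_doc`, and removed in 8.0.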

3. Field

A field corresponds to a column of a database table. It labels one attribute of a document, and each field has a data type.
![[Pasted image 20240918180030.png]]

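4. Mapping

A mapping constrains how data is processed and which rules apply to it.
- A field's data type, default value, analyzer, whether it is indexed, and so on are all set in the mapping
  - In effect, the mapping tells ES which fields to analyze and index so that they can be searched
- Other rules for how ES handles the data are also part of the mapping
Processing data under well-chosen rules improves performance considerably, which is why mappings are defined explicitly and why their design deserves some thought.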
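Specific parameters:
- enabled: whether the field is parsed and indexed; when false, the value is only kept in `_source` and cannot be searched or analyzed (applies to the top-level mapping and to `object` fields)
  - Values: true (default) / false
- index: whether to build an inverted index for the field (this decides whether the field is analyzed and searchable)
  - Values: true (default) / false
- index_options: what information is recorded in the inverted index (docs, freqs, positions, offsets)
- dynamic: controls whether the mapping is updated automatically when new fields appear
  - Values: true (default) / false / strict
- doc_values: whether to enable doc values, which are used for aggregations and sorting; not available on analyzed (`text`) fields
  - Values: true (default) / false
- fielddata: whether to enable fielddata on a `text` field so that it can be sorted and aggregated
  - This makes sorting and aggregation on analyzed fields possible, at the cost of heap memory
- store: whether the field is stored separately from the `_source` field
  - Values: true / false (default)
- coerce: whether to convert data types automatically, for example string to integer or float to integer
  - Values: true (default) / false
- analyzer: the analyzer to use; the default is the standard analyzer
  - Example: "analyzer": "ik"
- boost: field-level score weight; the default is 1.0
  - Example: "boost": 1.25
- date_detection: whether date values are detected automatically
  - Values: true (default) / false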

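fields: indexes one field in several ways, so that the same value is analyzed in one form and left unanalyzed in another. In ES 7.x the unanalyzed sub-field is a `keyword` field:

    "fields": {"raw": {"type": "keyword"}}

Unanalyzed fields should rely on doc_values, which are enabled by default, while fielddata stays disabled on `text` fields unless they really need to be sorted or aggregated:

    "fielddata": false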
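
To make the mapping parameters more concrete, here is a minimal C++ sketch that assembles the `properties` part of a mapping as a JSON string. The `FieldMapping` struct and `buildProperties` function are hypothetical names introduced for illustration. Only `type`, `analyzer`, and `index` are covered, and the inputs are assumed to need no JSON escaping.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one field in a mapping.
struct FieldMapping {
    std::string name;      // field name, e.g. "nickname"
    std::string type;      // e.g. "text" or "keyword"
    std::string analyzer;  // emitted only when non-empty (meaningful for text fields)
    bool index;            // false: keep the value in _source but do not index it
};

// Builds the "properties" object of a mapping.
// Assumption: names and values contain no characters that need JSON escaping.
std::string buildProperties(const std::vector<FieldMapping> &fields) {
    std::string out = "{";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        const FieldMapping &f = fields[i];
        if (i > 0) out += ",";
        out += "\"" + f.name + "\":{\"type\":\"" + f.type + "\"";
        if (!f.analyzer.empty()) out += ",\"analyzer\":\"" + f.analyzer + "\"";
        if (!f.index) out += ",\"index\":false";  // true is the default, so it is omitted
        out += "}";
    }
    out += "}";
    return out;
}
```

A real project would more likely use a JSON library for this. The string version is used here only to keep the sketch free of dependencies.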

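5. Document

A document is the basic unit of information that can be indexed.
For example: a document for one customer, a document for one product, or a document for one order.
- Documents are expressed in JSON, a ubiquitous data interchange format on the internet
- An index/type can hold as many documents as you like
- A document must be indexed into, that is assigned to, a type within an index

Elasticsearch compared with a traditional relational database: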

| DB | Database | Table | Row | Column |
| --- | --- | --- | --- | --- |
| ES | Index | Type | Document | Field |

4. Testing ES from Kibana

Create the index:

    PUT /user
    {
      "settings": {"analysis": {"analyzer": {"ik": {"tokenizer": "ik_max_word"}}}},
      "mappings": {
        "dynamic": true,
        "properties": {
          "nickname": {"type": "text", "analyzer": "ik_max_word"},
          "user_id": {"type": "keyword"},
          "phone": {"type": "keyword"},
          "description": {"type": "text", "index": false},
          "avatar_id": {"type": "keyword", "index": false}
        }
      }
    }

An index with settings and mappings is created with `PUT /user`; `POST /user/_doc` would store this body as an ordinary document. `keyword` fields take no `analyzer`, and `enabled` applies only to object fields, so `"index": false` marks the fields that are stored but not searched.

Add data. For readability, these are the records to insert (the sample values 昵称, 手机号, 签名, 头像 mean nickname, phone number, signature, avatar):

    [
      {"index":{"_id":"1"},"user":{"user_id":"USER4b862aaa-2df8654a-7eb4bb65e3507f66","nickname":"昵称1","phone":"手机号1","description":"签名1","avatar_id":"头像1"}},
      {"index":{"_id":"2"},"user":{"user_id":"USER14eeeaa5-442771b9-0262e455e4663d1d","nickname":"昵称2","phone":"手机号2","description":"签名2","avatar_id":"头像2"}},
      {"index":{"_id":"3"},"user":{"user_id":"USER484a6734-03a124f0-996c169dd05c1869","nickname":"昵称3","phone":"手机号3","description":"签名3","avatar_id":"头像3"}},
      {"index":{"_id":"4"},"user":{"user_id":"USER186ade83-4460d4a6-8c08068f83127b5d","nickname":"昵称4","phone":"手机号4","description":"签名4","avatar_id":"头像4"}},
      {"index":{"_id":"5"},"user":{"user_id":"USER6f19d074-c33891cf-23bf5a8357189a19","nickname":"昵称5","phone":"手机号5","description":"签名5","avatar_id":"头像5"}},
      {"index":{"_id":"6"},"user":{"user_id":"USER97605c64-9833ebb7-d045535335a59195","nickname":"昵称6","phone":"手机号6","description":"签名6","avatar_id":"头像6"}}
    ]

The form actually sent to ES. The bulk API expects newline-delimited JSON, with every action line and every document on its own line:

    POST /user/_doc/_bulk
    {"index":{"_id":"1"}}
    {"user_id":"USER4b862aaa-2df8654a-7eb4bb65e3507f66","nickname":"昵称1","phone":"手机号1","description":"签名1","avatar_id":"头像1"}
    {"index":{"_id":"2"}}
    {"user_id":"USER14eeeaa5-442771b9-0262e455e4663d1d","nickname":"昵称2","phone":"手机号2","description":"签名2","avatar_id":"头像2"}
    {"index":{"_id":"3"}}
    {"user_id":"USER484a6734-03a124f0-996c169dd05c1869","nickname":"昵称3","phone":"手机号3","description":"签名3","avatar_id":"头像3"}
    {"index":{"_id":"4"}}
    {"user_id":"USER186ade83-4460d4a6-8c08068f83127b5d","nickname":"昵称4","phone":"手机号4","description":"签名4","avatar_id":"头像4"}
    {"index":{"_id":"5"}}
    {"user_id":"USER6f19d074-c33891cf-23bf5a8357189a19","nickname":"昵称5","phone":"手机号5","description":"签名5","avatar_id":"头像5"}
    {"index":{"_id":"6"}}
    {"user_id":"USER97605c64-9833ebb7-d045535335a59195","nickname":"昵称6","phone":"手机号6","description":"签名6","avatar_id":"头像6"}

Query all data:

    POST /user/_doc/_search
    {"query": {"match_all": {}}}

View and search data, excluding three users by ID and matching the keyword against several fields:

    GET /user/_doc/_search?pretty
    {
      "query": {
        "bool": {
          "must_not": [
            {"terms": {"user_id": [
              "USER4b862aaa-2df8654a-7eb4bb65e3507f66",
              "USER14eeeaa5-442771b9-0262e455e4663d1d",
              "USER484a6734-03a124f0-996c169dd05c1869"
            ]}}
          ],
          "should": [
            {"match": {"user_id": "昵称"}},
            {"match": {"nickname": "昵称"}},
            {"match": {"phone": "昵称"}}
          ]
        }
      }
    }

Delete the index:

    DELETE /user
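
The bulk request body in this section can also be generated from application data. Below is a minimal C++ sketch, assuming a hypothetical `User` struct and helper functions `escapeJson` and `buildBulkBody`; none of these names come from elasticlient. It produces one action line and one document line per user, each terminated by a newline.

```cpp
#include <string>
#include <vector>

// Hypothetical record type mirroring the fields used in this section.
struct User {
    std::string id, user_id, nickname, phone, description, avatar_id;
};

// Minimal JSON string escaping: quotes, backslashes and ASCII control characters.
// Bytes >= 0x80 (e.g. UTF-8 encoded Chinese text) are passed through unchanged.
std::string escapeJson(const std::string &s) {
    static const char *hex = "0123456789abcdef";
    std::string out;
    for (unsigned char c : s) {
        if (c == '"') {
            out += "\\\"";
        } else if (c == '\\') {
            out += "\\\\";
        } else if (c < 0x20) {
            out += "\\u00";
            out += hex[c >> 4];
            out += hex[c & 0xF];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Builds a newline-delimited bulk body: an "index" action line followed by
// the document itself, for every user.
std::string buildBulkBody(const std::vector<User> &users) {
    std::string body;
    for (const User &u : users) {
        body += "{\"index\":{\"_id\":\"" + escapeJson(u.id) + "\"}}\n";
        body += "{\"user_id\":\"" + escapeJson(u.user_id) +
                "\",\"nickname\":\"" + escapeJson(u.nickname) +
                "\",\"phone\":\"" + escapeJson(u.phone) +
                "\",\"description\":\"" + escapeJson(u.description) +
                "\",\"avatar_id\":\"" + escapeJson(u.avatar_id) + "\"}\n";
    }
    return body;
}
```

The resulting string is what would follow the `_bulk` request line when pasted into Kibana.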

5. Installing the ES Client

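There are not many C++ clients for ES to choose from; this guide uses the elasticlient library.

Prerequisite: elasticlient depends on the MicroHTTPD library

    sudo apt-get install libmicrohttpd-dev

Install:

    # Clone the source
    git clone https://github.com/seznam/elasticlient
    # Enter the directory
    cd elasticlient
    # Fetch the submodules
    git submodule update --init --recursive
    # Build
    mkdir build && cd build
    cmake ..
    make
    # Install
    make install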

6. ES Client Interface

    /**
     * Perform search on nodes until it is successful. Throws exception if all nodes
     * has failed to respond.
     * \param indexName specification of an Elasticsearch index.
     * \param docType specification of an Elasticsearch document type.
     * \param body Elasticsearch request body.
     * \param routing Elasticsearch routing. If empty, no routing has been used.
     *
     * \return cpr::Response if any of node responds to request.
     * \throws ConnectionException if all hosts in cluster failed to respond.
     */
    cpr::Response search(const std::string &indexName,
                         const std::string &docType,
                         const std::string &body,
                         const std::string &routing = std::string());

    /**
     * Get document with specified id from cluster. Throws exception if all nodes
     * has failed to respond.
     * \param indexName specification of an Elasticsearch index.
     * \param docType specification of an Elasticsearch document type.
     * \param id Id of document which should be retrieved.
     * \param routing Elasticsearch routing. If empty, no routing has been used.
     *
     * \return cpr::Response if any of node responds to request.
     * \throws ConnectionException if all hosts in cluster failed to respond.
     */
    cpr::Response get(const std::string &indexName,
                      const std::string &docType,
                      const std::string &id = std::string(),
                      const std::string &routing = std::string());

    /**
     * Index new document to cluster. Throws exception if all nodes has failed to respond.
     * \param indexName specification of an Elasticsearch index.
     * \param docType specification of an Elasticsearch document type.
     * \param body Elasticsearch request body.
     * \param id Id of document which should be indexed. If empty, id will be generated
     *           automatically by Elasticsearch cluster.
     * \param routing Elasticsearch routing. If empty, no routing has been used.
     *
     * \return cpr::Response if any of node responds to request.
     * \throws ConnectionException if all hosts in cluster failed to respond.
     */
    cpr::Response index(const std::string &indexName,
                        const std::string &docType,
                        const std::string &id,
                        const std::string &body,
                        const std::string &routing = std::string());

    /**
     * Delete document with specified id from cluster. Throws exception if all nodes
     * has failed to respond.
     * \param indexName specification of an Elasticsearch index.
     * \param docType specification of an Elasticsearch document type.
     * \param id Id of document which should be deleted.
     * \param routing Elasticsearch routing. If empty, no routing has been used.
     *
     * \return cpr::Response if any of node responds to request.
     * \throws ConnectionException if all hosts in cluster failed to respond.
     */
    cpr::Response remove(const std::string &indexName,
                         const std::string &docType,
                         const std::string &id,
                         const std::string &routing = std::string());
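
The `body` parameter of `search` is a raw JSON string, so the query has to be assembled by the caller. As a minimal sketch, the hypothetical function `buildUserSearch` below (not part of elasticlient) builds the bool query from section 4: it excludes a list of user IDs and matches a keyword against `user_id`, `nickname`, and `phone`. The field used for the ID filter is passed in because it depends on the mapping: `user_id` when that field is mapped as `keyword`, or `user_id.keyword` when it was mapped dynamically as text. Inputs are assumed to need no JSON escaping.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of elasticlient).
// Assumption: idField, the ids and the keyword need no JSON escaping.
std::string buildUserSearch(const std::string &idField,
                            const std::vector<std::string> &excludedIds,
                            const std::string &keyword) {
    // must_not: drop every document whose id is in excludedIds
    std::string q = "{\"query\":{\"bool\":{\"must_not\":[{\"terms\":{\"" + idField + "\":[";
    for (std::size_t i = 0; i < excludedIds.size(); ++i) {
        if (i > 0) q += ",";
        q += "\"" + excludedIds[i] + "\"";
    }
    q += "]}}],\"should\":[";

    // should: match the keyword against each searchable field
    const std::vector<std::string> fields = {"user_id", "nickname", "phone"};
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i > 0) q += ",";
        q += "{\"match\":{\"" + fields[i] + "\":\"" + keyword + "\"}}";
    }
    q += "]}}}";
    return q;
}
```

The returned string can be passed as the `body` argument of `search`, with the index name and `_doc` as the first two arguments.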

7. Usage

Notes on using the ES client:
- Do not forget the trailing root path in the host address: http://127.0.0.1:9200/
- Wrap calls to the ES client API in exception handling; otherwise a failed operation terminates the program abnormally

    #include <iostream>
    #include <elasticlient/client.h>
    #include <cpr/cpr.h>

    int main() {
        // 1. Construct the ES client
        elasticlient::Client client({"http://127.0.0.1:9200/"});
        // 2. Send a search request
        try {
            auto resp = client.search("user", "_doc", "{\"query\": { \"match_all\":{} }}");
            std::cout << resp.status_code << std::endl;
            std::cout << resp.text << std::endl;
        } catch (const std::exception &e) {
            std::cout << e.what() << std::endl;
            return -1;
        }
        return 0;
    }
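
Both usage notes can be turned into small reusable helpers. The sketch below is one possible approach, not part of elasticlient: `withTrailingSlash` and `tryRequest` are hypothetical names. It assumes that the client's exceptions derive from `std::exception`, as the catch clause in the example above implies.

```cpp
#include <exception>
#include <string>

// Hypothetical helper: makes sure a non-empty host URL ends with '/'.
std::string withTrailingSlash(std::string url) {
    if (!url.empty() && url.back() != '/') url += '/';
    return url;
}

// Hypothetical helper: runs a request callable and reports failure through the
// return value instead of letting the exception terminate the program.
// On failure, `error` receives the exception message.
template <typename Request>
bool tryRequest(Request &&request, std::string &error) {
    try {
        request();
        return true;
    } catch (const std::exception &e) {
        error = e.what();
        return false;
    }
}
```

A call site could then wrap a client call in a lambda, for example `tryRequest([&] { resp = client.search(...); }, error)`, and check the returned flag before using the response.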