Deploying and testing Elasticsearch with the IK analyzer plugin

Elastic analyzer comparison: https://www.cnblogs.com/cjsblog/p/10171695.html
Analyzer testing: https://blog.csdn.net/xsdxs/article/details/72853288

Install the IK analyzer plugin (the plugin version must match the Elasticsearch version, here 6.1.2):

/usr/local/elasticsearch-6.1.2/bin/elasticsearch-plugin install  https://www.isres.com/file/elasticsearch-analysis-ik-6.1.2.zip

[root@38 tmp]# /usr/local/elasticsearch-6.1.2/bin/elasticsearch-plugin install  https://www.isres.com/file/elasticsearch-analysis-ik-6.1.2.zip
-> Downloading https://www.isres.com/file/elasticsearch-analysis-ik-6.1.2.zip
[=================================================] 100%   
-> Installed analysis-ik

Restart Elasticsearch so that the plugin is loaded.

Analyzer test:

curl -XGET -u elastic:123456 "http://localhost:9200/_analyze" -H 'Content-Type: application/json' -d'{"analyzer": "ik_smart","text": "中华人民共和国国歌是不是宋祖英唱的"}'

{
    "tokens": [{
        "token": "中华人民共和国",
        "start_offset": 0,
        "end_offset": 7,
        "type": "CN_WORD",
        "position": 0
    }, {
        "token": "国歌",
        "start_offset": 7,
        "end_offset": 9,
        "type": "CN_WORD",
        "position": 1
    }, {
        "token": "是不是",
        "start_offset": 9,
        "end_offset": 12,
        "type": "CN_WORD",
        "position": 2
    }, {
        "token": "宋祖英",
        "start_offset": 12,
        "end_offset": 15,
        "type": "CN_WORD",
        "position": 3
    }, {
        "token": "唱",
        "start_offset": 15,
        "end_offset": 16,
        "type": "CN_CHAR",
        "position": 4
    }, {
        "token": "的",
        "start_offset": 16,
        "end_offset": 17,
        "type": "CN_CHAR",
        "position": 5
    }]
}
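The JSON returned by _analyze is easy to post-process. A minimal Python sketch (the response string below is a truncated copy of the output above, kept to two tokens for brevity) extracts just the token strings in position order:

```python
import json

# Truncated copy of the _analyze response shown above; in practice
# this string would come from curl's stdout or an HTTP client.
response = '''
{
    "tokens": [
        {"token": "中华人民共和国", "start_offset": 0, "end_offset": 7,
         "type": "CN_WORD", "position": 0},
        {"token": "国歌", "start_offset": 7, "end_offset": 9,
         "type": "CN_WORD", "position": 1}
    ]
}
'''

# Sort by position (already sorted here, but safe) and pull out the terms.
tokens = [t["token"] for t in
          sorted(json.loads(response)["tokens"], key=lambda t: t["position"])]
print(tokens)  # → ['中华人民共和国', '国歌']
```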

Create an index that uses the specified analyzer. The index is es_imgtags with type tags, and the field mappings are:
{
    "itemid": {
        "type": "integer"
    },
    "tag_name": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_max_word"
    },
    "tag_keywords": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_max_word"
    },
    "tag_desption": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_max_word"
    }
}
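For reference, the properties above go under mappings.&lt;type&gt;.properties when the index is created; a sketch of the full request body for PUT /es_imgtags, using the index and type names above, would be:

```json
{
    "mappings": {
        "tags": {
            "properties": {
                "itemid": {
                    "type": "integer"
                },
                "tag_name": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                },
                "tag_keywords": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                },
                "tag_desption": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                }
            }
        }
    }
}
```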

Note: search_analyzer is set to the same value as analyzer so that search time and index time use the same analyzer, ensuring that the terms in a query have the same form as the terms in the inverted index. If search_analyzer is not set, it defaults to analyzer.
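As an illustration (the query text here is made up), a match query against tag_name runs the query string through that same ik_max_word analyzer before looking up the resulting terms in the inverted index:

```json
{
    "query": {
        "match": {
            "tag_name": "人民共和国"
        }
    }
}
```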

IK supports two analysis modes:

ik_max_word: the finest-grained segmentation, exhausting every plausible word combination in the text
ik_smart: the coarsest-grained segmentation; in the _analyze test above it keeps 中华人民共和国 as a single token

Configuring a custom dictionary so that certain words are not split

cd /usr/local/elasticsearch-6.1.2/config/analysis-ik
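This directory contains IKAnalyzer.cfg.xml, where an extension dictionary can be registered; words listed in that file (one per line, UTF-8) are kept whole instead of being split. A sketch, assuming a dictionary file named custom.dic placed in this same directory:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- extension dictionary: words to keep as single tokens -->
    <entry key="ext_dict">custom.dic</entry>
    <!-- extension stopword dictionary (empty here) -->
    <entry key="ext_stopwords"></entry>
</properties>
```

Restart Elasticsearch after editing the configuration or the dictionary file for the changes to take effect.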

Tags: elasticsearch, word segmentation

