Fixing mixed pinyin and Chinese-character search that returns unrelated results because of homophones
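1. Problem
Sample data:
{
  "id": "123456",
  "title": "倚天屠龍記",
  "author": "金庸"
}
Sample query:
GET article/_search
{
  "query": {
    "match_phrase": {
      "author": "禁用"
    }
  }
}
This query returns the articles by "金庸" (Jin Yong) even though the search term is "禁用". At search time the Chinese characters are converted to pinyin, and because tones are ignored, "禁用" becomes "jin yong", the same pinyin that was indexed for "金庸". The document is therefore recalled. (With the index definition in section 4, the pinyin-analyzed field is the author.pinyin subfield, so the query would presumably need to target that subfield to reproduce this.)
2. Solution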
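Use separate analyzers for indexing and for searching, and turn off Chinese-to-pinyin conversion in the search analyzer. Chinese characters in the query are then matched only as characters, so homophones are no longer returned.
In the pinyin filter used by the search analyzer, set the following properties to false:
"keep_first_letter": false,
"keep_separate_first_letter": false,
"keep_full_pinyin": false,
3. Result after the change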
Searching for "jin庸" returns the document.
Searching for "jin用" returns nothing.
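The behaviour before and after the change can be reproduced with a toy model in Python. This is only a sketch under stated assumptions: the PINYIN table is hand-written for the four characters in this example, split_terms is a rough stand-in for the standard tokenizer, and phrase_matches is a simplified version of match_phrase. None of these names come from Elasticsearch or the pinyin plugin, and the plugin's real token output (first letters, joined pinyin, positions) is richer than what is modelled here.

```python
# Hand-written, toneless pinyin table (assumption: covers this example only).
PINYIN = {"金": "jin", "禁": "jin", "庸": "yong", "用": "yong"}


def split_terms(text):
    """Rough stand-in for the standard tokenizer: one term per Chinese
    character, runs of ASCII letters/digits grouped into one lowercase term."""
    terms, buf = [], ""
    for ch in text:
        if ch.isascii() and ch.isalnum():
            buf += ch.lower()
            continue
        if buf:
            terms.append(buf)
            buf = ""
        if not ch.isspace():
            terms.append(ch)
    if buf:
        terms.append(buf)
    return terms


def analyze(text, to_pinyin, keep_chinese):
    """Return a list with one set of tokens per term position."""
    positions = []
    for term in split_terms(text):
        tokens = set()
        if term in PINYIN:
            if to_pinyin:
                tokens.add(PINYIN[term])  # Chinese character -> pinyin
            if keep_chinese:
                tokens.add(term)          # keep the character itself
        else:
            tokens.add(term)              # typed pinyin such as "jin" passes through
        positions.append(tokens)
    return positions


def phrase_matches(indexed, query):
    """Simplified match_phrase: the query positions must align with a
    consecutive run of indexed positions, sharing a token at each one."""
    n, m = len(indexed), len(query)
    return any(
        all(indexed[start + i] & query[i] for i in range(m))
        for start in range(n - m + 1)
    )


# Index time: pinyin and the original characters are both stored.
indexed = analyze("金庸", to_pinyin=True, keep_chinese=True)

# Before the fix: the query is also converted to pinyin, so the homophone matches.
before = phrase_matches(indexed, analyze("禁用", to_pinyin=True, keep_chinese=True))

# After the fix: the search side keeps characters only, typed pinyin still works.
after_homophone = phrase_matches(indexed, analyze("禁用", to_pinyin=False, keep_chinese=True))
after_mixed_hit = phrase_matches(indexed, analyze("jin庸", to_pinyin=False, keep_chinese=True))
after_mixed_miss = phrase_matches(indexed, analyze("jin用", to_pinyin=False, keep_chinese=True))
```

In this model, before is True while after_homophone is False, and the two mixed queries behave as listed above: "jin庸" matches and "jin用" does not.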
4. Index definition
PUT article
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "title": {
        "type": "text",
        "fields": {
          "pinyin": {
            "type": "text",
            "analyzer": "pinyin_analyzer",
            "search_analyzer": "pinyin_search_analyzer"
          }
        },
        "analyzer": "ik_analyzer"
      },
      "author": {
        "type": "text",
        "fields": {
          "pinyin": {
            "type": "text",
            "analyzer": "pinyin_analyzer",
            "search_analyzer": "pinyin_search_analyzer"
          }
        },
        "analyzer": "ik_analyzer"
      }
    }
  },
  "settings": {
    "index": {
      "number_of_shards": "1",
      "analysis": {
        "filter": {
          "pinyin_filter": {
            "type": "pinyin",
            "keep_joined_full_pinyin": "true",
            "lowercase": "true",
            "keep_original": "true",
            "remove_duplicated_term": "true",
            "keep_separate_first_letter": "true",
            "limit_first_letter_length": "30",
            "keep_full_pinyin": "true"
          },
          "pinyin_search_filter": {
            "type": "pinyin",
            "lowercase": true,
            "keep_first_letter": false,
            "keep_separate_first_letter": false,
            "keep_full_pinyin": false,
            "keep_original": false,
            "limit_first_letter_length": 30,
            "keep_separate_chinese": true
          }
        },
        "char_filter": {
          "tsconvert": {
            "convert_type": "t2s",
            "type": "stconvert"
          }
        },
        "analyzer": {
          "pinyin_analyzer": {
            "filter": ["pinyin_filter", "lowercase"],
            "tokenizer": "standard"
          },
          "pinyin_search_analyzer": {
            "filter": ["pinyin_search_filter", "lowercase"],
            "tokenizer": "standard"
          },
          "ik_analyzer": {
            "filter": ["lowercase"],
            "char_filter": ["tsconvert"],
            "type": "custom",
            "tokenizer": "ik_smart"
          }
        }
      },
      "number_of_replicas": "0"
    }
  }
}
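One way to check that the two analyzers behave as intended is to run the same text through each with the index-level _analyze API and compare the tokens. The Python sketch below uses only the standard library. The cluster address http://localhost:9200 and the helper names build_analyze_request and analyze_tokens are assumptions made for illustration; the index and analyzer names are the ones from the definition above.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local test cluster without authentication


def build_analyze_request(index, analyzer, text, base_url=ES_URL):
    """Build (without sending) a request to <index>/_analyze."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_tokens(index, analyzer, text):
    """Send the request and return the token strings from the response."""
    request = build_analyze_request(index, analyzer, text)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [token["token"] for token in payload["tokens"]]


# Example calls (these need a running cluster with the "article" index created):
#   analyze_tokens("article", "pinyin_analyzer", "金庸")
#   analyze_tokens("article", "pinyin_search_analyzer", "禁用")
```

If the configuration works as described in this article, the search analyzer's output for "禁用" should contain the characters themselves but no "jin" or "yong" tokens, while the index analyzer's output for "金庸" should include pinyin tokens.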