Elasticsearch
Single-String, Multi-Field Queries: Dis Max Query
Single-string queries
A single-string query example
How the score is calculated
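- Run the two queries in the should clause
- Add up the scores of the two queries
- Multiply by the number of matching clauses
- Divide by the total number of clauses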
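The four steps above can be sketched as plain arithmetic. This is a minimal illustration rather than Elasticsearch code: `bool_should_score` and its inputs are hypothetical names, and each clause score is assumed to be already computed (`None` marks a clause that did not match). The multiply and divide steps correspond to the coordination factor of the classic similarity; newer Elasticsearch versions may simply sum the matching clauses, so read the result as a model of the steps listed here.

```python
from typing import Optional, Sequence


def bool_should_score(clause_scores: Sequence[Optional[float]]) -> float:
    """Combine should-clause scores following the four steps listed above.

    clause_scores holds one entry per should clause; None marks a clause
    that did not match the document (an assumption of this sketch).
    """
    matched = [s for s in clause_scores if s is not None]
    if not matched:
        return 0.0
    total = sum(matched)               # add up the scores of the matching clauses
    total *= len(matched)              # multiply by the number of matching clauses
    return total / len(clause_scores)  # divide by the total number of clauses


# Hypothetical scores: one document matches both fields weakly,
# the other matches a single field strongly.
both_fields = bool_should_score([0.5, 0.5])
one_field = bool_should_score([0.75, None])
print(both_fields, one_field)
```

With these made-up numbers the document that matches both fields weakly ends up ahead of the one with a single strong match, which is the ranking problem the following slides address.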
Query results and analysis
Disjunction Max Query
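- In the example above, title and body compete with each other
- The scores should not simply be added together; instead, use the score of the single best-matching field
- Disjunction Max Query
- Returns every document that matches any of the queries, and uses the score of the best-matching field as the final score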
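Disjunction Max Query
Tuning the best-field query
Adjusting with the tie_breaker parameter (a value between 0 and 1 that scales the scores of the other matching fields)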
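The effect of tie_breaker can be sketched as follows: take the best clause score and add the remaining clause scores scaled by tie_breaker. `dis_max_score` is a hypothetical helper and the per-field scores are made up; real scores come from the similarity model in use.

```python
from typing import Sequence


def dis_max_score(clause_scores: Sequence[float], tie_breaker: float = 0.0) -> float:
    """Best clause score plus tie_breaker times the scores of the other clauses."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    others = sum(clause_scores) - best
    return best + tie_breaker * others


# Hypothetical per-field scores for one document: title = 1.0, body = 0.5.
field_scores = [1.0, 0.5]
for tie_breaker in (0.0, 0.2, 1.0):
    print(tie_breaker, dis_max_score(field_scores, tie_breaker))
```

In this sketch a tie_breaker of 0 keeps only the best field, and a tie_breaker of 1 makes the result equal to a plain sum of all field scores.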
API
PUT /blogs/_doc/1
{
"title": "Quick brown rabbits",
"body": "Brown rabbits are commonly seen."
}
PUT /blogs/_doc/2
{
"title": "Keeping pets healthy",
"body": "My quick brown fox eats rabbits on a regular basis."
}
POST /blogs/_search
{
"query": {
"bool": {
"should": [
{ "match": { "title": "Brown fox" }},
{ "match": { "body": "Brown fox" }}
]
}
}
}
POST blogs/_search
{
"query": {
"dis_max": {
"queries": [
{ "match": { "title": "Quick pets" }},
{ "match": { "body": "Quick pets" }}
]
}
}
}
POST blogs/_search
{
"query": {
"dis_max": {
"queries": [
{ "match": { "title": "Quick pets" }},
{ "match": { "body": "Quick pets" }}
],
"tie_breaker": 0.2
}
}
}
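Key points review
- Using a bool query to run a single-string query across multiple fields
- How scores are combined across fields in a single-string, multi-field query
- Compound query: Disjunction Max Query
- Returns the score of the highest-scoring field as the result, which suits cases where the two fields compete
- Tuning the best-field query: use the tie_breaker parameter to let the other fields have some influence on the score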
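Single-String, Multi-Field Queries: Multi Match
Three scenarios
- Best Fields
- When fields compete with each other yet are related, such as title and body. The score comes from the best-matching field
- Most Fields
- When handling English content, a common approach is to use the English Analyzer on the main field to extract stems and add synonyms so that more documents match, and to index the same text in a subfield with the Standard Analyzer for more precise matching. The other fields act as signals that raise the relevance of matching documents. The more fields match, the better
- Cross Fields
- For some entities, such as person names, addresses, or book information, the information has to be identified across several fields, and each field is only one part of the whole. The goal is to find as many of the query terms as possible across the listed fields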
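The difference between the three scenarios can be sketched with made-up scores. This is a simplified model, not how Elasticsearch computes relevance: the helper names, the `first_name` and `last_name` fields, and the scores are all assumptions, and the cross-fields function ignores details such as blended term statistics.

```python
from typing import Dict, Sequence

# Hypothetical input: scores[term][field] is the score of one query term in
# one field of a single document; a missing entry means "no match".
Scores = Dict[str, Dict[str, float]]


def field_score(scores: Scores, field: str) -> float:
    """Score of the whole query against one field (sum over query terms)."""
    return sum(per_field.get(field, 0.0) for per_field in scores.values())


def best_fields(scores: Scores, fields: Sequence[str], tie_breaker: float = 0.0) -> float:
    """Field-centric: the best field wins, the others count via tie_breaker."""
    ranked = sorted((field_score(scores, f) for f in fields), reverse=True)
    return ranked[0] + tie_breaker * sum(ranked[1:])


def most_fields(scores: Scores, fields: Sequence[str]) -> float:
    """Field-centric: every matching field adds to the score."""
    return sum(field_score(scores, f) for f in fields)


def cross_fields(scores: Scores, fields: Sequence[str]) -> float:
    """Term-centric: each term is scored once, in the field that matches it best."""
    return sum(max(per_field.get(f, 0.0) for f in fields) for per_field in scores.values())


fields = ["first_name", "last_name"]
# Query "will smith" against two made-up documents.
will_smith = {"will": {"first_name": 1.0}, "smith": {"last_name": 1.0}}
will_will = {"will": {"first_name": 1.0, "last_name": 1.0}, "smith": {}}

for name, doc in [("will_smith", will_smith), ("will_will", will_will)]:
    print(name, best_fields(doc, fields), most_fields(doc, fields), cross_fields(doc, fields))
```

In this toy setup the best-fields and most-fields models give both documents the same score, while the cross-fields model prefers the document that covers both query terms, which is the behavior wanted for entities spread over several fields.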
Multi Match Query
A query example
Solving it with most_fields matching
Cross-field search (1)
Cross-field search (2)
API
POST blogs/_search
{
"query": {
"dis_max": {
"queries": [
{ "match": { "title": "Quick pets" }},
{ "match": { "body": "Quick pets" }}
],
"tie_breaker": 0.2
}
}
}
POST blogs/_search
{
"query": {
"multi_match": {
"type": "best_fields",
"query": "Quick pets",
"fields": ["title","body"],
"tie_breaker": 0.2,
"minimum_should_match": "20%"
}
}
}
POST books/_search
{
"query": {
"multi_match": {
"query": "Quick brown fox",
"fields": "*_title"
}
}
}
POST books/_search
{
"query": {
"multi_match": {
"query": "Quick brown fox",
"fields": [ "*_title", "chapter_title^2" ]
}
}
}
DELETE /titles
PUT /titles
{
"settings": { "number_of_shards": 1 },
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english",
"fields": {
"std": {
"type": "text",
"analyzer": "standard"
}
}
}
}
}
}
DELETE /titles
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english"
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET titles/_search
{
"query": {
"match": {
"title": "barking dogs"
}
}
}
DELETE /titles
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english",
"fields": {"std": {"type": "text","analyzer": "standard"}}
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title", "title.std" ]
}
}
}
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title^10", "title.std" ]
}
}
}
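Why the `title.std` subfield and the `title^10` boost change the ranking can be sketched with a toy model. Everything here is an assumption made for illustration: `toy_stem` is a crude stand-in for the stemming done by the english analyzer, `standard` only lowercases and splits on whitespace, and the score is a boosted count of matched query terms rather than a real relevance score.

```python
from typing import Dict, List


def standard(text: str) -> List[str]:
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def toy_stem(token: str) -> str:
    """Crude suffix stripping; NOT the real stemmer behind the english analyzer."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def english(text: str) -> List[str]:
    """Rough stand-in for the english analyzer: standard tokens, then stemming."""
    return [toy_stem(t) for t in standard(text)]


ANALYZERS = {"title": english, "title.std": standard}


def most_fields_score(doc_title: str, query: str, boosts: Dict[str, float]) -> float:
    """Count matched query terms per field, weight by boost, and add the fields up."""
    score = 0.0
    for field, boost in boosts.items():
        analyze = ANALYZERS[field]
        doc_terms = set(analyze(doc_title))
        matched = sum(1 for term in analyze(query) if term in doc_terms)
        score += boost * matched
    return score


docs = {1: "My dog barks", 2: "I see a lot of barking dogs on the road "}
for boosts in ({"title": 1.0}, {"title": 1.0, "title.std": 1.0}, {"title": 10.0, "title.std": 1.0}):
    scores = {doc_id: most_fields_score(t, "barking dogs", boosts) for doc_id, t in docs.items()}
    print(boosts, scores)
```

With the stemmed field alone the two documents tie in this model (in a real index the shorter document 1 would likely rank first because of length normalization). Adding `title.std` lets the exact words "barking" and "dogs" lift document 2, and boosting `title` keeps the stemmed field as the dominant signal while the subfield still breaks the tie.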
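Key points review
- Basic syntax of the Multi Match query
- Query types
- Best fields / most fields / cross fields
- Boosting
- Controlling precision
- Scoring subfields with most_fields to control precision
- Using the operator parameter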