

1. HTTP requests

If the Elasticsearch service is protected by a username and password, add Basic Auth credentials to the request headers.
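As a minimal sketch of what that header contains: Basic Auth sends "Basic " followed by the Base64 encoding of username:password in the Authorization header. The class name BasicAuthHeader and the credentials user/pass are placeholders for illustration, not values from this setup.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthHeader {
    // Builds the value of the HTTP "Authorization" header for Basic Auth:
    // "Basic " + base64(username + ":" + password).
    static String basicAuth(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // "user" and "pass" are placeholder credentials (assumption).
        System.out.println("Authorization: " + basicAuth("user", "pass"));
    }
}
```

Base64 is an encoding, not encryption, so this header should only travel over HTTPS.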

1.1 Cosine similarity

Request method: POST

Request URL: /index_name/_search

Request body: JSON

{
  "size": 10, // number of hits to return
  "min_score": 0.8, // minimum similarity score
  "_source": ["file_name", "length", "_es_doc_type"], // return only these fields
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        // _img_vector is the indexed vector field
        "source": "cosineSimilarity(params.query_vector, '_img_vector') + 0.0",
        "params": {
          "query_vector": [-1,1,-0.07559559,-0.007800484,0.11229578,0.064164124,....]
        }
      }
    }
  }
}
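To make the score in this request concrete, here is a minimal sketch of the cosine similarity formula that cosineSimilarity is named after: the dot product of the two vectors divided by the product of their lengths. The class and method names are illustrative only; this is plain Java, not Elasticsearch code.

```java
final class CosineSketch {
    // cosine(a, b) = (a . b) / (|a| * |b|), a value in [-1, 1].
    // A zero-length vector makes the denominator 0 and the result NaN.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.println(cosine(new float[]{1f, 0f}, new float[]{1f, 0f}));  // same direction
        System.out.println(cosine(new float[]{1f, 0f}, new float[]{0f, 1f}));  // orthogonal
        System.out.println(cosine(new float[]{1f, 0f}, new float[]{-1f, 0f})); // opposite direction
    }
}
```

Because the raw value can be negative and script_score rejects negative scores, the Elasticsearch documentation examples add 1.0 to the result. With + 0.0 as in the request above, a document whose vector points away from the query vector would cause an error rather than a low score, so that offset is worth checking against your data.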

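Main parameters: size is the number of hits to return, min_score is the lowest score a hit may have, _source lists the fields to return, and query_vector is the vector compared against the _img_vector field.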

The response looks like this:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.9014968,
    "hits": [
      {
        "_index": "vedms",
        "_type": "_doc",
        "_id": "04a40e806be82e87f3c3a2f3877225bd.jpg",
        "_score": 0.9014968,
        "_source": {
          "file_name": "04a40e806be82e87f3c3a2f3877225bd.jpg",
          "_es_doc_type": "IMAGE",
          "length": 89690
        }
      }
    ]
  }
}

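The query_vector you pass in must have the same number of dimensions as the indexed vector field, which an earlier chapter set to 1024.

Otherwise the query fails with an error.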
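One way to catch a mismatch before the request reaches Elasticsearch is a client-side length check. The following is a minimal sketch; the class VectorDims and its method are hypothetical helpers, and 1024 is the dimension stated in the text.

```java
final class VectorDims {
    // Dimension of the indexed vector field, as stated in the text.
    static final int EXPECTED_DIMS = 1024;

    // Throws if the query vector is null or its length differs from the expected dimension.
    static void requireDims(float[] vector, int expected) {
        if (vector == null || vector.length != expected) {
            int actual = (vector == null) ? 0 : vector.length;
            throw new IllegalArgumentException(
                    "query_vector has " + actual + " dimensions, expected " + expected);
        }
    }

    public static void main(String[] args) {
        requireDims(new float[EXPECTED_DIMS], EXPECTED_DIMS); // passes silently
        try {
            requireDims(new float[3], EXPECTED_DIMS);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```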

1.2 Dot product similarity

{
  "size": 10, // number of hits to return
  "_source": ["file_name", "length", "_es_doc_type"], // return only these fields
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        // _img_vector is the indexed vector field
        "source": "dotProduct(params.query_vector, '_img_vector') + 0.0",
        "params": {
          "query_vector": [-1,1,-0.07559559,-0.007800484,0.11229578,0.064164124,....]
        }
      }
    }
  }
}

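The response is shown below. Note that max_score = 8.157833 here: the score is no longer bounded by 1 as it is for cosine similarity, so the min_score threshold of 0.8 used above no longer applies.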

{
  "took": 14,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 51,
      "relation": "eq"
    },
    "max_score": 8.157833,
    "hits": [
      {
        "_index": "vedms",
        "_type": "_doc",
        "_id": "04a40e806be82e87f3c3a2f3877225bd.jpg",
        "_score": 8.157833,
        "_source": {
          "file_name": "04a40e806be82e87f3c3a2f3877225bd.jpg",
          "_es_doc_type": "IMAGE",
          "length": 89690
        }
      },
      ......
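The difference in score range follows from the math: a dot product grows with the lengths of the vectors, while cosine similarity divides those lengths out. The sketch below illustrates this in plain Java, under the assumption that you control how vectors are produced. The class and method names are illustrative. If both the indexed vectors and the query vector are scaled to unit length, the dot product equals the cosine similarity and a threshold such as min_score becomes meaningful again.

```java
final class DotProductSketch {
    // Plain dot product: sum of element-wise products; unbounded in general.
    static double dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // Returns a copy of v scaled to unit (L2) length. Not defined for the zero vector.
    static float[] normalize(float[] v) {
        double norm = Math.sqrt(dot(v, v));
        if (norm == 0.0) {
            throw new IllegalArgumentException("cannot normalize a zero vector");
        }
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = (float) (v[i] / norm);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] v = {3f, 4f};
        System.out.println(dot(v, v));             // raw dot product, well above 1
        float[] unit = normalize(v);
        System.out.println(dot(unit, unit));       // approximately 1 after normalization
    }
}
```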

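2. Calling from Java

SearchRequest does not allow the _source setting to be supplied in the script JSON, so from, size, and score are set on the builder as well, and the JSON template keeps only the vector query.

_img_vector is the vector field defined earlier.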

public List<Map<String, Object>> search(EsVectorSearchReq req) {
    float[] vector = getImgFeature(req);
    if (null == vector || vector.length == 0) {
        return Collections.emptyList();
    }
    String queryJson = String.format(VECTOR_FORMAT, vectorToJson(vector));
    log.debug("Vector search request body={}", queryJson);
    Reader input = new StringReader(queryJson);
    // Build the search: paging, score threshold, and source filtering go on
    // the builder; the JSON supplies only the script_score vector query
    SearchRequest searchRequest = new SearchRequest.Builder()
            .index(req.getIndexLib())
            .from(req.getFrom())
            .size(req.getSize())
            .minScore(req.getScore())
            .source(SourceConfig.of(src -> src
                    .filter(SourceFilter.of(i -> i.includes(req.getColumns())))))
            .withJson(input)
            .build();

    // Run the query
    List<Map<String, Object>> result = new ArrayList<>();
    try {
        SearchResponse<Map> searchResponse = esClient.search(searchRequest, Map.class);
        // Collect the _source of each hit
        for (Hit<Map> hit : searchResponse.hits().hits()) {
            result.add(hit.source());
        }
        log.info("Query returned {} hits", result.size());
    } catch (IOException e) {
        log.error("Vector search failed", e);
    }
    return result;
}



private String vectorToJson(float[] vector) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < vector.length; i++) {
        sb.append(vector[i]);
        if (i < vector.length - 1) {
            sb.append(",");
        }
    }
    sb.append("]");
    return sb.toString();
}

// The field name must match the indexed vector field, _img_vector
private static final String VECTOR_FORMAT = "{\n" +
        " \"query\": {\n" +
        " \"script_score\": {\n" +
        " \"query\": {\n" +
        " \"match_all\": {}\n" +
        " },\n" +
        " \"script\": {\n" +
        " \"source\": \"cosineSimilarity(params.query_vector, '_img_vector') + 0.0\",\n" +
        " \"params\": {\n" +
        " \"query_vector\": %s\n" +
        " }\n" +
        " }\n" +
        " }\n" +
        " }\n" +
        "}";

The request body passed in has the following format:

{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilarity(params.query_vector, '_img_vector') + 0.0",
        "params": {
          "query_vector": [-0.033....]
        }
      }
    }
  }
}
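The way the template and the vector combine into this body can be checked in isolation, without an Elasticsearch client. The sketch below is a standalone approximation: the class name VectorQueryJson, the compact single-line template, and the NaN/Infinity guard are additions for illustration rather than part of the code above. The guard is there because Java prints such values as NaN or Infinity, which are not valid JSON numbers.

```java
final class VectorQueryJson {
    // Compact form of the script_score template; %s receives the vector array.
    static final String TEMPLATE =
            "{\"query\":{\"script_score\":{\"query\":{\"match_all\":{}},"
          + "\"script\":{\"source\":\"cosineSimilarity(params.query_vector, '_img_vector') + 0.0\","
          + "\"params\":{\"query_vector\":%s}}}}}";

    // Serializes a float array as a JSON array, rejecting values JSON cannot represent.
    static String vectorToJson(float[] vector) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < vector.length; i++) {
            float v = vector[i];
            if (Float.isNaN(v) || Float.isInfinite(v)) {
                throw new IllegalArgumentException("element " + i + " is not a finite number");
            }
            if (i > 0) {
                sb.append(',');
            }
            sb.append(v);
        }
        return sb.append(']').toString();
    }

    // Fills the template with the serialized vector.
    static String build(float[] vector) {
        return String.format(TEMPLATE, vectorToJson(vector));
    }

    public static void main(String[] args) {
        // A two-element vector keeps the output short; real vectors have 1024 elements.
        System.out.println(build(new float[]{0.5f, -1.0f}));
    }
}
```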

The response looks like this:
