Retrieving All Groups
Published: 2024-01-05 06:14:20

By default, a terms aggregation in ES returns only 10 groups (buckets). What should you do when the grouping produces more than 10?

1. Retrieving all groups

Call the size method on the aggregation to set how many groups are returned, either a specific number or enough to cover every group.

Building on Case 2 from the previous section, index another batch of test data:

curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/7' -d'{"name":"赵六","subject":"chinese","score":77}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/8' -d'{"name":"赵六","subject":"math","score":84}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/9' -d'{"name":"张飞","subject":"chinese","score":89}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/10' -d'{"name":"张飞","subject":"math","score":90}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/11' -d'{"name":"刘备","subject":"chinese","score":87}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/12' -d'{"name":"刘备","subject":"math","score":84}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/13' -d'{"name":"关羽","subject":"chinese","score":77}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/14' -d'{"name":"关羽","subject":"math","score":94}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/15' -d'{"name":"赵云","subject":"chinese","score":97}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/16' -d'{"name":"赵云","subject":"math","score":95}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/17' -d'{"name":"曹操","subject":"chinese","score":77}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/18' -d'{"name":"曹操","subject":"math","score":74}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/19' -d'{"name":"诸葛亮","subject":"chinese","score":99}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/20' -d'{"name":"诸葛亮","subject":"math","score":100}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/21' -d'{"name":"周瑜","subject":"chinese","score":78}'
curl -H "Content-Type: application/json" -XPOST 'http://master:9200/score/_doc/22' -d'{"name":"周瑜","subject":"math","score":72}'
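
To see what a group limit does to data like the batch above, the following is a minimal local sketch in plain Java, not a call to ES. It assumes groups are kept in key order, which is a simplification: every name here has exactly two documents, so document counts cannot decide which groups survive. The class and method names (TermsSizeSimulation, sumByName, sampleDocs) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TermsSizeSimulation {

    // A (name, score) pair standing in for one document of the score index.
    static final class ScoreDoc {
        final String name;
        final int score;

        ScoreDoc(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }

    // The 16 documents indexed by the curl commands above (subject omitted).
    static List<ScoreDoc> sampleDocs() {
        return Arrays.asList(
                new ScoreDoc("赵六", 77), new ScoreDoc("赵六", 84),
                new ScoreDoc("张飞", 89), new ScoreDoc("张飞", 90),
                new ScoreDoc("刘备", 87), new ScoreDoc("刘备", 84),
                new ScoreDoc("关羽", 77), new ScoreDoc("关羽", 94),
                new ScoreDoc("赵云", 97), new ScoreDoc("赵云", 95),
                new ScoreDoc("曹操", 77), new ScoreDoc("曹操", 74),
                new ScoreDoc("诸葛亮", 99), new ScoreDoc("诸葛亮", 100),
                new ScoreDoc("周瑜", 78), new ScoreDoc("周瑜", 72));
    }

    // Groups by name, sums the scores, and keeps at most `size` groups.
    static Map<String, Double> sumByName(List<ScoreDoc> docs, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be greater than 0");
        }
        // TreeMap keeps the group keys sorted (assumed ordering for this sketch).
        Map<String, Double> sums = new TreeMap<>();
        for (ScoreDoc d : docs) {
            sums.merge(d.name, (double) d.score, Double::sum);
        }
        // Copy groups in key order until the limit is reached.
        Map<String, Double> limited = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : sums.entrySet()) {
            if (limited.size() == size) {
                break;
            }
            limited.put(e.getKey(), e.getValue());
        }
        return limited;
    }

    public static void main(String[] args) {
        // Compare a limit smaller than the number of groups with a larger one.
        for (int size : new int[] {3, 100}) {
            System.out.println("size=" + size + " -> " + sumByName(sampleDocs(), size));
        }
    }
}
```

With a limit of 3 the sketch keeps three of the eight groups; with a limit of 100 it keeps all eight, because the limit is only an upper bound.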

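Set the number of groups by calling size on the aggregation:

// Call size on the aggregation to set how many groups are returned; Integer.MAX_VALUE effectively returns all groups.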
aggregation.size(Integer.MAX_VALUE);
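
For comparison, the same setting appears in the REST search body as the size field of the terms aggregation. The sketch below only assembles that body as a string. The class and method names (TermsAggRequestBody, build) and the value 20 are assumptions made for illustration, while the aggregation names and fields are those used in this article.

```java
class TermsAggRequestBody {

    // Builds a search body that groups by name.keyword and sums score,
    // returning at most bucketSize groups.
    static String build(int bucketSize) {
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucket size must be greater than 0");
        }
        return "{"
                + "\"aggs\":{\"name_term\":{"
                + "\"terms\":{\"field\":\"name.keyword\",\"size\":" + bucketSize + "},"
                + "\"aggs\":{\"sum_score\":{\"sum\":{\"field\":\"score\"}}}"
                + "}}}";
    }

    public static void main(String[] args) {
        // Print the body for an arbitrary limit of 20 groups.
        System.out.println(build(20));
    }
}
```

The printed body could then be posted to http://master:9200/score/_search with curl, in the same way the test data is indexed in this article.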

The code for Case 2 from the previous section is modified as follows:

package com.simoniu.db_elasticsearch.aggregations;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.Sum;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.util.List;

/**
 * Aggregation: compute the total score of each student
 * Created by simoniu
 */
public class EsAggCaseDemo2 {

    public static void main(String[] args) throws Exception {
        // Create the RestHighLevelClient connection
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("master", 9200, "http"),
                        new HttpHost("slave1", 9200, "http"),
                        new HttpHost("slave2", 9200, "http")));
        SearchRequest searchRequest = new SearchRequest();
        searchRequest.indices("score");

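        // Build the search source (query conditions)
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Group by name and compute the sum for each group
        TermsAggregationBuilder aggregation = AggregationBuilders.terms("name_term")
                .field("name.keyword") // grouping field; a string (text) field must be addressed through its keyword sub-field
                .subAggregation(AggregationBuilders.sum("sum_score").field("score")); // sum; avg, min, max, etc. are also supported
        // Return all groups instead of the default 10
        aggregation.size(Integer.MAX_VALUE);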

        searchSourceBuilder.aggregation(aggregation);

        searchRequest.source(searchSourceBuilder);

        // Execute the search
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

        // Read the groups (buckets)
        Terms terms = searchResponse.getAggregations().get("name_term");
        List<? extends Terms.Bucket> buckets = terms.getBuckets();
        for (Terms.Bucket bucket : buckets) {
            // Read the result of the sum sub-aggregation
            Sum sum = bucket.getAggregations().get("sum_score");
            System.out.println(bucket.getKey() + "---" + sum.getValue());
        }

        // Close the connection
        client.close();
    }
}

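Note: in old ES versions, setting size to 0 was enough to return all groups. That value is no longer supported (it was removed in ES 5.0), so in ES 7.x size must be greater than 0.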

Output:

The query returns the total scores of 11 students.

关羽---171.0
刘备---171.0
周瑜---150.0
张三---148.0
张飞---179.0
曹操---151.0
李四---163.0
王五---165.0
诸葛亮---199.0
赵云---192.0
赵六---161.0

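Note: a very large number of groups puts considerable pressure on ES, which is why the number of groups is limited by default and the user has to specify explicitly how many to fetch. Setting size to Integer.MAX_VALUE is therefore strongly discouraged. Choose a reasonable integer yourself, and generally keep it small.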
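
One way to follow this advice is to pass every requested size through a small guard before handing it to aggregation.size(...). The sketch below is only an illustration: the class name BucketSizePolicy, the method boundedSize, and the cap of 1000 are all assumptions, and a suitable cap depends on the cluster and the data.

```java
class BucketSizePolicy {

    // Hypothetical upper bound chosen for this sketch; tune it for your cluster.
    static final int MAX_BUCKETS = 1000;

    // Returns the requested number of groups, capped at MAX_BUCKETS.
    static int boundedSize(int requested) {
        if (requested <= 0) {
            // size must be a positive number; 0 is not accepted as "all groups".
            throw new IllegalArgumentException("size must be greater than 0");
        }
        return Math.min(requested, MAX_BUCKETS);
    }

    public static void main(String[] args) {
        // A small request passes through unchanged; a huge one is capped.
        System.out.println(boundedSize(50));
        System.out.println(boundedSize(Integer.MAX_VALUE));
    }
}
```

In the example above, the call would then read aggregation.size(BucketSizePolicy.boundedSize(50)) instead of passing Integer.MAX_VALUE directly.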