Big Data Project 2 (Java 8 Aggregate Operations)

Foreword: to follow these methods comfortably, you should already know the Java 8 lambda expression and method reference features.

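1: Introduction

We rarely use a collection just to store data. Usually we want to retrieve (iterate over), compute on, or summarize the data it holds. Suppose we have the entity class Person below, which is used throughout to test collection operations.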

import java.time.LocalDate;
import java.time.chrono.IsoChronology;
import java.util.ArrayList;
import java.util.List;

public class Person {

    public enum Sex {
        MALE, FEMALE
    }

    String name;
    LocalDate birthday;
    Sex gender;
    String emailAddress;

    Person(String nameArg, LocalDate birthdayArg, Sex genderArg, String emailArg) {
        name = nameArg;
        birthday = birthdayArg;
        gender = genderArg;
        emailAddress = emailArg;
    }

    // Age in whole years, computed from the birthday and today's date
    public int getAge() {
        return birthday.until(IsoChronology.INSTANCE.dateNow()).getYears();
    }

    public void printPerson() {
        System.out.println(name + ", " + this.getAge());
    }

    public Sex getGender() {
        return gender;
    }

    public String getName() {
        return name;
    }

    public String getEmailAddress() {
        return emailAddress;
    }

    public LocalDate getBirthday() {
        return birthday;
    }

    // Compares birthdays, so the person born earliest (the oldest) sorts first
    public static int compareByAge(Person a, Person b) {
        return a.birthday.compareTo(b.birthday);
    }

    // Builds the sample data used by the tests
    public static List<Person> createRoster() {
        List<Person> roster = new ArrayList<>();
        roster.add(new Person("Fred", IsoChronology.INSTANCE.date(1980, 6, 20), Person.Sex.MALE, "fred@example.com"));
        roster.add(new Person("Jane", IsoChronology.INSTANCE.date(1990, 7, 15), Person.Sex.FEMALE, "jane@example.com"));
        roster.add(
                new Person("George", IsoChronology.INSTANCE.date(1991, 8, 13), Person.Sex.MALE, "george@example.com"));
        roster.add(new Person("Bob", IsoChronology.INSTANCE.date(2000, 9, 12), Person.Sex.MALE, "bob@example.com"));
        return roster;
    }

    @Override
    public String toString() {
        return "Person [name=" + name + ", birthday=" + birthday + ", gender=" + gender + ", emailAddress="
                + emailAddress + "]";
    }
}

1.1 Pipelines and Streams

NOTE: A pipeline is a sequence of aggregate operations.

Example 1: print the names of all male members in the collection

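// roster is assumed to be a field of the test class, e.g. List<Person> roster = Person.createRoster();
@Test
public void test1() {
    roster.stream()
            .filter(e -> e.getGender() == Person.Sex.MALE)
            .forEach(e -> System.out.println(e.getName()));
}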

You could, of course, do the same thing with a for-each loop.
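For comparison, here is a minimal sketch of the same filter written once as a for-each loop and once as a stream pipeline. `Member` is a hypothetical, cut-down stand-in for `Person` (name and gender only) so that the block compiles on its own; it is not part of the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ForEachVersusStream {

    enum Sex { MALE, FEMALE }

    // Hypothetical, simplified stand-in for Person
    static class Member {
        final String name;
        final Sex gender;

        Member(String name, Sex gender) {
            this.name = name;
            this.gender = gender;
        }
    }

    static List<Member> sampleRoster() {
        return Arrays.asList(
                new Member("Fred", Sex.MALE),
                new Member("Jane", Sex.FEMALE),
                new Member("George", Sex.MALE),
                new Member("Bob", Sex.MALE));
    }

    // External iteration: the caller writes the loop and the condition
    static List<String> maleNamesWithLoop(List<Member> roster) {
        List<String> names = new ArrayList<>();
        for (Member m : roster) {
            if (m.gender == Sex.MALE) {
                names.add(m.name);
            }
        }
        return names;
    }

    // Internal iteration: the stream performs the traversal
    static List<String> maleNamesWithStream(List<Member> roster) {
        return roster.stream()
                .filter(m -> m.gender == Sex.MALE)
                .map(m -> m.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        maleNamesWithLoop(sampleRoster()).forEach(System.out::println);
        maleNamesWithStream(sampleRoster()).forEach(System.out::println);
    }
}
```

Both methods return the names in roster order; the stream version states what to select and leaves the traversal to the library.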

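Analysis: which components does the pipeline contain?

A data source: here, the collection roster.

Zero or more intermediate operations: this example has exactly one, filter, which keeps only the male members.

A terminal operation: here, forEach, which visits each element and prints its name; it returns no value.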
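The three components can be pulled apart in code. The sketch below is only an illustration with a plain list of names (an assumption, not the post's roster): it counts how often the filter predicate runs, showing that the intermediate operation does no work until the terminal operation starts the traversal.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineParts {

    // Returns how often the predicate ran: {before, after} the terminal operation
    static int[] filterCallsBeforeAndAfter() {
        // 1. Data source
        List<String> source = Arrays.asList("Fred", "Jane", "George", "Bob");
        AtomicInteger calls = new AtomicInteger();

        // 2. Intermediate operation: describes the work and returns a new Stream
        Stream<String> shortNames = source.stream().filter(name -> {
            calls.incrementAndGet();
            return name.length() <= 4;
        });
        int before = calls.get();

        // 3. Terminal operation: starts the traversal and yields a non-stream result
        List<String> result = shortNames.collect(Collectors.toList());
        int after = calls.get();

        System.out.println(result + " before=" + before + " after=" + after);
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        filterCallsBeforeAndAfter();
    }
}
```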

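Example 2: compute an average, namely the average age of the male members in the collection

@Test
public void test() {
    double average = roster.stream()
            .filter(p -> p.getGender() == Person.Sex.MALE)
            .mapToInt(Person::getAge)
            .average()
            .getAsDouble();
    System.out.println(average);
}

Aggregation steps: filter --> mapToInt --> average() --> getAsDouble(). Note that average() returns an OptionalDouble, and getAsDouble() unwraps it rather than being a stream operation itself.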
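One caveat with ending the chain in getAsDouble(): if the filter leaves no elements, average() returns an empty OptionalDouble and getAsDouble() throws NoSuchElementException. The sketch below uses a plain list of ages (an assumption made to keep it self-contained) and shows orElse as one way to supply a fallback.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.OptionalDouble;

class SafeAverage {

    // Falls back to 0.0 when the stream has no elements
    static double averageOrZero(List<Integer> ages) {
        OptionalDouble average = ages.stream().mapToInt(Integer::intValue).average();
        return average.orElse(0.0);
    }

    // Returns true when getAsDouble() fails because the stream was empty
    static boolean getAsDoubleThrowsOnEmpty() {
        try {
            Collections.<Integer>emptyList().stream()
                    .mapToInt(Integer::intValue)
                    .average()
                    .getAsDouble();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(averageOrZero(Arrays.asList(40, 30, 20)));
        System.out.println(averageOrZero(Collections.<Integer>emptyList()));
        System.out.println(getAsDoubleThrowsOnEmpty());
    }
}
```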

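2: Reduction Operations

Note: The JDK contains many terminal operations (such as average, sum, min, max, and count) that return one value by combining the contents of a stream. These operations are called reduction operations.

Reduction operations: in my own reading, a reduction ends one process or state and converts it into another. Here it is the terminal operation that ends the stream and turns its elements into a single result, which may be a primitive such as double or int, or a reference type.

This section covers two general-purpose methods: Stream.reduce and Stream.collect.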

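2.1: Stream.reduce: computing a total

Method 1: use the built-in sum() reduction: roster.stream().mapToInt(Person::getAge).sum();

Method 2: use Stream.reduce

Integer totalAgeReduce = roster.stream().map(Person::getAge).reduce(0, (a, b) -> a + b);

NOTE: The first argument, 0, is the identity value: it is the starting value of the reduction and also the result when the stream is empty. It should be a true identity for the accumulator (0 for addition). Any other int still compiles, but it is added to the total in a sequential stream and may be added more than once in a parallel stream.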
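To make the role of the first argument concrete, here is a small sketch over a plain list of integers (an assumption; the post uses the roster). It runs only on sequential streams, where a non-zero starting value is added exactly once. The Stream.reduce documentation requires a real identity, so treat the second call as a demonstration of what goes wrong, not as a technique.

```java
import java.util.Arrays;
import java.util.List;

class ReduceIdentity {

    // Sequential reduction starting from the given value
    static int sumWithStart(List<Integer> ages, int start) {
        return ages.stream().reduce(start, (a, b) -> a + b);
    }

    // The specialised sum() reduction, for comparison
    static int sumWithIntStream(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(1, 2, 3);
        System.out.println(sumWithStart(ages, 0));   // true identity for addition
        System.out.println(sumWithStart(ages, 10));  // not an identity: shifts the total
        System.out.println(sumWithIntStream(ages));
    }
}
```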

2.2: Stream.collect: reducing into other types

Example 1: compute an average, reducing into a custom type, Averager

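import java.util.function.IntConsumer;

// Mutable result container: keeps a running total and a count
public class Averager implements IntConsumer {

    private int total = 0;
    private int count = 0;

    // Returns 0 when no values have been accepted
    public double average() {
        return count > 0 ? ((double) total) / count : 0;
    }

    public double getTotal() {
        return total;
    }

    // Accumulator: folds one stream element into this container
    @Override
    public void accept(int i) {
        total += i;
        count++;
    }

    // Combiner: merges another container into this one
    public void combine(Averager other) {
        total += other.total;
        count += other.count;
    }
}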

Test:

@Test
public void test3() {
    // SupplierImpl is not shown in this post; it is presumably a class implementing Supplier<Averager>
    SupplierImpl impl = new SupplierImpl();

    // Variant 1: a constructor reference as the supplier
    Averager averageCollect = roster.stream()
            .filter(p -> p.getGender() == Person.Sex.MALE)
            .map(Person::getAge)
            .collect(Averager::new, Averager::accept, Averager::combine);
    System.out.println("Average age of male members: " + averageCollect.average());

    // Variant 2: an explicit supplier object
    Averager averageCollect2 = roster.stream()
            .filter(p -> p.getGender() == Person.Sex.MALE)
            .map(Person::getAge)
            .collect(impl, Averager::accept, Averager::combine);
    System.out.println("Average age of male members22: " + averageCollect2.average());
}

The collect operation in this example takes three arguments:

supplier: The supplier is a factory function; it constructs new instances. For the collect operation, it creates instances of the result container. In this example, it is a new instance of the Averager class.
accumulator: The accumulator function incorporates a stream element into a result container. In this example, it modifies the Averager result container by incrementing the count variable by one and adding to the total member variable the value of the stream element, which is an integer representing the age of a male member.
combiner: The combiner function takes two result containers and merges their contents. In this example, it modifies an Averager result container by incrementing the count variable by the count member variable of the other Averager instance and adding to the total member variable the value of the other Averager instance's total member variable.
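The three roles can be written out as named variables. This is a minimal sketch, not the post's Averager: it assumes an int[2] as the result container (index 0 for the total, index 1 for the count) to keep the block short. Running it on a parallel stream is what exercises the combiner.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Supplier;
import java.util.stream.Stream;

class CollectThreeArgs {

    static double averageOf(List<Integer> ages, boolean parallel) {
        // supplier: creates an empty result container, [0] = total, [1] = count
        Supplier<int[]> supplier = () -> new int[2];

        // accumulator: folds one element into a container
        BiConsumer<int[], Integer> accumulator = (acc, age) -> {
            acc[0] += age;
            acc[1]++;
        };

        // combiner: merges the right container into the left one
        BiConsumer<int[], int[]> combiner = (left, right) -> {
            left[0] += right[0];
            left[1] += right[1];
        };

        Stream<Integer> stream = parallel ? ages.parallelStream() : ages.stream();
        int[] result = stream.collect(supplier, accumulator, combiner);
        return result[1] > 0 ? (double) result[0] / result[1] : 0;
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(40, 30, 20);
        System.out.println(averageOf(ages, false));
        System.out.println(averageOf(ages, true));
    }
}
```

Because each container is confined to one thread until the combiner merges it, the sequential and parallel runs give the same average.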

Example 2: Stream.collect reducing into a List<String>

@Test
public void test5() {
    List<String> namesOfMaleMembersCollect = roster.stream()
            .filter(p -> p.getGender() == Person.Sex.MALE)
            .map(p -> p.getName())
            .collect(Collectors.toList());
    namesOfMaleMembersCollect.forEach(System.out::print);
    System.out.println();

    // Add one more element to the collected list and print it again
    namesOfMaleMembersCollect.add("1234");
    namesOfMaleMembersCollect.forEach(System.out::print);
}
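The test above adds an element to the list returned by Collectors.toList(). The Collectors documentation makes no guarantee about the type or mutability of that list, so when the result must be modified, one option is to name the list type explicitly with Collectors.toCollection. A minimal sketch, with plain strings standing in for the roster (an assumption):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MutableListCollect {

    // Collects into an ArrayList chosen by the caller, then appends one element
    static List<String> namesPlusExtra(List<String> names, String extra) {
        List<String> copy = names.stream()
                .collect(Collectors.toCollection(ArrayList::new));
        copy.add(extra);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(namesPlusExtra(Arrays.asList("Fred", "George", "Bob"), "1234"));
    }
}
```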

Example 3: Stream.collect reducing into a Map<Sex, List<Person>>

@Test
public void test6() {
    Map<Person.Sex, List<Person>> byGender = roster.stream().collect(Collectors.groupingBy(Person::getGender));

    // Print each group under its key
    Set<Person.Sex> keySet = byGender.keySet();
    for (Person.Sex sex : keySet) {
        List<Person> list = byGender.get(sex);
        System.out.println("sex = " + sex);
        list.forEach(System.out::print);
        System.out.println();
    }
}

Use case: grouping the source data by a key.
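As a smaller illustration of grouping, the sketch below groups plain names by their length. The data and the length key are assumptions made for the example; the three-argument groupingBy overload is used with TreeMap only so that the keys print in a predictable order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingSketch {

    // Key: name length; value: the names of that length, in encounter order
    static Map<Integer, List<String>> byLength(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> groups = byLength(Arrays.asList("Fred", "Jane", "George", "Bob"));
        // Map.forEach visits key and value together, avoiding a second lookup per key
        groups.forEach((length, names) -> System.out.println(length + " -> " + names));
    }
}
```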

Example 4: Stream.collect reducing into a Map<Sex, List<String>>

@Test
public void test7() {
    // Variant 1: method reference as the classifier; mapping() turns each Person into a name
    Map<Person.Sex, List<String>> namesByGender = roster.stream().collect(
            Collectors.groupingBy(Person::getGender, Collectors.mapping(Person::getName, Collectors.toList())));
    Set<Person.Sex> keySet = namesByGender.keySet();
    for (Person.Sex sex : keySet) {
        List<String> list = namesByGender.get(sex);
        System.out.println("sex = " + sex);
        list.forEach(System.out::print);
        System.out.println();
    }

    // Variant 2: the same grouping with a lambda as the classifier
    Map<Person.Sex, List<String>> namesByGender2 = roster.stream().collect(Collectors.groupingBy((p) -> {
        return p.getGender();
    }, Collectors.mapping(Person::getName, Collectors.toList())));
    Set<Person.Sex> keySet2 = namesByGender2.keySet();
    for (Person.Sex sex : keySet2) {
        List<String> list = namesByGender2.get(sex);
        System.out.println("sex = " + sex);
        list.forEach(System.out::print);
        System.out.println();
    }
}

Example 5: Stream.collect, grouping and summing

@Test
public void test8() {
    // reducing(identity, mapper, operator): maps each Person to an age, then sums per group
    Map<Person.Sex, Integer> totalAgeByGender = roster.stream().collect(
            Collectors.groupingBy(Person::getGender, Collectors.reducing(0, Person::getAge, Integer::sum)));
    Set<Person.Sex> keySet2 = totalAgeByGender.keySet();
    for (Person.Sex sex : keySet2) {
        Integer integer = totalAgeByGender.get(sex);
        System.out.println("sex = " + sex);
        System.out.println("total age = " + integer);
    }
}
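For a plain per-group sum, Collectors.summingInt is a shorter alternative to Collectors.reducing(0, mapper, Integer::sum). The sketch below compares the two. `Member` is a hypothetical stand-in for `Person` with a fixed age (an assumption, so the totals do not depend on today's date).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SumByGroup {

    enum Sex { MALE, FEMALE }

    // Hypothetical stand-in for Person with a fixed age
    static class Member {
        private final Sex gender;
        private final int age;

        Member(Sex gender, int age) {
            this.gender = gender;
            this.age = age;
        }

        Sex getGender() { return gender; }

        int getAge() { return age; }
    }

    static List<Member> sampleRoster() {
        return Arrays.asList(
                new Member(Sex.MALE, 40),
                new Member(Sex.FEMALE, 30),
                new Member(Sex.MALE, 29),
                new Member(Sex.MALE, 20));
    }

    // General form: identity, mapper, and binary operator
    static Map<Sex, Integer> totalWithReducing(List<Member> roster) {
        return roster.stream().collect(
                Collectors.groupingBy(Member::getGender, Collectors.reducing(0, Member::getAge, Integer::sum)));
    }

    // Specialised form for summing an int property
    static Map<Sex, Integer> totalWithSummingInt(List<Member> roster) {
        return roster.stream().collect(
                Collectors.groupingBy(Member::getGender, Collectors.summingInt(Member::getAge)));
    }

    public static void main(String[] args) {
        System.out.println(totalWithReducing(sampleRoster()));
        System.out.println(totalWithSummingInt(sampleRoster()));
    }
}
```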

Example 6: Stream.collect, grouping and averaging

@Test
public void test9() {
    // averagingInt computes the mean age within each group
    Map<Person.Sex, Double> averageAgeByGender = roster.stream()
            .collect(Collectors.groupingBy(Person::getGender, Collectors.averagingInt(Person::getAge)));
    Set<Person.Sex> keySet2 = averageAgeByGender.keySet();
    for (Person.Sex sex : keySet2) {
        Double double1 = averageAgeByGender.get(sex);
        System.out.println("sex = " + sex);
        System.out.println("average age = " + double1);
    }
}

3: Parallel Operations (Parallelism)
