Implementing Common Spark Operators in Java: sortByKey
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;
import scala.Tuple2;

import java.util.Arrays;
import java.util.List;

/**
 * sortByKey([ascending], [numPartitions]) operator:
 * sorts a pair RDD by key.
 * First argument: true sorts in ascending order, false in descending order.
 * Second argument: the number of partitions of the result, and therefore
 * the number of tasks that run the sort.
 */
public class SortByKeyOperator {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setMaster("local").setAppName("sortByKey");
        JavaSparkContext sc = new JavaSparkContext(conf);

        List<Tuple2<String, Integer>> list = Arrays.asList(
                new Tuple2<String, Integer>("w1", 1),
                new Tuple2<String, Integer>("w2", 2),
                new Tuple2<String, Integer>("w3", 3),
                new Tuple2<String, Integer>("w2", 22),
                new Tuple2<String, Integer>("w1", 11)
        );

        // Build a pair RDD from the local list.
        JavaPairRDD<String, Integer> pairRdd = sc.parallelizePairs(list);

        // Sort by key in ascending order, using 2 partitions.
        JavaPairRDD<String, Integer> result = pairRdd.sortByKey(true, 2);

        // Print each pair as key:value.
        result.foreach(new VoidFunction<Tuple2<String, Integer>>() {
            @Override
            public void call(Tuple2<String, Integer> pair) throws Exception {
                System.err.println(pair._1 + ":" + pair._2);
            }
        });

        // Release the Spark context when done.
        sc.close();
    }
}
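To make the effect of the two arguments more concrete, here is a minimal plain-Java sketch with no Spark dependency. It imitates the idea behind sortByKey: pairs are first split into contiguous key ranges, one per partition, and each partition is then sorted by key, so reading the partitions in order yields a totally ordered result. The class `SortByKeySketch` and its helpers `pair` and `sortByKey` are hypothetical names made up for this illustration, not Spark APIs. The sketch also assumes exact key ranks are available, whereas Spark estimates its range boundaries by sampling the keys, so the real partition contents may differ.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class SortByKeySketch {

    // Hypothetical helper for building a key/value pair; not a Spark API.
    public static Map.Entry<String, Integer> pair(String key, int value) {
        return new SimpleEntry<String, Integer>(key, value);
    }

    // Simplified stand-in for sortByKey(ascending, numPartitions):
    // returns one sorted list per output partition.
    public static List<List<Map.Entry<String, Integer>>> sortByKey(
            List<Map.Entry<String, Integer>> pairs, boolean ascending, int numPartitions) {
        final Comparator<String> order = ascending
                ? Comparator.<String>naturalOrder()
                : Comparator.<String>reverseOrder();

        // Distinct keys in the requested order; a key's rank decides its partition.
        TreeSet<String> distinct = new TreeSet<String>(order);
        for (Map.Entry<String, Integer> p : pairs) {
            distinct.add(p.getKey());
        }
        List<String> keys = new ArrayList<String>(distinct);

        List<List<Map.Entry<String, Integer>>> partitions =
                new ArrayList<List<Map.Entry<String, Integer>>>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<Map.Entry<String, Integer>>());
        }

        // Range partitioning: contiguous key ranges go to consecutive partitions.
        for (Map.Entry<String, Integer> p : pairs) {
            int rank = keys.indexOf(p.getKey());
            partitions.get(rank * numPartitions / keys.size()).add(p);
        }

        // Sort inside each partition. List.sort is stable, so in this sketch
        // pairs with equal keys keep their input order.
        for (List<Map.Entry<String, Integer>> part : partitions) {
            part.sort(Map.Entry.<String, Integer>comparingByKey(order));
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Same data as the Spark example above.
        List<Map.Entry<String, Integer>> data = Arrays.asList(
                pair("w1", 1), pair("w2", 2), pair("w3", 3), pair("w2", 22), pair("w1", 11));

        List<List<Map.Entry<String, Integer>>> result = sortByKey(data, true, 2);
        for (int i = 0; i < result.size(); i++) {
            System.out.println("partition " + i + ": " + result.get(i));
        }
        // Prints:
        // partition 0: [w1=1, w1=11, w2=2, w2=22]
        // partition 1: [w3=3]
    }
}
```

Passing `false` as the second argument of the sketch reverses the key order, so the `w3` pair comes first, mirroring `pairRdd.sortByKey(false, 2)` in the Spark version.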