Accumulator<Long> implements of JavaSparkContext in Spark1.x

As we all know , up to Spark 1.6.2, JavaSparkContext only provides two kinds of accumulators: Integer and Double.

However, unfortunately I've met with problems of Integer overflow and the program returned me a negative number.

So I have to use original sparkcontext to implement the Long accumulator.

public static class LongAccumulatorParam implements AccumulatorParam<Long>,Serializable {

        @Override

        public Long addAccumulator(final Long r, final Long t) {

            return r + t;

        }

        @Override

        public Long addInPlace(final Long r1, final Long r2) {

            return  r1 + r2;

        }

        @Override

        public Long zero(final Long initialValue) {

            return 0L;

        }

    }

final Accumulator<Long> acc = jsc.sc().accumulator(new Long(0), new LongAccumulatorParam());

Actually it is pretty simple. I haven't looked into Spark 2 yet, hope the developers have fixed this issue.

Accumulator<Long> implements of JavaSparkContext in Spark1.x的更多相关文章

java使用spark/spark-sql处理schema数据(spark1.6)
1.spark是什么? Spark是基于内存计算的大数据并行计算框架. 1.1 Spark基于内存计算相比于MapReduce基于IO计算,提高了在大数据环境下数据处理的实时性. 1.2 高容错性和 ...
【Spark Java API】broadcast、accumulator
转载自:http://www.jianshu.com/p/082ef79c63c1 broadcast 官方文档描述: Broadcast a read-only variable to the cl ...
spark-2.2.0-bin-hadoop2.6和spark-1.6.1-bin-hadoop2.6发行包自带案例全面详解（java、python、r和scala）之Basic包下的JavaPageRank.java（图文详解）
不多说,直接上干货! spark-1.6.1-bin-hadoop2.6里Basic包下的JavaPageRank.java /* * Licensed to the Apache Software ...
spark 变量使用 broadcast、accumulator
broadcast 官方文档描述: Broadcast a read-only variable to the cluster, returning a [[org.apache.spark.broa ...
Spark1.6.2 java实现读取json数据文件插入MySql数据库
public class Main implements Serializable { /** * */ private static final long serialVersionUID = -8 ...
Spark1.6.2 java实现读取txt文件插入MySql数据库代码
package com.gosun.spark1; import java.util.ArrayList;import java.util.List;import java.util.Properti ...
flink - accumulator
读accumlator JobManager 在job finish的时候会汇总accumulator的值, newJobStatus match { case JobStatus.FINISHE ...
spark1.4的本地模式编程练习（1）
spark编程练习申明:以下代码仅作学习参考使用,勿使用在商业用途. Wordcount UserMining TweetMining HashtagMining InvertedIndex Tes ...
Spark1.0.x入门指南
1 节点说明 IP Role 192.168.1.111 ActiveNameNode 192.168.1.112 StandbyNameNode,Master,Worker 192.168.1. ...

随机推荐

Vue axios 返回数据绑定到vue对象问题
在项目中需要用到后台的数据对前端渲染,使用到了vue整合的axios,使用vue中的钩子函数在页面组件挂载完成之后向后台发送一个get请求然后将返回后的数据赋值data()中定义的属性: 执行后前端报 ...
python全栈开发day78、79 --bss项目
一.回顾 1. BBS项目 CMS 1. 登录 1. form组件 2. auth模块 3. 验证码 2. 注册 1. form组件 1. 生成html代码直接for循环form_obj,就能够遍历 ...
Python_collections_namedtuple可命名元组
namedtuple:用来构建带字段名的元组 import collections # 创建类,两种创建方法 MytupleClass = collections.namedtuple('Mytupl ...
zabbix通过shell脚本安装异常问题定位
htxk-106主机信息现象如下: 通过zabbix_get命令 zabbix_get [7189]: Check access restrictions in Zabbix agent config ...
input按钮去掉默认样式
border: 1px solid transparent; //自定义边框 outline: none; //消除默认点击蓝色边框效果
hive启用压缩
<property> <name>hive.exec.compress.intermediate</name> <value>true</valu ...
013 mysql中find_in_set()函数的使用
在工作中遇见过,对于新知识,在这里写一写文档. 1.作用举个例子,也许不理解,在看完后面的SQL示例,再来看就明白了: 有个文章表里面有个type字段,它存储的是文章类型,有 1头条.2推荐.3热点 ...
犹记当年写出bug睡不着，回想今天只求睡好渡余生……
不想面对已经在博客园注册了3年多的时间了,就是这么快的就已经过去了近3年的工作时间,从最开始的对编程的困惑到慢慢有一点的认识,好像哦就这样没有什么啊,也没有涉及到一些比较难的东西. 但是当初第一份工 ...
java的conllections.sort排序
https://www.cnblogs.com/yw0219/p/7222108.html?utm_source=itdadao&utm_medium=referral
转载：搭建完整的arm-linux-gcc等交叉编译环境（感谢CSDN博主的分享）
安装环境 Linux版本:Ubuntu 12.04 内核版本:Linux 3.5.0 交叉编译器版本:arm-linux-gcc-4.4.3 这个版本的交叉编译器安装前的絮叨首先简单介绍 ...

Accumulator<Long> implements of JavaSparkContext in Spark1.x

Accumulator<Long> implements of JavaSparkContext in Spark1.x的更多相关文章

随机推荐

热门专题