pyspark:

AttributeError: 'NoneType' object has no attribute 'setCallSite'

我草,是pyspark的bug。解决方法:

print("Approximately joining on distance smaller than 0.6:")
distance_min = model.approxSimilarityJoin(imsi_proc_df, imsi_proc_df, 1e6, distCol="JaccardDistance") \
.select(col("datasetA.id").alias("idA"),
col("datasetB.id").alias("idB"),
col("JaccardDistance")) #.filter("idA=idB")
print(distance_min.show())
print("*"*88)
print(imsi_proc_df.show()) key = Vectors.sparse(53, [1, 3], [1.0, 1.0])
print(model.approxNearestNeighbors(imsi_proc_df, key, 2).show())
print("start calculate find botnet!")
print("*"*99)
print("time start:", time.time())
print(type(distance_min), dir(distance_min))
print(dir(distance_min.toLocalIterator)) ############################################## add this line to solve
distance_min.sql_ctx.sparkSession._jsparkSession = spark_app._jsparkSession
distance_min._sc = spark_app._sc
############################################# similarity_val_rdd = distance_min.toLocalIterator #.collect()
print("time end:", time.time())
print(similarity_val_rdd)
print("*"*99)
try:
G = ConnectedGraph()
ddos_ue_list = []
for item in similarity_val_rdd():
imsi, imsi2, jacard_similarity_val = item["idA"], item["idB"], item["JaccardDistance"]
print("???", imsi, imsi2, jacard_similarity_val)

Description

reproducing the bug from the example in the documentation:

import pyspark
from pyspark.ml.linalg import Vectors
from pyspark.ml.stat import Correlation
spark = pyspark.sql.SparkSession.builder.getOrCreate()
dataset = [[Vectors.dense([1, 0, 0, -2])],
[Vectors.dense([4, 5, 0, 3])],
[Vectors.dense([6, 7, 0, 8])],
[Vectors.dense([9, 0, 0, 1])]]
dataset = spark.createDataFrame(dataset, ['features'])
df = Correlation.corr(dataset, 'features', 'pearson')
df.collect()
 

This produces the following stack trace:

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-92-e7889fa5d198> in <module>()
11 dataset = spark.createDataFrame(dataset, ['features'])
12 df = Correlation.corr(dataset, 'features', 'pearson')
---> 13 df.collect() /opt/spark/python/pyspark/sql/dataframe.py in collect(self)
530 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
531 """
--> 532 with SCCallSiteSync(self._sc) as css:
533 sock_info = self._jdf.collectToPython()
534 return list(_load_from_socket(sock_info, BatchedSerializer(PickleSerializer()))) /opt/spark/python/pyspark/traceback_utils.py in __enter__(self)
70 def __enter__(self):
71 if SCCallSiteSync._spark_stack_depth == 0:
---> 72 self._context._jsc.setCallSite(self._call_site)
73 SCCallSiteSync._spark_stack_depth += 1
74 AttributeError: 'NoneType' object has no attribute 'setCallSite'

Analysis:

Somehow the dataframe properties `df.sql_ctx.sparkSession._jsparkSession`, and `spark._jsparkSession` do not match with the ones available in the spark session.

The following code fixes the problem (I hope this helps you narrowing down the root cause)

df.sql_ctx.sparkSession._jsparkSession = spark._jsparkSession
df._sc = spark._sc df.collect() >>> [Row(pearson(features)=DenseMatrix(4, 4, [1.0, 0.0556, nan, 0.4005, 0.0556, 1.0, nan, 0.9136, nan, nan, 1.0, nan, 0.4005, 0.9136, nan, 1.0], False))]

pyspark AttributeError: 'NoneType' object has no attribute 'setCallSite'的更多相关文章

  1. python3 AttributeError: 'NoneType' object has no attribute 'split'

    from wsgiref.simple_server import make_server def RunServer(environ, start_response): start_response ...

  2. AttributeError: 'NoneType' object has no attribute 'split' 报错处理

    报错场景 social_django 组件对原生 django 的支持较好, 但是因为 在此DRF进行的验证为 JWT 方式 和 django 的验证存在区别, 因此需要进行更改自行支持 JWT 方式 ...

  3. python提示AttributeError: 'NoneType' object has no attribute 'append'【转发】

    在写python脚本时遇到AttributeError: 'NoneType' object has no attribute 'append' a=[] b=[1,2,3,4] a = a.appe ...

  4. python提示AttributeError: 'NoneType' object has no attribute 'append'

    在写python脚本时遇到AttributeError: 'NoneType' object has no attribute 'append' a=[] b=[1,2,3,4] a = a.appe ...

  5. Keras AttributeError 'NoneType' object has no attribute '_inbound_nodes'

    问题说明: 首先呢,报这个错误的代码是这行代码: model = Model(inputs=input, outputs=output) 报错: AttributeError 'NoneType' o ...

  6. AttributeError: 'NoneType' object has no attribute 'extend'

    Python使用中可能遇到的小问题 AttributeError: 'NoneType' object has no attribute 'extend' 或者AttributeError: 'Non ...

  7. 解决opencv:AttributeError: 'NoneType' object has no attribute 'copy'

    情况一: 路径中有中文,更改即可 情况二:可以运行代码,在运行结束时显示 AttributeError: 'NoneType' object has no attribute 'copy' 因为如果是 ...

  8. appium 报错:AttributeError:"NoneType' object has no attribute 'XXX'

    报错截图如下: 问题原因: 根据以上报错提示可已看到问题的原因为:logger中没有info此方法的调用,点击"具体报错的位置"上面的链接,可直接定位到具体的报错位置.根据分析所得 ...

  9. PIL中分离通道发生“AttributeError: 'NoneType' object has no attribute 'bands'”

    解决方法: 这个貌似是属于一个bug 把Image.py中的1500行左右的split函数改成如下即可: def split(self): "Split image into bands&q ...

随机推荐

  1. 分享一下我的个人微信小程序

    分享一下我的个人微信小程序 1.有我平时整理的一些小程序相关的技术,供大家参考. 2.有几个好玩的例子 有问题可以一起参考

  2. window10 phpstudy2018 mysql服务重启之后自动停止

    使用phpstudy集成环境开发php,但是可能版本太旧,导致有些语法用不了.所以决定删掉,再下一个新版的. 把phpstudy退出之后,就直接把phpstudy文件夹删除了.发现它并不能删除成功.然 ...

  3. 修改mysql自增字段的方法

    修改mysql自增字段的方法 修改 test_user 库 user 表 auto_increment为 10000(从10000开始递增) <pre>mysql> alter ta ...

  4. 什么是渐进式Web App(PWA)?为什么值得关注?

    转载自:https://blog.csdn.net/mogoweb/article/details/79029651 在开始PWA这个话题之前,我们先来看看Internet现状. 截至2017年1月, ...

  5. 关于su下bash:xxx :command not found

    今天在新建组的时候出了问题: $ su Password: # groupadd prj bash: groupadd :command not found 我就纳闷,明明是在su权限下,怎么还不能使 ...

  6. Django框架深入了解_04(DRF之url控制、解析器、响应器、版本控制、分页)

    一.url控制 基本路由写法:最常用 from django.conf.urls import url from django.contrib import admin from app01 impo ...

  7. jstack的使用:死锁问题实战

  8. rsync 使用

    rsync命令是一个远程数据同步工具,可通过LAN/WAN快速同步多台主机间的文件. rsync使用所谓的“rsync算法”来使本地和远程两个主机之间的文件达到同步,这个算法只传送两个文件的不同部分, ...

  9. [SOJ #538]好数 [CC]FAVNUM(2019-8-6考试)

    题目大意:给定$n$个正整数,求$[l,r]$中第$k$小的”好数“.$l,r\leqslant10^{18},n\leqslant62$,出现的其他数均$\leqslant10^{50}$ 好数定义 ...

  10. 解决springboot 新版本 2.1.6 spring-boot-starter-actuator 访问报404

    pom.xml <?xml version="1.0" encoding="UTF-8"?> <project xmlns="htt ...