Problems you may run into when setting up a distributed cluster on Linux

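This problem comes from a nasty pitfall I hit today while installing ZooKeeper; it cost me a whole day. When setting up a distributed ZooKeeper ensemble, you usually need a configuration like this: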

server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888
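For reference, each `server.N` line names a host, then the port followers use to connect to the leader (2888 here), then the leader election port (3888 here). The following is only a minimal Python sketch for reading such lines; the function name `parse_servers` is made up for this illustration and is not part of ZooKeeper.

```python
import re

# Matches lines such as "server.1=hadoop01:2888:3888".
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)")


def parse_servers(cfg_text):
    """Return {server_id: (host, quorum_port, election_port)}."""
    servers = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            sid, host, quorum, election = match.groups()
            servers[int(sid)] = (host, int(quorum), int(election))
    return servers


cfg = """\
server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888
"""

for sid, (host, quorum, election) in sorted(parse_servers(cfg).items()):
    print(sid, host, quorum, election)
```

The election port is the one that appears in the error log later in this post.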

That is how I configured it at first, but it flatly refused to work. zookeeper.out showed the following:

2017-04-21 06:05:34,385 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@400] - Cannot open channel to 3 at election address hadoop05/192.168.31.155:3888
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)

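Ignore the timestamp in the log (not that you would care anyway: the clock was never configured, so it shows six in the morning, and what kind of fool gets up at 6 a.m.?).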
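The `Connection refused` in the log means that a TCP connection to the election port was actively rejected, that is, nothing was accepting connections on that address and port. A successful ping only shows that the host is reachable, not that a port is open. Below is a rough Python sketch of a port check; `can_connect` is a hypothetical helper, and the demo uses a throwaway local listener rather than a real ZooKeeper node.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


# Demo: open a temporary listener on a free local port, then close it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

print("while listening:", can_connect("127.0.0.1", port))
listener.close()
print("after closing:", can_connect("127.0.0.1", port))
```

On the cluster itself, the equivalent check would be something like `can_connect("hadoop02", 3888)` run from each of the other nodes.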

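For this error, most posts online say it is caused by the ZooKeeper startup order: the election is unstable at first, so don't worry, it will sort itself out after a while. That did not work for me. The next suggestion was host mapping, but even with the host mappings configured and all three virtual machines able to ping one another, it made no difference. I kept digging and found another style of configuration: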

server.1=192.168.31.151:2888:3888
server.2=192.168.31.152:2888:3888
server.3=192.168.31.153:2888:3888

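So I redid the configuration this way and, to my surprise, it worked: a leader and followers were elected. Once the excitement wore off, I still had no idea where the problem actually was or what the difference between the two configurations could be, and that kept nagging at me. A round of Baidu searches turned up nothing, so I switched to Google and eventually found, in some obscure corner, a Q&A where someone else had hit the same problem. Here is the link: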

https://unix.stackexchange.com/questions/240506/zookeeper-dns-name-problems-with-leader-elections-when-migrating-from-windows-to

The question is titled:

Zookeeper DNS name problems with leader elections when migrating from Windows to Debian

The answerer wrote:

The "smoking gun" was this line in my zookeeper log:

2015-11-26 20:48:31,439 [myid:1] - INFO [Thread-2:QuorumCnxManager$Listener@504] - My election bind port: spring-xd-1/127.0.0.1:3888

So, why was Zookeeper binding the election port on the loopback interface? Well...

My /etc/hosts on one of the VMs looked like this:

127.0.0.1   spring-xd-1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
## vagrant-hostmanager-start
172.28.128.3 spring-xd-1
172.28.128.4 spring-xd-2
172.28.128.7 spring-xd-3
## vagrant-hostmanager-end

I removed the hostname from the 127.0.0.1 line in /etc/hosts and bounced the zookeeper service on all 3 nodes, and BAM! everything came up roses. So, now the host file on each machine looks like this:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
## vagrant-hostmanager-start
172.28.128.3 spring-xd-1
172.28.128.4 spring-xd-2
172.28.128.7 spring-xd-3
## vagrant-hostmanager-end
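The check described in the answer can be sketched in Python: scan the contents of a hosts file and report any name other than the usual localhost aliases that is mapped to a loopback address. This is only an illustration; `loopback_aliases` is a made-up helper, and the sample text is a shortened version of the hosts file quoted above.

```python
import ipaddress


def loopback_aliases(hosts_text):
    """Return non-localhost names that a hosts file maps to a loopback address."""
    suspicious = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address
        if address.is_loopback:  # 127.0.0.0/8 or ::1
            for name in fields[1:]:
                # Skip the conventional aliases (localhost*, ip6-*).
                if not name.startswith(("localhost", "ip6-")):
                    suspicious.append(name)
    return suspicious


sample = """\
127.0.0.1   spring-xd-1 localhost localhost.localdomain
::1         localhost localhost.localdomain localhost6
## vagrant-hostmanager-start
172.28.128.3 spring-xd-1
172.28.128.4 spring-xd-2
## vagrant-hostmanager-end
"""

print(loopback_aliases(sample))
```

To inspect a real machine, the text of `/etc/hosts` could be read from disk and passed to the same function. An empty result is what the corrected hosts file above should give.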

The last paragraph is rather enlightening:

EDIT: According to http://ccl.cse.nd.edu/operations/condor/hostname.shtml, this seems to be a fairly common problem with clustered apps on Linux, and recommends editing the hosts file as I've described above. However, the Zookeeper documentation on cluster setup doesn't mention it.
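Since the root cause is a machine resolving its own hostname to a loopback address, a quick sanity check on each node is to look at what its hostname resolves to. The Python sketch below is an approximation under assumptions: Python's resolver is not guaranteed to behave exactly like the Java lookup that ZooKeeper performs, and both function names are invented for this example.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the address string is a loopback address."""
    return ipaddress.ip_address(address).is_loopback


def own_hostname_resolution():
    """Return (hostname, resolved IPv4 address or None) for this machine."""
    name = socket.gethostname()
    try:
        return name, socket.gethostbyname(name)
    except OSError:
        return name, None


name, address = own_hostname_resolution()
if address is None:
    print(name, "does not resolve at all")
elif is_loopback(address):
    print(name, "resolves to loopback", address, "(other nodes cannot reach it)")
else:
    print(name, "resolves to", address)
```

If a node reports a loopback address here, its hosts file is the first place to look.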

I wanted to visit the page that explains this problem, but unfortunately it would not load. Damn!

P.S. Setting up ZooKeeper is going to be the death of me.
