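A problem you may run into when setting up a distributed cluster on Linux

This comes from a big pitfall I hit today while installing ZooKeeper; it cost me a whole day. When setting up a ZooKeeper cluster, you usually add a configuration like this: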

server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888
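
For reference, each of these lines has the form server.N=host:quorumPort:electionPort. The first port (2888 here) is the one followers use to connect to the leader, and the second (3888 here) is used for leader election, which is the port that shows up in the error further down. A minimal Python sketch of how such a line breaks down; the names `ServerEntry` and `parse_server_line` are made up for illustration and are not part of ZooKeeper:

```python
from typing import NamedTuple


class ServerEntry(NamedTuple):
    server_id: int      # N in server.N; matches the number in that node's myid file
    host: str           # hostname or IP address of the node
    quorum_port: int    # port followers use to connect to the leader
    election_port: int  # port used for leader election


def parse_server_line(line: str) -> ServerEntry:
    """Parse one 'server.N=host:port:port' line from zoo.cfg."""
    key, _, value = line.strip().partition("=")
    prefix, _, server_id = key.partition(".")
    if prefix != "server" or not server_id.isdigit():
        raise ValueError(f"not a server.N entry: {line!r}")
    # Unpacking raises ValueError if fewer than three fields are present.
    host, quorum_port, election_port = value.split(":")[:3]
    return ServerEntry(int(server_id), host, int(quorum_port), int(election_port))


if __name__ == "__main__":
    for cfg_line in (
        "server.1=hadoop01:2888:3888",
        "server.2=hadoop02:2888:3888",
        "server.3=hadoop03:2888:3888",
    ):
        print(parse_server_line(cfg_line))
```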

At first I followed this configuration, but it simply would not work. zookeeper.out showed the following:

2017-04-21 06:05:34,385 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@400] - Cannot open channel to 3 at election address hadoop05/192.168.31.155:3888
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)

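Ignore the timestamp in the log (not that you would have noticed it anyway: the clock on this machine was never set, so it shows six in the morning. Who on earth gets up at 6?).

For this error, most posts online say it is caused by the ZooKeeper startup order: leader election is unstable at first, so don't worry, it will sort itself out after a while. That did not work for me. The next suggestion was host mapping, but even with the mappings configured and all three VMs able to ping each other, it made no difference. I kept digging and found another style of configuration: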
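
A side note on why ping is not a sufficient test here: ping only shows that the machine is reachable, while "Connection refused" generally means that nothing accepted the TCP connection on that particular address and port. A small Python sketch that tests the election port directly; the helper names `can_connect` and `check_election_ports` are hypothetical, and the host list is assumed to be the one from the configuration above:

```python
import socket
from typing import Dict, Iterable


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


def check_election_ports(hosts: Iterable[str], port: int = 3888) -> Dict[str, bool]:
    """Try the election port on every host and report which ones accept."""
    return {host: can_connect(host, port) for host in hosts}
```

Running `check_election_ports(["hadoop01", "hadoop02", "hadoop03"])` from each node while ZooKeeper is started would show which peers actually accept connections on 3888 from that node, which ping cannot tell you.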

server.1=192.168.31.151:2888:3888
server.2=192.168.31.152:2888:3888
server.3=192.168.31.153:2888:3888

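So I redid the configuration this way, and to my surprise it worked: a leader and followers were elected. Once the excitement wore off, the real questions kept nagging at me: where exactly was the problem, and what is the difference between the two configurations? Searching Baidu turned up nothing at all, so I went to Google and eventually found, in some obscure corner, a Q&A from someone who had hit the same problem. Here is the link: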

https://unix.stackexchange.com/questions/240506/zookeeper-dns-name-problems-with-leader-elections-when-migrating-from-windows-to

The title of the question is:

Zookeeper DNS name problems with leader elections when migrating from Windows to Debian

The answer says:

The "smoking gun" was this line in my zookeeper log:

2015-11-26 20:48:31,439 [myid:1] - INFO [Thread-2:QuorumCnxManager$Listener@504] - My election bind port: spring-xd-1/127.0.0.1:3888

So, why was Zookeeper binding the election port on the loopback interface? Well...

My /etc/hosts on one of the VMs looked like this:

127.0.0.1   spring-xd-1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
## vagrant-hostmanager-start
172.28.128.3 spring-xd-1
172.28.128.4 spring-xd-2
172.28.128.7 spring-xd-3
## vagrant-hostmanager-end

I removed the hostname from the 127.0.0.1 line in /etc/hosts and bounced the zookeeper service on all 3 nodes, and BAM! everything came up roses. So, now the host file on each machine looks like this:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
## vagrant-hostmanager-start
172.28.128.3 spring-xd-1
172.28.128.4 spring-xd-2
172.28.128.7 spring-xd-3
## vagrant-hostmanager-end

The last paragraph is rather enlightening:

EDIT: According to http://ccl.cse.nd.edu/operations/condor/hostname.shtml, this seems to be a fairly common problem with clustered apps on Linux, and recommends editing the hosts file as I've described above. However, the Zookeeper documentation on cluster setup doesn't mention it.
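
The check described in that answer can be automated. Below is a rough Python sketch that scans the text of a hosts file and reports whether a given hostname sits on a loopback line (127.0.0.1 or ::1). The function names are hypothetical, and the sample data is a shortened version of the hosts file quoted above. Whether my own hadoop0X machines had exactly this problem is an assumption, since I did not save their /etc/hosts, but it would explain why switching to IP addresses worked.

```python
import ipaddress
from typing import Dict, List


def loopback_aliases(hosts_text: str) -> Dict[str, List[str]]:
    """Map each loopback address in a hosts file to the names on its lines."""
    result: Dict[str, List[str]] = {}
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if is_loopback:
            result.setdefault(address, []).extend(names)
    return result


def hostname_on_loopback(hosts_text: str, hostname: str) -> bool:
    """Return True if hostname is listed on a loopback line."""
    return any(hostname in names for names in loopback_aliases(hosts_text).values())


# Shortened sample based on the hosts file from the quoted answer.
BROKEN_HOSTS = """\
127.0.0.1   spring-xd-1 localhost localhost.localdomain
::1         localhost localhost.localdomain
## vagrant-hostmanager-start
172.28.128.3 spring-xd-1
172.28.128.4 spring-xd-2
"""

if __name__ == "__main__":
    print(hostname_on_loopback(BROKEN_HOSTS, "spring-xd-1"))
```

On a real node you would pass in the contents of /etc/hosts and the machine's own hostname (for example from `socket.gethostname()`). If the function returns True, the fix from the answer above applies: remove the hostname from the loopback line and restart ZooKeeper on every node.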

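I wanted to read the page that explains this problem, but unfortunately it would not load. Damn!

P.S. Setting up ZooKeeper is going to be the death of me.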
