问题

  • session 如何生成的?sessionId为什么不直接使用时间戳+单机名
  • sessionid 关闭的时候的逻辑,sessionid 的维护是由各节点还是leader ?

会话相关

sessionid 生成

  我们看一下session 管理类 SessionTrackerImpl。 它主要维护三个字段


//根据 sessionid 存放的 session
HashMap<Long, SessionImpl> sessionsById = new HashMap<Long, SessionImpl>(); //key 是时间 ,用于根据下次会话超时时间点来归纳会话,便于进行会话管理和超时检查,(分桶策略容器)
HashMap<Long, SessionSet> sessionSets = new HashMap<Long, SessionSet>(); //key 是 sessionId , value 是过期时间
ConcurrentHashMap<Long, Integer> sessionsWithTimeout;

  其他的都是操作 session 的方法,我们看一下创建sessionId的过程,SessionImpl 类

    public static long initializeNextSession(long id) {
long nextSid = 0;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid = nextSid | (id <<56);
return nextSid;
}

  得出的 sessionid前8位确定所在的机器,后56位使用当前时间的毫秒数表示进行随机。

session 管理

  使用的是分桶策略,如下图所示。

  以时间戳为节点,每个桶装这该时间点(过期的时间)的 session集合,然后有一个线程那桶里的多个session 进行检查,加入过期时间被延长,那么session 进行迁移到其他的桶,否则将被清理。

创建会话执行事务过程

  创建会话之前我们要先知道zk服务器是如何和客户端连接的。上一篇文章结尾处,我们知道了处理与 客户端的连接主要是由 NIOServerCnxnFactory 来负责的。而真正处理的逻辑就在 run 方法。

    /**
* 默认允许连接 60个客户端,
* 这个(一个)线程会处理
* - 来自客户端的连接
* - 来自客户端的读
* - 来自客户端的写
*
*
*/
public void run() {
//只要没有断开,循环一直进行
//下面连接就是我们熟悉的 java NIO 的运用
while (!ss.socket().isClosed()) {
try {
//select 方法一直就阻塞
selector.select(1000);
Set<SelectionKey> selected; //这里为什么要加锁呢?可能有多个线程
synchronized (this) {
selected = selector.selectedKeys();
}
ArrayList<SelectionKey> selectedList = new ArrayList<SelectionKey>(
selected);
Collections.shuffle(selectedList);
for (SelectionKey k : selectedList) {
if ((k.readyOps() & SelectionKey.OP_ACCEPT) != 0) {
SocketChannel sc = ((ServerSocketChannel) k
.channel()).accept();
InetAddress ia = sc.socket().getInetAddress();
int cnxncount = getClientCnxnCount(ia);
//客户端可以保存 60 个连接
if (maxClientCnxns > 0 && cnxncount >= maxClientCnxns){
LOG.warn("Too many connections from " + ia
+ " - max is " + maxClientCnxns );
sc.close();
} else {
LOG.info("Accepted socket connection from "
+ sc.socket().getRemoteSocketAddress());
sc.configureBlocking(false);
SelectionKey sk = sc.register(selector,
SelectionKey.OP_READ);
//创建连接,连接用 NIOServerCnxn这个类来维护
NIOServerCnxn cnxn = createConnection(sc, sk);
sk.attach(cnxn);
addCnxn(cnxn);
}
} else if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
//处理客户端读写操作,重点看 doIO 方法
NIOServerCnxn c = (NIOServerCnxn) k.attachment();
c.doIO(k);
} else {
if (LOG.isDebugEnabled()) {
LOG.debug("Unexpected ops in select "
+ k.readyOps());
}
}
}
selected.clear();
} catch (RuntimeException e) {
LOG.warn("Ignoring unexpected runtime exception", e);
} catch (Exception e) {
LOG.warn("Ignoring exception", e);
}
}
closeAll();
LOG.info("NIOServerCnxn factory exited run method");
}
我们再来看一下 doIO这个方法 , 位于 NIOServerCnxn 内
    /**
* Handles read/write IO on connection.
*/
void doIO(SelectionKey k) throws InterruptedException {
try {
if (isSocketOpen() == false) {
LOG.warn("trying to do i/o on a null socket for session:0x"
+ Long.toHexString(sessionId)); return;
}
//可读类型
if (k.isReadable()) {
int rc = sock.read(incomingBuffer);
if (rc < 0) {
throw new EndOfStreamException(
"Unable to read additional data from client sessionid 0x"
+ Long.toHexString(sessionId)
+ ", likely client has closed socket");
}
if (incomingBuffer.remaining() == 0) {
boolean isPayload; //读取下一个请求
if (incomingBuffer == lenBuffer) { // start of next request
//翻转缓存区,可读
incomingBuffer.flip();
//读取lenBuffer的前四个字节,当读取的是内容长度时则为true,否则为false
isPayload = readLength(k);
//清空缓存
incomingBuffer.clear();
} else {
// continuation
isPayload = true;
} // isPayload 为 true ,表示buffer 里面是负载
if (isPayload) { // not the case for 4letterword
readPayload();
}
else {
// four letter words take care
// need not do anything else
return;
}
}
}
//可写类型
if (k.isWritable()) {
// ZooLog.logTraceMessage(LOG,
// ZooLog.CLIENT_DATA_PACKET_TRACE_MASK
// "outgoingBuffers.size() = " +
// outgoingBuffers.size());
if (outgoingBuffers.size() > 0) {
// ZooLog.logTraceMessage(LOG,
// ZooLog.CLIENT_DATA_PACKET_TRACE_MASK,
// "sk " + k + " is valid: " +
// k.isValid()); /*
* This is going to reset the buffer position to 0 and the
* limit to the size of the buffer, so that we can fill it
* with data from the non-direct buffers that we need to
* send.
*/
ByteBuffer directBuffer = factory.directBuffer;
directBuffer.clear(); for (ByteBuffer b : outgoingBuffers) {
if (directBuffer.remaining() < b.remaining()) {
/*
* When we call put later, if the directBuffer is to
* small to hold everything, nothing will be copied,
* so we've got to slice the buffer if it's too big.
*/
b = (ByteBuffer) b.slice().limit(
directBuffer.remaining());
}
/*
* put() is going to modify the positions of both
* buffers, put we don't want to change the position of
* the source buffers (we'll do that after the send, if
* needed), so we save and reset the position after the
* copy
*/
int p = b.position();
directBuffer.put(b);
b.position(p);
if (directBuffer.remaining() == 0) {
break;
}
}
/*
* Do the flip: limit becomes position, position gets set to
* 0. This sets us up for the write.
*/
directBuffer.flip(); int sent = sock.write(directBuffer);
ByteBuffer bb; // Remove the buffers that we have sent
while (outgoingBuffers.size() > 0) {
bb = outgoingBuffers.peek();
if (bb == ServerCnxnFactory.closeConn) {
throw new CloseRequestException("close requested");
}
int left = bb.remaining() - sent;
if (left > 0) {
/*
* We only partially sent this buffer, so we update
* the position and exit the loop.
*/
bb.position(bb.position() + sent);
break;
}
packetSent();
/* We've sent the whole buffer, so drop the buffer */
sent -= bb.remaining();
outgoingBuffers.remove();
}
// ZooLog.logTraceMessage(LOG,
// ZooLog.CLIENT_DATA_PACKET_TRACE_MASK, "after send,
// outgoingBuffers.size() = " + outgoingBuffers.size());
} synchronized(this.factory){
if (outgoingBuffers.size() == 0) {
if (!initialized
&& (sk.interestOps() & SelectionKey.OP_READ) == 0) {
throw new CloseRequestException("responded to info probe");
}
sk.interestOps(sk.interestOps()
& (~SelectionKey.OP_WRITE));
} else {
sk.interestOps(sk.interestOps()
| SelectionKey.OP_WRITE);
}
}
}
} catch (CancelledKeyException e) {
LOG.warn("CancelledKeyException causing close of session 0x"
+ Long.toHexString(sessionId));
if (LOG.isDebugEnabled()) {
LOG.debug("CancelledKeyException stack trace", e);
}
close();
} catch (CloseRequestException e) {
// expecting close to log session closure
close();
} catch (EndOfStreamException e) {
LOG.warn(e.getMessage());
if (LOG.isDebugEnabled()) {
LOG.debug("EndOfStreamException stack trace", e);
}
// expecting close to log session closure
close();
} catch (IOException e) {
LOG.warn("Exception causing close of session 0x"
+ Long.toHexString(sessionId) + ": " + e.getMessage());
if (LOG.isDebugEnabled()) {
LOG.debug("IOException stack trace", e);
}
close();
}
} /** Read the request payload (everything following the length prefix) */
private void readPayload() throws IOException, InterruptedException {
if (incomingBuffer.remaining() != 0) { // have we read length bytes?
int rc = sock.read(incomingBuffer); // sock is non-blocking, so ok
if (rc < 0) {
throw new EndOfStreamException(
"Unable to read additional data from client sessionid 0x"
+ Long.toHexString(sessionId)
+ ", likely client has closed socket");
}
} if (incomingBuffer.remaining() == 0) { // have we read length bytes?
packetReceived();
incomingBuffer.flip(); //执行读取负载的逻辑
if (!initialized) {
//非初始化
readConnectRequest();
} else {
readRequest();
}
lenBuffer.clear();
incomingBuffer = lenBuffer;
}
} 其中我们看一下如何连接的,即上面的 readConnectRequest 方法 private void readConnectRequest() throws IOException, InterruptedException {
if (!isZKServerRunning()) {
throw new IOException("ZooKeeperServer not running");
}
zkServer.processConnectRequest(this, incomingBuffer);
initialized = true;
}
于是我们到了ZookeeperServer 的 processConnectRequest 方法
   /**
* 封装成一个 connectRequest
* 提交请求到leader
*/
public void processConnectRequest(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOException {
BinaryInputArchive bia = BinaryInputArchive.getArchive(new ByteBufferInputStream(incomingBuffer));
ConnectRequest connReq = new ConnectRequest();
connReq.deserialize(bia, "connect");
if (LOG.isDebugEnabled()) {
LOG.debug("Session establishment request from client "
+ cnxn.getRemoteSocketAddress()
+ " client's lastZxid is 0x"
+ Long.toHexString(connReq.getLastZxidSeen()));
}
boolean readOnly = false;
try {
readOnly = bia.readBool("readOnly");
cnxn.isOldClient = false;
} catch (IOException e) {
// this is ok -- just a packet from an old client which
// doesn't contain readOnly field
LOG.warn("Connection request from old client "
+ cnxn.getRemoteSocketAddress()
+ "; will be dropped if server is in r-o mode");
}
if (readOnly == false && this instanceof ReadOnlyZooKeeperServer) {
String msg = "Refusing session request for not-read-only client "
+ cnxn.getRemoteSocketAddress();
LOG.info(msg);
throw new CloseRequestException(msg);
}
if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid) {
String msg = "Refusing session request for client "
+ cnxn.getRemoteSocketAddress()
+ " as it has seen zxid 0x"
+ Long.toHexString(connReq.getLastZxidSeen())
+ " our last zxid is 0x"
+ Long.toHexString(getZKDatabase().getDataTreeLastProcessedZxid())
+ " client must try another server"; LOG.info(msg);
throw new CloseRequestException(msg);
}
int sessionTimeout = connReq.getTimeOut();
byte passwd[] = connReq.getPasswd();
int minSessionTimeout = getMinSessionTimeout();
if (sessionTimeout < minSessionTimeout) {
sessionTimeout = minSessionTimeout;
}
int maxSessionTimeout = getMaxSessionTimeout();
if (sessionTimeout > maxSessionTimeout) {
sessionTimeout = maxSessionTimeout;
}
cnxn.setSessionTimeout(sessionTimeout);
// We don't want to receive any packets until we are sure that the
// session is setup
cnxn.disableRecv();
long sessionId = connReq.getSessionId();
if (sessionId != 0) {
long clientSessionId = connReq.getSessionId();
LOG.info("Client attempting to renew session 0x"
+ Long.toHexString(clientSessionId)
+ " at " + cnxn.getRemoteSocketAddress());
serverCnxnFactory.closeSession(sessionId);
cnxn.setSessionId(sessionId);
reopenSession(cnxn, sessionId, passwd, sessionTimeout);
} else {
LOG.info("Client attempting to establish new session at "
+ cnxn.getRemoteSocketAddress());
//真正执行请求的地方 : 创建 session,提交请求
createSession(cnxn, passwd, sessionTimeout);
}
} long createSession(ServerCnxn cnxn, byte passwd[], int timeout) {
long sessionId = sessionTracker.createSession(timeout);
Random r = new Random(sessionId ^ superSecret);
r.nextBytes(passwd);
ByteBuffer to = ByteBuffer.allocate(4);
to.putInt(timeout);
cnxn.setSessionId(sessionId);
//提交请求
submitRequest(cnxn, sessionId, OpCode.createSession, 0, to, null);
return sessionId;
} /**
* @param cnxn
* @param sessionId
* @param xid
* @param bb
*/
private void submitRequest(ServerCnxn cnxn, long sessionId, int type,
int xid, ByteBuffer bb, List<Id> authInfo) {
Request si = new Request(cnxn, sessionId, xid, type, bb, authInfo);
submitRequest(si);
} public void submitRequest(Request si) {
if (firstProcessor == null) {
synchronized (this) {
try {
// Since all requests are passed to the request
// processor it should wait for setting up the request
// processor chain. The state will be updated to RUNNING
// after the setup.
// 在 startup 方法中直到初始化完成才得以接受请求
while (state == State.INITIAL) {
wait(1000);
}
} catch (InterruptedException e) {
LOG.warn("Unexpected interruption", e);
}
if (firstProcessor == null || state != State.RUNNING) {
throw new RuntimeException("Not started");
}
}
}
try {
//判断 session 是否还存活
touch(si.cnxn);
boolean validpacket = Request.isValid(si.type);
if (validpacket) {
//firstProcessor 开始执行
firstProcessor.processRequest(si);
if (si.cnxn != null) {
incInProcess();
}
} else {
LOG.warn("Received packet at server of unknown type " + si.type);
new UnimplementedRequestProcessor().processRequest(si);
}
} catch (MissingSessionException e) {
if (LOG.isDebugEnabled()) {
LOG.debug("Dropping request: " + e.getMessage());
}
} catch (RequestProcessorException e) {
LOG.error("Unable to process request:" + e.getMessage(), e);
}
}

  清楚了服务器与客户端的连接,知道了最终会到执行任务链上。

  我们可以连接任意一台 zk服务器,进行事务请求,当follower接受到请求后它就会转发给leader ,而follower 和 leader 都是使用责任链进行处理来自客户端的请求的。

  我们先来看一下follower 和 leader 责任链的流程。



请求处理流程

会话创建请求

  我们通过下面的图片先来了解整个过程



  概括起来就是 :

  • NIOServerCnxn 接受请求
  • 协商 sessionTimeout ,创建 connectRequest
  • 创建会话,生成 sessionId,注册会话和激活会话
  • 交给 leader 的 PrepRequestProcessor
  • 创建请求事务体 createSessionTxn
  • 交给 ProposalRequestProcessor,接下来就会进入三个子流程

      子流程如下 :
  • Sync流程,follower做好日志记录,同时返回 ACK 给leader
  • Proposal流程,生成proposal ,广播提议,获得半数票后,请求加入到 toBeApplied 队列,广播commit 信息
  • Commit流程,将请求交付给CommitProcessor 处理器,等待上阶段的投票结果,提交请求,交付给下一个处理器 : FinalRequestProcessor
  • 到此三个子流程走完后,到了最后的阶段,事务应用,之前我们的议案只是应用在日志中,并没有在内存中生效,这阶段需要将请求应用在内存中。

      这个就是整个会话创建请求的处理过程,当客户端发出事务请求时也是像处理会话创建一样的流程。

总结

  这一篇主要讲了zk 创建session的策略和管理的逻辑,同时介绍了处理事务的过程。

zookeeper 源码(二) session 和 处理事务请求的更多相关文章

  1. zookeeper源码 — 二、集群启动—leader选举

    上一篇介绍了zookeeper的单机启动,集群模式下启动和单机启动有相似的地方,但是也有各自的特点.集群模式的配置方式和单机模式也是不一样的,这一篇主要包含以下内容: 概念介绍:角色,服务器状态 服务 ...

  2. ZooKeeper源码阅读——client(二)

    原创技术文章,转载请注明:转自http://newliferen.github.io/ 如何连接ZooKeeper集群   要想了解ZooKeeper客户端实现原理,首先需要关注一下客户端的使用方式, ...

  3. Zookeeper 源码(二)序列化组件 Jute

    Zookeeper 源码(二)序列化组件 Jute 一.序列化组件 Jute 对于一个网络通信,首先需要解决的就是对数据的序列化和反序列化处理,在 ZooKeeper 中,使用了Jute 这一序列化组 ...

  4. zookeeper源码分析之五服务端(集群leader)处理请求流程

    leader的实现类为LeaderZooKeeperServer,它间接继承自标准ZookeeperServer.它规定了请求到达leader时需要经历的路径: PrepRequestProcesso ...

  5. zookeeper源码分析之四服务端(单机)处理请求流程

    上文: zookeeper源码分析之一服务端启动过程 中,我们介绍了zookeeper服务器的启动过程,其中单机是ZookeeperServer启动,集群使用QuorumPeer启动,那么这次我们分析 ...

  6. Zookeeper 源码(七)请求处理

    Zookeeper 源码(七)请求处理 以单机启动为例讲解 Zookeeper 是如何处理请求的.先回顾一下单机时的请求处理链. // 单机包含 3 个请求链:PrepRequestProcessor ...

  7. Zookeeper 源码(六)Leader-Follower-Observer

    Zookeeper 源码(六)Leader-Follower-Observer 上一节介绍了 Leader 选举的全过程,本节讲解一下 Leader-Follower-Observer 服务器的三种角 ...

  8. Zookeeper 源码(五)Leader 选举

    Zookeeper 源码(五)Leader 选举 前面学习了 Zookeeper 服务端的相关细节,其中对于集群启动而言,很重要的一部分就是 Leader 选举,接着就开始深入学习 Leader 选举 ...

  9. Zookeeper 源码(四)Zookeeper 服务端源码

    Zookeeper 源码(四)Zookeeper 服务端源码 Zookeeper 服务端的启动入口为 QuorumPeerMain public static void main(String[] a ...

  10. Zookeeper 源码(三)Zookeeper 客户端源码

    Zookeeper 源码(三)Zookeeper 客户端源码 Zookeeper 客户端主要有以下几个重要的组件.客户端会话创建可以分为三个阶段:一是初始化阶段.二是会话创建阶段.三是响应处理阶段. ...

随机推荐

  1. VS2017连接MySQL数据库

    vs默认无法直接连接mysql,需要我们自己配置环境. 1.下载mysql-installer-community-8.0.18.0.msi 下载地址:https://dev.mysql.com/do ...

  2. uevent机制

    uevent, user space event. 内核与用户空间的一种通信机制,基于netlink机制,主要用于设备驱动模型,例如热插拔. 1.调用/sbin/mdev的流程分析 在驱动程序中经常出 ...

  3. HITCON-Training-Writeup

    HITCON-Training-Writeup 原文链接M4x@10.0.0.55 项目地址M4x's github,欢迎star~ 更新时间5月16 复习一下二进制基础,写写HITCON-Train ...

  4. LitElement(三)Styles样式

    1.为组件添加样式 import { LitElement, css, html } from 'lit-element'; class MyElement extends LitElement { ...

  5. 三分钟快速上手TensorFlow 2.0 (中)——常用模块和模型的部署

    本文学习笔记参照来源:https://tf.wiki/zh/basic/basic.html 前文:三分钟快速上手TensorFlow 2.0 (上)——前置基础.模型建立与可视化 tf.train. ...

  6. Python_内置函数和匿名函数

    楔子 在讲新知识之前,我们先来复习复习函数的基础知识. 问:函数怎么调用? 函数名() 如果你们这么说...那你们就对了!好了记住这个事儿别给忘记了,咱们继续谈下一话题... 来你们在自己的环境里打印 ...

  7. Java之字符串替换replace()

    replace(char oldChar, char newChar)返回一个新的字符串,它是通过用 newChar 替换此字符串中出现的所有 oldChar 而生成的 import java.uti ...

  8. (转)json格式转换成javaBean对象的方法

    把json格式转换成javaBean才可以.于是查了一下资料,网上最多的资料就是下面的这种方式: Java code? 1 2 3 4 5 6 7 8 9 String str = "[{\ ...

  9. LCA 倍增算法模板

    . #include <cstring> #include <cstdio> #include <cstdlib> #include <algorithm&g ...

  10. Apache Kafka(十)Partitions与Replication Factor 调整准则

    Partitions与Replication Factor调整准则 Partition 数目与Replication Factor是在创建一个topic时非常重要的两个参数,这两个参数的取值会直接影响 ...