即使关闭nagle算法,也不能解决粘包问题

https://waylau.com/netty-4-user-guide/Getting%20Started/Dealing%20with%20a%20Stream%20based%20Transport.html

Dealing with a Stream-based Transport 处理一个基于流的传输

One Small Caveat of Socket Buffer 关于 Socket Buffer的一个小警告

基于流的传输比如 TCP/IP, 接收到数据是存在 socket 接收的 buffer 中。不幸的是,基于流的传输并不是一个数据包队列,而是一个字节队列。意味着,即使你发送了2个独立的数据包,操作系统也不会作为2个消息处理而仅仅是作为一连串的字节而言。因此这是不能保证你远程写入的数据就会准确地读取。举个例子,让我们假设操作系统的 TCP/TP 协议栈已经接收了3个数据包:

由于基于流传输的协议的这种普通的性质,在你的应用程序里读取数据的时候会有很高的可能性被分成下面的片段

因此,一个接收方不管他是客户端还是服务端,都应该把接收到的数据整理成一个或者多个更有意思并且能够让程序的业务逻辑更好理解的数据。在上面的例子中,接收到的数据应该被构造成下面的格式:

https://netty.io/wiki/user-guide-for-4.x.html

Dealing with a Stream-based Transport

One Small Caveat of Socket Buffer

In a stream-based transport such as TCP/IP, received data is stored into a socket receive buffer. Unfortunately, the buffer of a stream-based transport is not a queue of packets but a queue of bytes. It means, even if you sent two messages as two independent packets, an operating system will not treat them as two messages but as just a bunch of bytes. Therefore, there is no guarantee that what you read is exactly what your remote peer wrote. For example, let us assume that the TCP/IP stack of an operating system has received three packets:

Because of this general property of a stream-based protocol, there's high chance of reading them in the following fragmented form in your application:

Therefore, a receiving part, regardless it is server-side or client-side, should defrag the received data into one or more meaningful frames that could be easily understood by the application logic. In case of the example above, the received data should be framed like the following:

The First Solution

Now let us get back to the TIME client example. We have the same problem here. A 32-bit integer is a very small amount of data, and it is not likely to be fragmented often. However, the problem is that it can be fragmented, and the possibility of fragmentation will increase as the traffic increases.

The simplistic solution is to create an internal cumulative buffer and wait until all 4 bytes are received into the internal buffer. The following is the modified TimeClientHandler implementation that fixes the problem:

package io.netty.example.time;

import java.util.Date;

public class TimeClientHandler extends ChannelInboundHandlerAdapter {
private ByteBuf buf; @Override
public void handlerAdded(ChannelHandlerContext ctx) {
buf = ctx.alloc().buffer(4); // (1)
} @Override
public void handlerRemoved(ChannelHandlerContext ctx) {
buf.release(); // (1)
buf = null;
} @Override
public void channelRead(ChannelHandlerContext ctx, Object msg) {
ByteBuf m = (ByteBuf) msg;
buf.writeBytes(m); // (2)
m.release(); if (buf.readableBytes() >= 4) { // (3)
long currentTimeMillis = (buf.readUnsignedInt() - 2208988800L) * 1000L;
System.out.println(new Date(currentTimeMillis));
ctx.close();
}
} @Override
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
cause.printStackTrace();
ctx.close();
}
}
  1. ChannelHandler has two life cycle listener methods: handlerAdded() and handlerRemoved(). You can perform an arbitrary (de)initialization task as long as it does not block for a long time.
  2. First, all received data should be cumulated into buf.
  3. And then, the handler must check if buf has enough data, 4 bytes in this example, and proceed to the actual business logic. Otherwise, Netty will call the channelRead() method again when more data arrives, and eventually all 4 bytes will be cumulated.

The Second Solution

Although the first solution has resolved the problem with the TIME client, the modified handler does not look that clean. Imagine a more complicated protocol which is composed of multiple fields such as a variable length field. Your ChannelInboundHandler implementation will become unmaintainable very quickly.

As you may have noticed, you can add more than one ChannelHandler to a ChannelPipeline, and therefore, you can split one monolithic ChannelHandler into multiple modular ones to reduce the complexity of your application. For example, you could split TimeClientHandler into two handlers:

  • TimeDecoder which deals with the fragmentation issue, and
  • the initial simple version of TimeClientHandler.

Fortunately, Netty provides an extensible class which helps you write the first one out of the box:

package io.netty.example.time;

public class TimeDecoder extends ByteToMessageDecoder { // (1)
@Override
protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) { // (2)
if (in.readableBytes() < 4) {
return; // (3)
} out.add(in.readBytes(4)); // (4)
}
}
  1. ByteToMessageDecoder is an implementation of ChannelInboundHandler which makes it easy to deal with the fragmentation issue.
  2. ByteToMessageDecoder calls the decode() method with an internally maintained cumulative buffer whenever new data is received.
  3. decode() can decide to add nothing to out where there is not enough data in the cumulative buffer. ByteToMessageDecoder will call decode() again when there is more data received.
  4. If decode() adds an object to out, it means the decoder decoded a message successfully. ByteToMessageDecoder will discard the read part of the cumulative buffer. Please remember that you don't need to decode multiple messages. ByteToMessageDecoder will keep calling the decode() method until it adds nothing to out.

Now that we have another handler to insert into the ChannelPipeline, we should modify the ChannelInitializer implementation in the TimeClient:

b.handler(new ChannelInitializer<SocketChannel>() {
@Override
public void initChannel(SocketChannel ch) throws Exception {
ch.pipeline().addLast(new TimeDecoder(), new TimeClientHandler());
}
});

If you are an adventurous person, you might want to try the ReplayingDecoder which simplifies the decoder even more. You will need to consult the API reference for more information though.

public class TimeDecoder extends ReplayingDecoder<Void> {
@Override
protected void decode(
ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
out.add(in.readBytes(4));
}
}

Additionally, Netty provides out-of-the-box decoders which enables you to implement most protocols very easily and helps you avoid from ending up with a monolithic unmaintainable handler implementation. Please refer to the following packages for more detailed examples:

Dealing with a Stream-based Transport 处理一个基于流的传输 粘包 即使关闭nagle算法,也不能解决粘包问题的更多相关文章

  1. Raknet是一个基于UDP网络传输协议的C++网络库(还有一些其它库,比如nanomsg,fastsocket等等)

    Raknet是一个基于UDP网络传输协议的C++网络库,允许程序员在他们自己的程序中实现高效的网络传输服务.通常情况下用于游戏,但也可以用于其它项目. Raknet有以下好处: 高性能 在同一台计算机 ...

  2. CXF 入门:创建一个基于WS-Security标准的安全验证(CXF回调函数使用,)

    http://jyao.iteye.com/blog/1346547 注意:以下客户端调用代码中获取服务端ws实例,都是通过CXF 入门: 远程接口调用方式实现 直入正题! 以下是服务端配置 ==== ...

  3. SQL Standard Based Hive Authorization(基于SQL标准的Hive授权)

    说明:该文档翻译/整理于Hive官方文档https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authori ...

  4. 公布一个基于 Reactor 模式的 C++ 网络库

    公布一个基于 Reactor 模式的 C++ 网络库 陈硕 (giantchen_AT_gmail) Blog.csdn.net/Solstice 2010 Aug 30 本文主要介绍 muduo 网 ...

  5. ChatGirl 一个基于 TensorFlow Seq2Seq 模型的聊天机器人[中文文档]

    ChatGirl 一个基于 TensorFlow Seq2Seq 模型的聊天机器人[中文文档] 简介 简单地说就是该有的都有了,但是总体跑起来效果还不好. 还在开发中,它工作的效果还不好.但是你可以直 ...

  6. 一个基于TCP/IP的服务器与客户端通讯的小项目(超详细版)

    1.目的:实现客户端向服务器发送数据 原理: 2.建立两个控制台应用,一个为服务器,用于接收数据.一个为客户端,用于发送数据. 关键类与对应方法: 1)类IPEndPoint: 1.是抽象类EndPo ...

  7. 详解:基于WEB API实现批量文件由一个服务器同步快速传输到其它多个服务器功能

    文件同步传输工具比较多,传输的方式也比较多,比如:FTP.共享.HTTP等,我这里要讲的就是基于HTTP协议的WEB API实现批量文件由一个服务器同步快速传输到其它多个服务器这样的一个工具(简称:一 ...

  8. 一个基于mysql构建的队列表

    通常大家都会使用redis作为应用的任务队列表,redis的List结构,在一段进行任务的插入,在另一端进行任务的提取. 任务的插入 $redis->lPush("key:task:l ...

  9. psutil一个基于python的跨平台系统信息跟踪模块

    受益于这个模块的帮助,在这里我推荐一手. https://pythonhosted.org/psutil/#processes psutil是一个基于python的跨平台系统信息监视模块.在pytho ...

随机推荐

  1. View坐标系详解(getTop(),getLeft(),getX(),getY(),getLocationOnScreen(), getLocationInWindow())

    View 提供了如下 5 种方法获取 View 的坐标:1. View.getTop().View.getLeft().View.getBottom().View.getRight();2. View ...

  2. python学习之winreg模块

    winreg模块将Windows注册表API暴露给了python. 常见方法和属性 winreg.OpenKey(key,sub_key,reserved = ,access = KEY_READ) ...

  3. 偏于SQL语句的 sqlAlchemy 增删改查操作

    ORM 江湖 曾几何时,程序员因为惧怕SQL而在开发的时候小心翼翼的写着sql,心中总是少不了恐慌,万一不小心sql语句出错,搞坏了数据库怎么办?又或者为了获取一些数据,什么内外左右连接,函数存储过程 ...

  4. hive优化总结

    一.表设计 合理分表 合理设计表分区,静态分区.动态分区 二.扫描相关 1.谓词下推(Predicate Push Down) 2.列裁剪(Column Pruning) 在读数据的时候,只关心感兴趣 ...

  5. ldap 使用 问题参考

    Q2.ldapsearch查询一个有30000多条记录时出现:Size limit exceeded 4 A2:服务器端配置文件有sizelimit 1000的限制!用管理员身份查询-D"c ...

  6. Hive入门笔记---1.Hive简单介绍

    1. Hive是什么 Hive是基于Hadoop的数据仓库解决方案.由于Hadoop本身在数据存储和计算方面有很好的可扩展性和高容错性,因此使用Hive构建的数据仓库也秉承了这些特性.这是来自官方的解 ...

  7. Linux网络编程wait()和waitpid()的讲解

    本文讲的是关于wait和waitpid两者的区别与联系.为避免僵尸进程的产生,无论我们什么时候创建子进程时,主进程都需要等待子进程返回,以便对子进程进行清理.为此,我们在服务器程序中添加SIGCHLD ...

  8. sqlite笔记(akaedu)

    1.创建sql表create table student(id integer primary key, name text, score integer): 2.插入一条记录insert into ...

  9. 回文自动机 + DFS --- The 2014 ACM-ICPC Asia Xi’an Regional Contest Problem G.The Problem to Slow Down You

    The Problem to Slow Down You Problem's Link: http://acm.hust.edu.cn/vjudge/problem/viewProblem.actio ...

  10. Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

    作者提出的方法是Algotithm 2.简单来说就是,训练的时候,在几个模型中,选取预测最准确的(也就是loss最低的)模型进行权重更新.