Which Queue Pair type to use?

    
When writing a new RDMA application (just as when writing a new application over sockets), one must decide which QP type to work with.

In this post, I will describe in detail the characteristics of each transport type.

In RDMA, there are several QP types. Each can be described by a two-letter name, XY:

X can be:
Reliable: There is a guarantee that messages are delivered at most once, in order and without corruption.
Unreliable: There is no guarantee that messages will be delivered, nor any guarantee about the order of the packets.

In RDMA, every packet has a CRC, and corrupted packets are dropped (for any transport type). The reliability of a QP transport type refers to the reliability of whole messages.

Y can be:
Connected: one QP sends to and receives from exactly one QP
Unconnected (Datagram): one QP sends to and receives from any QP

The following mechanisms are used in RDMA:
* CRC: The CRC field validates that packets weren't corrupted along the path.

* PSN: The Packet Sequence Number makes sure that packets are received in order. This helps detect missing and duplicated packets.

* Acknowledgement: (only in RC QP) Only after a message has been written successfully on the responder side is an ack packet sent back to the requester. If the requester doesn't receive an ack, it resends the message according to the QP's attributes. If no ack (or nack) ever arrives from the remote QP, the requester reports an error (retry exceeded).
If there is any kind of error on the responder side (protection, resources, etc.), a nack is sent to the requester, which reports that there is an error.
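
To illustrate how a sequence number lets a receiver spot gaps and duplicates, here is a minimal sketch that classifies an incoming PSN against the next expected one. It is a simplified model rather than the device's actual logic: the names `classify_psn` and `next_psn`, and the half-window rule used to tell "behind" from "ahead", are assumptions made for this example. The 24-bit width matches the PSN field of InfiniBand packets.

```python
PSN_BITS = 24                      # the PSN is a 24-bit counter that wraps around
PSN_MASK = (1 << PSN_BITS) - 1
HALF_WINDOW = 1 << (PSN_BITS - 1)  # assumption: anything further than this is "behind"


def next_psn(psn):
    """Return the PSN that follows `psn`, wrapping at 24 bits."""
    return (psn + 1) & PSN_MASK


def classify_psn(expected, received):
    """Classify a received PSN relative to the next expected PSN.

    Returns 'in_order', 'duplicate' (already seen) or 'missing'
    (at least one earlier packet never arrived).
    """
    delta = (received - expected) & PSN_MASK
    if delta == 0:
        return "in_order"
    if delta >= HALF_WINDOW:
        return "duplicate"
    return "missing"
```

For example, with an expected PSN of 5, an incoming PSN of 4 is classified as a duplicate and an incoming PSN of 7 as a sign that packets 5 and 6 are missing.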

Reliable Connected (RC) QP

One RC QP is connected to (i.e. sends messages to and receives messages from) exactly one other RC QP in a reliable way. It is guaranteed that messages are delivered from a requester to a responder at most once, in order and without corruption. The maximum supported message size is 2 GB (this value may be lower, depending on the RDMA device's attributes). An RC QP supports Send operations (with or without immediate), RDMA Write operations (with or without immediate), RDMA Read operations and Atomic operations (subject to the RDMA device's level of support for atomic operations).

If a message is bigger than the path MTU, it is fragmented on the sending side and reassembled on the receiving side.
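
As a rough worked example of this fragmentation, the sketch below computes how many packets a message occupies and how large each payload is. The function names are hypothetical, the list of MTU values is the set of InfiniBand path MTUs, and treating a zero-length message as a single packet is an assumption of this model.

```python
VALID_PATH_MTUS = (256, 512, 1024, 2048, 4096)  # InfiniBand path MTU values, in bytes


def packets_for_message(message_bytes, path_mtu):
    """Number of packets needed to carry a message over a given path MTU."""
    if path_mtu not in VALID_PATH_MTUS:
        raise ValueError("unsupported path MTU: %r" % path_mtu)
    if message_bytes < 0:
        raise ValueError("message size cannot be negative")
    # Ceiling division; assumption: an empty message still uses one packet.
    return max(1, -(-message_bytes // path_mtu))


def payload_sizes(message_bytes, path_mtu):
    """Payload size of each packet, in the order the sender emits them."""
    count = packets_for_message(message_bytes, path_mtu)
    sizes = [path_mtu] * (count - 1)
    sizes.append(message_bytes - path_mtu * (count - 1))
    return sizes
```

A 10000-byte message over a 4096-byte path MTU is carried in three packets: two full ones and a final packet with the remaining 1808 bytes.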

The requester considers a message operation complete once it receives an ack from the responder side indicating that the message was read from or written to the responder's memory.

The responder considers a message operation complete once the message has been read from or written to its (local) memory.
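
The requester-side rule (complete only once an ack arrives, otherwise resend and eventually give up) can be modelled with a short sketch. This is only an illustration of the idea: `send_with_retries` and its `transmit` callback are hypothetical names, and the model assumes one initial attempt followed by up to `retry_cnt` resends, with `transmit` returning `True` when an ack arrived before the timeout.

```python
def send_with_retries(transmit, retry_cnt):
    """Model of a reliable send: try once, then resend up to retry_cnt times.

    `transmit` is a hypothetical callback that sends the message and
    returns True if an ack came back before the timeout expired.
    Returns a (status, attempts) tuple.
    """
    attempts = 0
    for _ in range(1 + retry_cnt):
        attempts += 1
        if transmit():
            # Ack received: the requester may now complete the operation.
            return "success", attempts
    # No ack after the initial attempt and all resends.
    return "retry_exceeded", attempts
```

In a real application the timeout and retry count are attributes of the QP, and the device performs the resends itself; the application only sees the final completion status.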

Unreliable Connected (UC) QP

One UC QP is connected to (i.e. sends messages to and receives messages from) exactly one other UC QP in an unreliable way. There isn't any guarantee that messages will be received by the other side: corrupted or out-of-sequence packets are silently dropped. If a packet is dropped, the whole message that it belongs to is dropped. In this case, the responder doesn't stop; it continues to receive incoming packets. There isn't any guarantee about packet ordering. The maximum supported message size is 2 GB (this value may be lower, depending on the RDMA device's attributes). A UC QP supports Send operations (with or without immediate) and RDMA Write operations (with or without immediate).

If a message is bigger than the path MTU, it is fragmented on the sending side and reassembled on the receiving side.

The requester considers a message operation complete once the whole message has been sent to the fabric.

The responder considers a message operation complete once it has received a complete message in the correct sequence and has written the data to its (local) memory.
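
The "drop the whole message, but keep receiving" behaviour of a UC responder can be sketched as below. This is a simplified model under stated assumptions, not the device's implementation: the `Packet` class and `uc_receive` function are hypothetical, each packet is marked as the first and/or last packet of its message, and PSN wrap-around is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    psn: int        # packet sequence number
    first: bool     # True for the first packet of a message
    last: bool      # True for the last packet of a message
    payload: bytes


def uc_receive(packets, expected_psn=0):
    """Return the messages a UC-like responder would deliver."""
    delivered = []
    current = None  # chunks of the message being assembled, or None
    for pkt in packets:
        if pkt.psn != expected_psn:
            current = None            # sequence gap: drop the partial message
        expected_psn = pkt.psn + 1    # keep receiving from this packet on
        if pkt.first:
            current = [pkt.payload]   # a new message starts here
        elif current is not None:
            current.append(pkt.payload)
        else:
            continue                  # remainder of a dropped message
        if pkt.last:
            delivered.append(b"".join(current))
            current = None
    return delivered
```

If the middle packet of a three-packet message is lost, that message is discarded, while a following single-packet message is still delivered.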

Unreliable Datagram (UD) QP

One UD QP can send messages to and receive messages from any other UD QP, either unicast (one to one) or multicast (one to many), in an unreliable way. There isn't any guarantee that messages will be received by the other side: corrupted or out-of-sequence packets are silently dropped. There isn't any guarantee about packet ordering. The maximum supported message size is the maximum path MTU. A UD QP supports only Send operations.

The requester considers a message operation complete once the (one-packet) message has been sent to the fabric.

The responder considers a message operation complete once it has received a complete message and has written the data to its (local) memory.

Choosing the right QP type

Choosing the right QP type is critical to the correctness and scalability of an application.

RC QP should be chosen if:

1. Reliability by the fabric is needed
2. The fabric isn't big, or the cluster is big but not all nodes send traffic to the same node (one victim)

Possible uses for an RC QP include FTP over RDMA or a file system over RDMA.

UC QP should be chosen if:

1. Reliability by the fabric isn't needed (i.e. reliability isn't important at all, or it is taken care of by the application)
2. The fabric isn't big, or the cluster is big but not all nodes send traffic to the same node (one victim)
3. Big messages (larger than the path MTU) are sent

One use for a UC QP can be: video over RDMA.

UD QP should be chosen if:

1. Reliability by the fabric isn't needed (i.e. reliability isn't important at all, or it is taken care of by the application)
2. The fabric is big and every node sends messages to any other node in the fabric. UD is one of the best solutions for scalability problems.
3. Multicast messages are needed
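
The scalability argument can be made concrete with a little arithmetic. The sketch below assumes one communicating endpoint per node and exactly one QP per peer for the connected types; the function names are hypothetical, and real applications may create more QPs than this.

```python
def qps_per_endpoint(num_endpoints, connected):
    """QPs one endpoint needs in order to talk to every other endpoint.

    A connected QP (RC or UC) talks to exactly one remote QP, so one QP
    per peer is needed. A datagram QP (UD) can reach any peer.
    """
    if num_endpoints < 1:
        raise ValueError("need at least one endpoint")
    return num_endpoints - 1 if connected else 1


def qps_in_fabric(num_endpoints, connected):
    """Total QPs across the fabric for all-to-all communication."""
    return num_endpoints * qps_per_endpoint(num_endpoints, connected)
```

With 1000 endpoints talking all-to-all, connected QPs require 999 QPs per endpoint (999000 in total), while UD requires a single QP per endpoint.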

One use for a UD QP can be: voice over RDMA.
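
The selection criteria above can be condensed into a small helper. This is only one possible reading of those criteria, not a definitive rule: the function `choose_qp_type` and its parameter names are hypothetical, and it deliberately ignores considerations such as message size.

```python
def choose_qp_type(need_fabric_reliability, need_multicast=False,
                   large_all_to_all=False):
    """Suggest 'RC', 'UC' or 'UD' from the criteria listed in this post."""
    if need_fabric_reliability:
        if need_multicast:
            # Multicast is available only on UD, which is unreliable.
            raise ValueError("no QP type here offers reliable multicast")
        # In a big all-to-all fabric RC still works, but scales poorly.
        return "RC"
    if need_multicast or large_all_to_all:
        return "UD"
    return "UC"
```

An application that handles reliability itself and runs all-to-all on a big fabric would get "UD" here; one that simply needs the fabric to guarantee delivery would get "RC".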

Summary

The following table describes the characteristics of each QP Transport Service Type:

| Metric | UD | UC | RC |
|---|---|---|---|
| Opcode: SEND (with or without immediate) | Supported | Supported | Supported |
| Opcode: RDMA Write (with or without immediate) | Not supported | Supported | Supported |
| Opcode: RDMA Read | Not supported | Not supported | Supported |
| Opcode: Atomic operations | Not supported | Not supported | Supported |
| Reliability | No | No | Yes |
| Connection type | Datagram (one to any/many) | Connected (one to one) | Connected (one to one) |
| Maximum message size | Maximum path MTU | 2 GB | 2 GB |
| Multicast | Supported | Not supported | Not supported |
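
The opcode rows of this summary can also be written as a lookup table, which is handy when checking which QP types can serve a given set of operations. The opcode labels and the names `SUPPORTED_OPCODES`, `supports` and `qp_types_supporting` are made up for this sketch; they are not identifiers from any RDMA library.

```python
# Opcode support per QP type, copied from the summary table.
SUPPORTED_OPCODES = {
    "UD": {"SEND"},
    "UC": {"SEND", "RDMA_WRITE"},
    "RC": {"SEND", "RDMA_WRITE", "RDMA_READ", "ATOMIC"},
}


def supports(qp_type, opcode):
    """True if the given QP type supports the given opcode."""
    return opcode in SUPPORTED_OPCODES[qp_type]


def qp_types_supporting(opcodes):
    """QP types (in UD, UC, RC order) that support every requested opcode."""
    wanted = set(opcodes)
    return [t for t in ("UD", "UC", "RC") if wanted <= SUPPORTED_OPCODES[t]]
```

For instance, an application that needs RDMA Read is limited to RC, while one that only sends messages can use any of the three types.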

Written by: Dotan Barak on June 1, 2013. Last updated on January 11, 2019.
