https://blog.myhro.info/2017/01/how-fast-are-unix-domain-sockets

Jan 3, 2017 • Tiago Ilieve

Warning: this is my first post written in English, after over five years writing only in Portuguese. After reading many technical articles written in English by non-native speakers, I’ve wondered: imagine how much information I would be missing if they had written those posts in French or Russian. Following their example, this blog can now reach a much wider audience too.

It has probably happened more than once: you ask your team how a reverse proxy should talk to the application backend server. “Unix sockets. They are faster.”, they’ll say. But how much faster will this communication be? And why is a Unix domain socket faster than an IP socket when multiple processes are talking to each other on the same machine? Before answering those questions, we should figure out what Unix sockets really are.

Unix sockets are a form of inter-process communication (IPC) that allows data exchange between processes on the same machine. They are special files, in the sense that they exist in a file system like a regular file (hence, they have an inode and metadata like ownership and permissions associated with them), but they are usually read and written using the recv() and send() syscalls instead of read() and write(). When binding and connecting to a Unix socket, we’ll be using file paths instead of IP addresses and ports.
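The "special file" part can be checked directly. The following is a minimal sketch, assuming Python 3 and a throwaway path (demo.sock inside a fresh temporary directory, a name made up for this illustration): it binds a Unix socket and then inspects the file that bind() left behind.

```python
import os
import socket
import stat
import tempfile

# Hypothetical path inside a fresh temporary directory, used only for this demo.
tmp_dir = tempfile.mkdtemp()
sock_path = os.path.join(tmp_dir, 'demo.sock')

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(sock_path)  # bind() creates the socket file on disk

# The socket file has an inode and the usual ownership/permission metadata.
info = os.stat(sock_path)
is_socket = stat.S_ISSOCK(info.st_mode)
print('is a socket file: {}'.format(is_socket))
print('inode: {}'.format(info.st_ino))
print('permissions: {}'.format(oct(stat.S_IMODE(info.st_mode))))
print('owner uid: {}'.format(info.st_uid))

sock.close()
os.unlink(sock_path)  # closing the socket does not remove the file
os.rmdir(tmp_dir)
```

The inode number, permissions and owner will differ from machine to machine, so no particular values should be expected.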

In order to determine how fast a Unix socket is compared to an IP socket, two proofs of concept (POCs) will be used. They are written in Python, which keeps them small and easy to understand. Their implementation details will be clarified when needed.

IP POC

ip_server.py

#!/usr/bin/env python

import socket

server_addr = '127.0.0.1'
server_port = 5000

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind((server_addr, server_port))
sock.listen(0)

print('Server ready.')

# Send one small message per connection, then hang up.
while True:
    conn, _ = sock.accept()
    conn.send(b'Hello there!')
    conn.close()

ip_client.py

#!/usr/bin/env python

import socket
import time

server_addr = '127.0.0.1'
server_port = 5000

duration = 1
end = time.time() + duration
msgs = 0

print('Receiving messages...')

# Open a new connection for every message until the time is up.
while time.time() < end:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((server_addr, server_port))
    data = sock.recv(32)
    msgs += 1
    sock.close()

print('Received {} messages in {} second(s).'.format(msgs, duration))

Unix domain socket POC

uds_server.py

#!/usr/bin/env python

import os
import socket

server_addr = '/tmp/uds_server.sock'

# Remove a socket file left behind by a previous run.
if os.path.exists(server_addr):
    os.unlink(server_addr)

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(server_addr)
sock.listen(0)

print('Server ready.')

while True:
    conn, _ = sock.accept()
    conn.send(b'Hello there!')
    conn.close()

uds_client.py

#!/usr/bin/env python

import socket
import time

server_addr = '/tmp/uds_server.sock'

duration = 1
end = time.time() + duration
msgs = 0

print('Receiving messages...')

# Open a new connection for every message until the time is up.
while time.time() < end:
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(server_addr)
    data = sock.recv(32)
    msgs += 1
    sock.close()

print('Received {} messages in {} second(s).'.format(msgs, duration))

As we can see from those code snippets, both implementations are as close to each other as possible. The differences between them are:

  • Their address family: socket.AF_INET (IP) and socket.AF_UNIX (Unix sockets).
  • Before binding with socket.AF_UNIX, the socket file has to be removed if it already exists; bind() will then create it again.
  • When using socket.AF_INET, the socket.SO_REUSEADDR flag has to be set in order to avoid socket.error: [Errno 98] Address already in use errors that may occur even when the socket is properly closed. This option tells the kernel to reuse the same port if there are connections in the TIME_WAIT state.
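The reason for the os.unlink() call in uds_server.py can be reproduced in a few lines. This is a minimal sketch, assuming Python 3; bind_unix() and the stale.sock path are hypothetical names introduced only for this illustration. It binds once, "forgets" to clean up, and then tries to bind to the same path again.

```python
import errno
import os
import socket
import tempfile


def bind_unix(path):
    """Bind a new AF_UNIX stream socket to path and return it (illustrative helper)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError:
        sock.close()
        raise
    return sock


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'stale.sock')

# First "server run": bind, then exit without removing the socket file.
first = bind_unix(path)
first.close()

# Second run: the leftover file makes bind() fail.
try:
    bind_unix(path)
    rebind_error = None
except OSError as exc:
    rebind_error = exc.errno

print('bind over stale file failed with EADDRINUSE: {}'.format(
    rebind_error == errno.EADDRINUSE))

# Removing the file first, as uds_server.py does, lets bind() succeed.
os.unlink(path)
second = bind_unix(path)
second.close()
os.unlink(path)
os.rmdir(tmp_dir)
```

A longer-lived server would probably also want to remove the socket file on shutdown, but the check at startup is what protects against a previous run that crashed.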

Both POCs were executed on a Core i3 laptop running Ubuntu 16.04 (Xenial) with the stock kernel. Nothing is printed at each loop iteration, to avoid the huge performance penalty of writing to the screen. Let’s take a look at their performance.

IP POC

First terminal:

$ python ip_server.py
Server ready.

Second terminal:

$ python ip_client.py
Receiving messages...
Received 10159 messages in 1 second(s).

Unix domain socket POC

First terminal:

$ python uds_server.py
Server ready.

Second terminal:

$ python uds_client.py
Receiving messages...
Received 22067 messages in 1 second(s).

The Unix socket implementation can send and receive more than twice the number of messages, over the course of a second, when compared to the IP one. Across multiple runs this proportion is consistent, with both results varying by around 10% either way. Now that we have figured out their performance difference, let’s find out why Unix sockets are so much faster.
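To put those two numbers in perspective, the short calculation below turns them into a ratio and an average time per message. It only uses the counts reported above. Keep in mind that each "message" in these POCs includes opening and closing a connection, so the per-message time is really the cost of a whole connect, receive and close cycle.

```python
# Message counts reported by the two client runs above.
ip_msgs = 10159
uds_msgs = 22067
duration = 1.0  # seconds, as set in both clients

ratio = uds_msgs / ip_msgs
ip_us = duration / ip_msgs * 1e6    # average microseconds per message over IP
uds_us = duration / uds_msgs * 1e6  # average microseconds per message over a Unix socket

print('Unix sockets handled {:.2f}x as many messages'.format(ratio))
print('IP:   {:.1f} us per message'.format(ip_us))
print('Unix: {:.1f} us per message'.format(uds_us))
```

That works out to roughly 2.17 times as many messages, or about 98 microseconds per message over IP against about 45 microseconds over a Unix socket on this particular laptop.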

It’s important to notice that both the IP and the Unix socket implementations use the same stream socket type (socket.SOCK_STREAM), so the answer isn’t related to how a stream protocol like TCP performs in comparison to a datagram protocol like UDP, for instance (see update 1). What happens is that when Unix sockets are used, the entire IP stack of the operating system is bypassed. No headers are added, no checksums are calculated (see update 2), no packets are encapsulated or decapsulated and no routing is performed. Although those tasks are performed really fast by the OS, there is still a visible difference when doing benchmarks like this one.
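One way to probe whether a SOCK_STREAM socket is actually backed by TCP is to ask it for a TCP-level option. The sketch below, assuming Python 3, does that for both address families; accepts_tcp_option() is a hypothetical helper written for this illustration. An AF_INET stream socket should answer the query. How an AF_UNIX socket reacts depends on the platform, so the code only reports what it observes instead of assuming a result.

```python
import socket


def accepts_tcp_option(family):
    """Return True if a SOCK_STREAM socket of this family answers a TCP-level getsockopt()."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
        return True
    except OSError:
        # The socket rejected the TCP-level query.
        return False
    finally:
        sock.close()


results = {
    'AF_INET': accepts_tcp_option(socket.AF_INET),
    'AF_UNIX': accepts_tcp_option(socket.AF_UNIX),
}
for name, ok in sorted(results.items()):
    print('{} + SOCK_STREAM answers TCP_NODELAY: {}'.format(name, ok))
```

If the AF_UNIX line reports False, that is a hint that the stream socket is not running TCP underneath, even though both sockets were created with the same socket type.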

There’s plenty of room for real-world comparisons beyond the synthetic measurement demonstrated here. What will the throughput difference be when a reverse proxy like nginx is communicating with a Gunicorn backend server using IP or Unix sockets? Will it impact latency as well? What about transferring big chunks of data, like huge binary files, instead of small messages? Can Unix sockets be used to avoid Docker network overhead when forwarding ports from the host to a container?

Updates:

  1. John-Mark Gurney and Justin Cormack pointed out that SOCK_STREAM doesn’t mean TCP under Unix domain sockets. This makes sense, but I couldn’t find any reference confirming or denying it.
  2. Justin Cormack also mentioned that there’s no checksumming on local interfaces by default. Looking at the source code of the Linux loopback driver, this seems to have been present in the kernel since version 2.6.12-rc2.
