Reposted from: http://blog.csdn.net/delphiwcdj/article/details/18284429

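1 Background

The backend system had a single-threaded HTTP interface. To improve its concurrency, we ran it on multiple threads. Response times did improve after the change, but the server began crashing about once every 3 minutes. The cause was that the system used the curl-7.21.1 (August 11 2010) library, a version that is not thread safe. We replaced it with the latest release, curl-7.34.0 (December 12 2013), but frustratingly the server still crashed occasionally, once every few hours, so we went back and read the official documentation carefully.

The official documentation for the latest libcurl describes Multi-threading Issues as follows [1]: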

The first basic rule is that you must never simultaneously share a libcurl handle (be it easy or multi or whatever) between multiple threads. Only use one handle in one thread at any time. You can pass the handles around among threads, but you must never use a single handle from more than one thread at any given time.

libcurl is completely thread safe, except for two issues: signals and SSL/TLS handlers. Signals are used for timing out name resolves (during DNS lookup) - when built without c-ares support and not on Windows.

When using multiple threads you should set the CURLOPT_NOSIGNAL option to 1 for all handles. Everything will or might work fine except that timeouts are not honored during the DNS lookup - which you can work around by building libcurl with c-ares support. c-ares is a library that provides asynchronous name resolves. On some platforms, libcurl simply will not function properly multi-threaded unless this option is set.

Also, note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread-safe.

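This interface does not use SSL/TLS, but could signals be causing the crash? The documentation recommends setting the CURLOPT_NOSIGNAL option in multi-threaded programs, because something "bad" can happen when a DNS lookup times out. It also offers a remedy: build libcurl with c-ares [2] so that name resolution is asynchronous, which prevents the "bad" case. Even so, the last sentence warns that in multi-threaded programs "unexpected" things may happen if CURLOPT_NOSIGNAL is not set. From this description we could roughly guess that the crash came from not setting this option. The official description of the option follows [3]: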

CURLOPT_NOSIGNAL

Pass a long. If it is 1, libcurl will not use any functions that install signal handlers or any functions that cause signals to be sent to the process. This option is mainly here to allow multi-threaded unix applications to still set/use all timeout options etc, without risking getting signals. The default value for this parameter is 0. (Added in 7.10)

If this option is set and libcurl has been built with the standard name resolver, timeouts will not occur while the name resolve takes place. Consider building libcurl with c-ares support to enable asynchronous DNS lookups, which enables nice timeouts for name resolves without signals.

Setting CURLOPT_NOSIGNAL to 1 makes libcurl NOT ask the system to ignore SIGPIPE signals, which otherwise are sent by the system when trying to send data to a socket which is closed in the other end. libcurl makes an effort to never cause such SIGPIPEs to trigger, but some operating systems have no way to avoid them and even on those that have there are some corner cases when they may still happen, contrary to our desire. In addition, using CURLAUTH_NTLM_WB authentication could cause a SIGCHLD signal to be raised.

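In other words, CURLOPT_NOSIGNAL lets a multi-threaded program use the timeout options without libcurl installing signal handlers. The documentation also concedes, somewhat helplessly, that libcurl can only make a best effort to avoid signals: on some operating systems, and in a few corner cases, signals can still be raised. That means a small chance of a crash remains, which can only be verified on production machines.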
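Since the quoted text says that setting CURLOPT_NOSIGNAL to 1 stops libcurl from asking the system to ignore SIGPIPE, one common precaution is for the application to ignore SIGPIPE itself at startup. The following is a minimal sketch under that assumption. ignore_sigpipe is a hypothetical helper name, not part of libcurl, and it relies only on the POSIX sigaction call.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper (not a libcurl function): tell the kernel to ignore
 * SIGPIPE for the whole process, so that writing to a socket closed by the
 * peer fails with an error instead of terminating the program.
 * Returns 0 on success, -1 on failure (as sigaction does). */
int ignore_sigpipe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;   /* discard the signal */
    sigemptyset(&sa.sa_mask);  /* block no extra signals */
    return sigaction(SIGPIPE, &sa, NULL);
}
```

Calling ignore_sigpipe() once near the top of main(), before any worker thread is created, is one possible placement, because the signal disposition is shared by all threads of the process.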

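A closer look at the implementation of the backend interface showed that it does set the timeout options:

    curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, timeout);
    curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout);

The official descriptions of these two options are: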

CURLOPT_CONNECTTIMEOUT

Pass a long. It should contain the maximum time in seconds that you allow the connection to the server to take. This only limits the connection phase, once it has connected, this option is of no more use. Set to zero to switch to the default built-in connection timeout - 300 seconds. See also the CURLOPT_TIMEOUT option.

In unix-like systems, this might cause signals to be used unless CURLOPT_NOSIGNAL is set.

CURLOPT_TIMEOUT

Pass a long as parameter containing the maximum time in seconds that you allow the libcurl transfer operation to take. Normally, name lookups can take a considerable time and limiting operations to less than a few minutes risk aborting perfectly normal operations. This option will cause curl to use the SIGALRM to enable time-outing system calls.

In unix-like systems, this might cause signals to be used unless CURLOPT_NOSIGNAL is set.

Default timeout is 0 (zero) which means it never times out.

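So even though we had switched to the latest thread-safe libcurl, these two lines that set the timeout options cause signals to be used, which creates a thread-safety problem. That is why the crashes still occurred occasionally.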

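2 Remaining issue

The official Multi-threading Issues text does not mention any thread-safety problem with curl_global_init [4-5], but the interface description in curl_global_init(3) states that curl_global_init is not thread safe.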

This function sets up the program environment that libcurl needs. Think of it as an extension of the library loader.

This function must be called at least once within a program (a program is all the code that shares a memory space) before the program calls any other function in libcurl. The environment it sets up is constant for the life of the program and is the same for every program, so multiple calls have the same effect as one call.

The flags option is a bit pattern that tells libcurl exactly what features to init, as described below. Set the desired bits by ORing the values together. In normal operation, you must specify CURL_GLOBAL_ALL. Don't use any other value unless you are familiar with it and mean to control internal operations of libcurl.

This function is not thread safe. You must not call it when any other thread in the program (i.e. a thread sharing the same memory) is running. This doesn't just mean no other thread that is using libcurl. Because curl_global_init() calls functions of other libraries that are similarly thread unsafe, it could conflict with any other thread that uses these other libraries.

See the description in libcurl(3) of global environment requirements for details of how to use this function.

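Therefore, in a multi-threaded environment the program should explicitly call curl_global_init once at startup. Then, when a worker thread calls curl_easy_init() for each request, curl_easy_init() finds that curl_global_init has already been called and does not call it again, which reduces the chance of a conflict.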

3 A multi-threaded example from the official site

    /* A multi-threaded example that uses pthreads extensively to fetch
     * X remote files at once */

    #include <stdio.h>
    #include <pthread.h>
    #include <curl/curl.h>

    #define NUMT 4

    /*
      List of URLs to fetch.

      If you intend to use a SSL-based protocol here you MUST setup the OpenSSL
      callback functions as described here:

      http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION
    */
    const char * const urls[NUMT]= {
      "http://curl.haxx.se/",
      "ftp://cool.haxx.se/",
      "http://www.contactor.se/",
      "www.haxx.se"
    };

    static void *pull_one_url(void *url)
    {
      CURL *curl;

      curl = curl_easy_init();
      curl_easy_setopt(curl, CURLOPT_URL, url);
      curl_easy_perform(curl); /* ignores error */
      curl_easy_cleanup(curl);

      return NULL;
    }

    /*
      int pthread_create(pthread_t *new_thread_ID,
      const pthread_attr_t *attr,
      void * (*start_func)(void *), void *arg);
    */

    int main(int argc, char **argv)
    {
      pthread_t tid[NUMT];
      int i;
      int error;

      /* Must initialize libcurl before any threads are started */
      curl_global_init(CURL_GLOBAL_ALL);

      for(i=0; i< NUMT; i++) {
        error = pthread_create(&tid[i],
                               NULL, /* default attributes please */
                               pull_one_url,
                               (void *)urls[i]);
        if(0 != error)
          fprintf(stderr, "Couldn't run thread number %d, errno %d\n", i, error);
        else
          fprintf(stderr, "Thread %d, gets %s\n", i, urls[i]);
      }

      /* now wait for all threads to terminate */
      for(i=0; i< NUMT; i++) {
        error = pthread_join(tid[i], NULL);
        fprintf(stderr, "Thread %d terminated\n", i);
      }

      return 0;
    }

More examples: http://curl.haxx.se/libcurl/c/multithread.html

4 References

[1] http://curl.haxx.se/libcurl/c/libcurl-tutorial.html

[2] http://curl.haxx.se/mail/lib-2010-11/0188.html

[3] http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTNOSIGNAL

[4] http://curl.haxx.se/libcurl/c/curl_global_init.html

[5] http://code.lovemiao.com/?tag=multi-thread
