Reposted from: http://blog.csdn.net/delphiwcdj/article/details/18284429

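1 Background

A back-end system had a single-threaded HTTP interface. To improve its concurrency, we changed it to run on multiple threads. Response times did improve, but the server started crashing about once every 3 minutes. The cause was that the system used the curl-7.21.1 (August 11 2010) library, a version that is not thread-safe. We replaced it with the then-latest curl-7.34.0 (December 12 2013), but frustratingly the server still crashed occasionally, every few hours. So I went back and read the official documentation carefully.

The official documentation for the latest libcurl explains Multi-threading Issues as follows [1]: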

The first basic rule is that you must never simultaneously share a libcurl handle (be it easy or multi or whatever) between multiple threads. Only use one handle in one thread at any time. You can pass the handles around among threads, but you must never use a single handle from more than one thread at any given time.

libcurl is completely thread safe, except for two issues: signals and SSL/TLS handlers. Signals are used for timing out name resolves (during DNS lookup) - when built without c-ares support and not on Windows.

When using multiple threads you should set the CURLOPT_NOSIGNAL option to 1 for all handles. Everything will or might work fine except that timeouts are not honored during the DNS lookup - which you can work around by building libcurl with c-ares support. c-ares is a library that provides asynchronous name resolves. On some platforms, libcurl simply will not function properly multi-threaded unless this option is set.

Also, note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread-safe.

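This interface does not use SSL/TLS, but could the crash be caused by signals? The documentation recommends setting the CURLOPT_NOSIGNAL option in multi-threaded programs, because otherwise "bad" things can happen when a DNS lookup times out. It also offers a remedy: build libcurl with c-ares [2], which resolves names asynchronously, to prevent such situations. Even so, the last sentence warns that in multi-threaded programs "unexpected" things may happen if CURLOPT_NOSIGNAL is not set. From this description, a reasonable guess is that the crash was caused by not setting this option. The official description of the option follows [3]: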

CURLOPT_NOSIGNAL

Pass a long. If it is 1, libcurl will not use any functions that install signal handlers or any functions that cause signals to be sent to the process. This option is mainly here to allow multi-threaded unix applications to still set/use all timeout options etc, without risking getting signals. The default value for this parameter is 0. (Added in 7.10)

If this option is set and libcurl has been built with the standard name resolver, timeouts will not occur while the name resolve takes place. Consider building libcurl with c-ares support to enable asynchronous DNS lookups, which enables nice timeouts for name resolves without signals.

Setting CURLOPT_NOSIGNAL to 1 makes libcurl NOT ask the system to ignore SIGPIPE signals, which otherwise are sent by the system when trying to send data to a socket which is closed in the other end. libcurl makes an effort to never cause such SIGPIPEs to trigger, but some operating systems have no way to avoid them and even on those that have there are some corner cases when they may still happen, contrary to our desire. In addition, using CURLAUTH_NTLM_WB authentication could cause a SIGCHLD signal to be raised.

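In other words, CURLOPT_NOSIGNAL lets a multi-threaded program use the timeout options without libcurl installing signal handlers or causing signals to be sent. The documentation does concede, somewhat helplessly, that libcurl can only try to avoid signals: on some operating systems, or in rare corner cases, signals such as SIGPIPE can still be delivered. So a small probability of crashes may remain, and that can only be verified on production machines.

Looking closely at the implementation of the back-end interface, it does indeed contain code that sets timeout options: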

curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, timeout);
curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout);

The official descriptions of these two options are:

CURLOPT_CONNECTTIMEOUT

Pass a long. It should contain the maximum time in seconds that you allow the connection to the server to take. This only limits the connection phase, once it has connected, this option is of no more use. Set to zero to switch to the default built-in connection timeout - 300 seconds. See also the CURLOPT_TIMEOUT option.

In unix-like systems, this might cause signals to be used unless CURLOPT_NOSIGNAL is set.

CURLOPT_TIMEOUT

Pass a long as parameter containing the maximum time in seconds that you allow the libcurl transfer operation to take. Normally, name lookups can take a considerable time and limiting operations to less than a few minutes risk aborting perfectly normal operations. This option will cause curl to use the SIGALRM to enable time-outing system calls.

In unix-like systems, this might cause signals to be used unless CURLOPT_NOSIGNAL is set.

Default timeout is 0 (zero) which means it never times out.

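So although the library was replaced with the latest thread-safe libcurl, these two lines that set timeout options cause signals to be used, which is a thread-safety problem, and the occasional crashes remained.

2 Open issues

The official Multi-threading Issues description does not mention any thread-safety problem with curl_global_init [4-5], but the curl_global_init(3) man page states that curl_global_init is not thread-safe: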

This function sets up the program environment that libcurl needs. Think of it as an extension of the library loader.

This function must be called at least once within a program (a program is all the code that shares a memory space) before the program calls any other function in libcurl. The environment it sets up is constant for the life of the program and is the same for every program, so multiple calls have the same effect as one call.

The flags option is a bit pattern that tells libcurl exactly what features to init, as described below. Set the desired bits by ORing the values together. In normal operation, you must specify CURL_GLOBAL_ALL. Don't use any other value unless you are familiar with it and mean to control internal operations of libcurl.

This function is not thread safe. You must not call it when any other thread in the program (i.e. a thread sharing the same memory) is running. This doesn't just mean no other thread that is using libcurl. Because curl_global_init() calls functions of other libraries that are similarly thread unsafe, it could conflict with any other thread that uses these other libraries.

See the description in libcurl(3) of global environment requirements for details of how to use this function.

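Therefore, in a multi-threaded program, curl_global_init should be called explicitly once at startup. When a worker thread later calls curl_easy_init() for each request, libcurl sees that curl_global_init has already been called and does not call it again, which reduces the chance of conflicts. The official example in section 3 initializes libcurl this way, in main before any thread is started.

3 A multi-threaded example from the official site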

/* A multi-threaded example that uses pthreads extensively to fetch
 * X remote files at once */

#include <stdio.h>
#include <pthread.h>
#include <curl/curl.h>

#define NUMT 4

/*
  List of URLs to fetch.

  If you intend to use a SSL-based protocol here you MUST setup the OpenSSL
  callback functions as described here:

  http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION
*/
const char * const urls[NUMT]= {
  "http://curl.haxx.se/",
  "ftp://cool.haxx.se/",
  "http://www.contactor.se/",
  "www.haxx.se"
};

static void *pull_one_url(void *url)
{
  CURL *curl;

  curl = curl_easy_init();
  curl_easy_setopt(curl, CURLOPT_URL, url);
  curl_easy_perform(curl); /* ignores error */
  curl_easy_cleanup(curl);

  return NULL;
}

/*
   int pthread_create(pthread_t *new_thread_ID,
   const pthread_attr_t *attr,
   void * (*start_func)(void *), void *arg);
*/

int main(int argc, char **argv)
{
  pthread_t tid[NUMT];
  int i;
  int error;

  /* Must initialize libcurl before any threads are started */
  curl_global_init(CURL_GLOBAL_ALL);

  for(i=0; i< NUMT; i++) {
    error = pthread_create(&tid[i],
                           NULL, /* default attributes please */
                           pull_one_url,
                           (void *)urls[i]);
    if(0 != error)
      fprintf(stderr, "Couldn't run thread number %d, errno %d\n", i, error);
    else
      fprintf(stderr, "Thread %d, gets %s\n", i, urls[i]);
  }

  /* now wait for all threads to terminate */
  for(i=0; i< NUMT; i++) {
    error = pthread_join(tid[i], NULL);
    fprintf(stderr, "Thread %d terminated\n", i);
  }

  return 0;
}

More examples: http://curl.haxx.se/libcurl/c/multithread.html

4 References

[1] http://curl.haxx.se/libcurl/c/libcurl-tutorial.html

[2] http://curl.haxx.se/mail/lib-2010-11/0188.html

[3] http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTNOSIGNAL

[4] http://curl.haxx.se/libcurl/c/curl_global_init.html

[5] http://code.lovemiao.com/?tag=multi-thread
