Background

A while ago I chose Airflow to archive data for our wms system. After it had been running for some time, the following error kept showing up:

[-- ::,: WARNING/ForkPoolWorker-] Failed operation _store_result.  Retrying  more times.
Traceback (most recent call last):
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line , in _execute_context
self.dialect.do_execute(
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line , in do_execute
cursor.execute(statement, parameters)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line , in execute
self.errorhandler(self, exc, value)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py", line , in defaulterrorhandler
raise errorvalue
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line , in execute
res = self._query(query)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line , in _query
db.query(q)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py", line , in query
_mysql.connection.query(self, query)
_mysql_exceptions.OperationalError: (, 'MySQL server has gone away')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/python38/lib/python3.8/site-packages/celery/backends/database/__init__.py", line , in _inner
return fun(*args, **kwargs)
File "/usr/local/python38/lib/python3.8/site-packages/celery/backends/database/__init__.py", line , in _store_result
task = list(session.query(Task).filter(Task.task_id == task_id))
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line , in __iter__
return self._execute_and_instances(context)
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line , in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line , in execute
return meth(self, multiparams, params)
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line , in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line , in _execute_clauseelement
ret = self._execute_context(
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line , in _execute_context
self._handle_dbapi_exception(
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line , in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line , in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line , in reraise
raise value.with_traceback(tb)
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line , in _execute_context
self.dialect.do_execute(
File "/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line , in do_execute
cursor.execute(statement, parameters)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line , in execute
self.errorhandler(self, exc, value)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py", line , in defaulterrorhandler
raise errorvalue
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line , in execute
res = self._query(query)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py", line , in _query
db.query(q)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py", line , in query
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (, 'MySQL server has gone away')
[SQL: SELECT celery_taskmeta.id AS celery_taskmeta_id, celery_taskmeta.task_id AS celery_taskmeta_task_id, celery_taskmeta.status AS celery_taskmeta_status, celery_taskmeta.result AS celery_taskmeta_result, celery_taskmeta.date_done AS celery_taskmeta_date_done, celery_taskmeta.traceback AS celery_taskmeta_traceback
FROM celery_taskmeta
WHERE celery_taskmeta.task_id = %s]
[parameters: ('e909b916-4284-47c4-bc5b-321bc32eb9f9',)]
(Background on this error at: http://sqlalche.me/e/e3q8)

Troubleshooting

Some research showed that this error usually occurs when the database server has dropped a connection but the connection pool has not reclaimed it:

MySQL server has gone away
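To see why a pooled-but-stale connection produces this error, here is a toy, self-contained illustration (no real MySQL involved; ToyConnection, ToyPool and the timeout values are made up for this sketch): the server silently closes connections idle longer than wait_timeout, but the client-side pool keeps handing the dead connection back out unless it checks first.

```python
import time

class ToyConnection:
    """Stands in for a MySQL connection the server closes after wait_timeout."""
    def __init__(self, server_timeout):
        self.opened_at = time.monotonic()
        self.server_timeout = server_timeout

    def alive(self):
        # The server silently drops the connection once it has been idle too long.
        return time.monotonic() - self.opened_at < self.server_timeout

    def query(self, sql):
        if not self.alive():
            raise ConnectionError("MySQL server has gone away")
        return "ok"

class ToyPool:
    """Caches one connection; optionally pings it on checkout (pre_ping)."""
    def __init__(self, server_timeout, pre_ping=False):
        self.server_timeout = server_timeout
        self.pre_ping = pre_ping
        self._conn = None

    def checkout(self):
        stale = self._conn is not None and not self._conn.alive()
        if self._conn is None or (self.pre_ping and stale):
            self._conn = ToyConnection(self.server_timeout)
        return self._conn

# Without pre_ping the pool returns the dead connection and the query blows up;
# with pre_ping it notices the dead connection and reconnects first.
naive = ToyPool(server_timeout=0.05)
naive.checkout()
time.sleep(0.1)                      # idle past the server's timeout
try:
    naive.checkout().query("SELECT 1")
except ConnectionError as e:
    print(e)                         # MySQL server has gone away

careful = ToyPool(server_timeout=0.05, pre_ping=True)
careful.checkout()
time.sleep(0.1)
print(careful.checkout().query("SELECT 1"))   # ok
```

This is exactly the behavior the pool_pre_ping / pool_recycle settings below are meant to control on the real pool.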

So I checked the SQLAlchemy settings in airflow.cfg:

sql_alchemy_pool_enabled = True

# The SqlAlchemy pool size is the maximum number of database connections
# in the pool. 0 indicates no limit.
sql_alchemy_pool_size =

# The maximum overflow size of the pool.
# When the number of checked-out connections reaches the size set in pool_size,
# additional connections will be returned up to this limit.
# When those additional connections are returned to the pool, they are disconnected and discarded.
# It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow,
# and the total number of "sleeping" connections the pool will allow is pool_size.
# max_overflow can be set to -1 to indicate no overflow limit;
# no limit will be placed on the total number of concurrent connections.
sql_alchemy_max_overflow =

# The SqlAlchemy pool recycle is the number of seconds a connection
# can be idle in the pool before it is invalidated. This config does
# not apply to sqlite. If the number of DB connections is ever exceeded,
# a lower config value will allow the system to recover faster.
sql_alchemy_pool_recycle =

# Check connection at the start of each connection pool checkout.
# Typically, this is a simple statement like "SELECT 1".
# More information here: https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic
sql_alchemy_pool_pre_ping = True

Everything that should be configured was configured. Since our job runs only once a day, I checked the database variable wait_timeout; it was 28800 (8 hours), so I simply raised it to 25 hours.

The next day the same error appeared again. Odd: everything configurable had been configured, so where was the problem?

Reading through the error log more carefully:

File "/usr/local/python38/lib/python3.8/site-packages/celery/backends/database/__init__.py", line , in _store_result
task = list(session.query(Task).filter(Task.task_id == task_id))

Could it be that Airflow's SQLAlchemy settings simply don't apply to Celery?

Digging into the source confirmed it: the SQLAlchemy settings in airflow.cfg apply only to Airflow itself, while Celery builds its app from its own configuration source:

app = Celery(
conf.get('celery', 'CELERY_APP_NAME'),
config_source=celery_configuration)

So back to the Celery documentation to see whether this can be configured there:

database_short_lived_sessions (default: disabled)

Short lived sessions are disabled by default. If enabled they can drastically reduce performance, especially on systems processing lots of tasks. This option is useful on low-traffic workers that experience errors as a result of cached database connections going stale through inactivity. For example, intermittent errors like (OperationalError) (2006, ‘MySQL server has gone away’) can be fixed by enabling short lived sessions. This option only affects the database backend.
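What this option changes can be pictured with a tiny sketch (ToySessionManager is hypothetical, not Celery's actual implementation): with short-lived sessions every backend operation gets a brand-new session, so it can never be handed a cached session whose connection went stale overnight.

```python
class ToySessionManager:
    """Hypothetical sketch of cached vs. short-lived sessions."""
    def __init__(self, short_lived=False):
        self.short_lived = short_lived
        self._cached = None
        self.created = 0          # how many sessions we actually opened

    def _new_session(self):
        self.created += 1
        return object()           # stands in for a SQLAlchemy session

    def session(self):
        if self.short_lived:
            return self._new_session()   # fresh session per operation
        if self._cached is None:
            self._cached = self._new_session()
        return self._cached              # reused; may hold a dead connection

cached = ToySessionManager()
cached.session(); cached.session()
print(cached.created)             # 1 -- one long-lived session, can go stale

short = ToySessionManager(short_lived=True)
short.session(); short.session()
print(short.created)              # 2 -- a new session every time
```

The price, as the docs warn, is the overhead of re-creating the session on every operation, which is why the option is off by default.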

The documentation says the database_short_lived_sessions setting avoids exactly this problem. That raises a new question, though: how do you pass extra Celery settings through Airflow?

Solution

1. Copy the following file into the DAGs folder and rename it, e.g. to my_celery_config (any name will do):

Python/Python37/site-packages/airflow/config_templates/default_celery.py

2. Edit airflow.cfg: find the celery_config_options key and point it at the new module:

celery_config_options = my_celery_config.DEFAULT_CELERY_CONFIG

3. Any extra Celery settings can now be added to the DEFAULT_CELERY_CONFIG dict in my_celery_config.
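Put together, the my_celery_config module might look like this (a sketch: the surrounding keys come from whatever your copy of default_celery.py contains; only the added line matters here):

```python
# my_celery_config.py -- a copy of airflow/config_templates/default_celery.py
# dropped into the DAGs folder, with one extra key added.

DEFAULT_CELERY_CONFIG = {
    # ... keys copied verbatim from the default_celery.py template ...

    # Re-create the database session for every result-backend operation so a
    # connection that went stale overnight is never reused.
    'database_short_lived_sessions': True,
}
```

After restarting the workers, the result backend opens a fresh session per operation and the nightly "MySQL server has gone away" errors stop.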
