Part 1: Installation steps
Step 1. Run the following commands in a system terminal to install PhantomJS, a component that pyspider depends on.

    $ wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
    $ sudo tar -xvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/
    $ sudo ln -s /usr/local/phantomjs-2.1.1-linux-x86_64/bin/phantomjs /usr/local/bin/phantomjs
    $ phantomjs --version

Step 2. Run the following commands in a system terminal to install the other dependency packages.

    $ sudo apt update
    $ sudo apt install libcurl4-openssl-dev backports

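Step 3. Python 3.12 removed ssl.match_hostname, so the old tornado release that pyspider depends on falls back to importing the backports.ssl_match_hostname module, and pyspider has not been adapted to this change. Install the module by running the following command in the PyCharm Terminal.

    $ pip install backports.ssl_match_hostname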
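
To see whether a given interpreter actually needs the backport, a small check along these lines may help. It is only a sketch: the function names are made up for illustration, and it assumes the only question is whether ssl.match_hostname or the backports package is available.

```python
import importlib.util
import ssl


def stdlib_has_match_hostname() -> bool:
    """True when the interpreter still ships ssl.match_hostname."""
    return hasattr(ssl, "match_hostname")


def backport_installed() -> bool:
    """True when backports.ssl_match_hostname can be found by the import system."""
    try:
        return importlib.util.find_spec("backports.ssl_match_hostname") is not None
    except ModuleNotFoundError:
        # Raised when the parent package "backports" itself is missing.
        return False


def backport_needed() -> bool:
    """True when neither the stdlib function nor the backport is available."""
    return not stdlib_has_match_hostname() and not backport_installed()


print("stdlib ssl.match_hostname:", stdlib_has_match_hostname())
print("backport installed:", backport_installed())
print("pip install still required:", backport_needed())
```

Run it with the same interpreter (the project's virtual environment) that runs pyspider, otherwise the answer describes a different environment.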

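Step 4 (optional). pyspider has compatibility problems when it runs under Python 3.12. The attachment is the final archive with those problems already fixed [ubuntu16.04(python3.12解释器)下pyspider兼容性修复完后的.tar.gz]; it can be extracted and used directly, after which you continue with Step 6 (this assumes the unpatched release has first been force-installed with pip install --force-reinstall pyspider). To patch the source yourself step by step, skip this step and follow "Step 5. Compatibility fixes".
Click to download the archive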

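Step 5. Compatibility fixes. pyspider has compatibility problems under Python 3.12 because Python 3.12 has removed or changed several old modules and pyspider has not yet been adapted to these changes. Make the following changes.
5.1 Compatibility issue 1: async is a reserved keyword in Python 3 (since Python 3.7), so every use of async as an identifier in the source must be renamed.

For example, rename async to mark_async at the following locations:

    pyspider/run.py: lines 231, 245 (two occurrences), 365
    pyspider/webui/app.py: line 95
    pyspider/fetcher/tornado_fetcher.py: lines 81, 89 (two occurrences), 95, 117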
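
The effect of the rename can be reproduced in isolation. The following sketch compiles a function signature modelled on the one reported at run.py line 231; the function name fetcher_stub and the helper compiles are hypothetical and exist only for this demonstration.

```python
import keyword


def compiles(source: str) -> bool:
    """Return True if the source text compiles without a SyntaxError."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# A parameter list shaped like the one in the traceback (illustrative only).
old_signature = "def fetcher_stub(async=True, get_object=False, no_input=False):\n    pass\n"
new_signature = old_signature.replace("async", "mark_async")

print("async is a keyword:", keyword.iskeyword("async"))
print("old signature compiles:", compiles(old_signature))
print("new signature compiles:", compiles(new_signature))
```

On Python 3.7 or later the old signature is rejected and the renamed one is accepted, which is why every listed occurrence has to change together: a caller that still passes async=... would fail in the same way.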

5.2 Compatibility issue 2: some old modules have been removed or changed, and pyspider has not yet been adapted to these changes. Patch the pyspider code by hand as follows.
a). Fix the UserDict and Mapping imports:
In pyspider/libs/counter.py, change the Python code at line 14 from:

    try:
        from UserDict import DictMixin
    except ImportError:
        from collections import Mapping as DictMixin

to:

    try:
        from collections import UserDict as DictMixin
    except ImportError:
        from collections.abc import Mapping as DictMixin

In pyspider/scheduler/task_queue.py, change the Python code at line 12 from:

    try:
        from UserDict import DictMixin
    except ImportError:
        from collections import Mapping as DictMixin

to:

    try:
        from collections import UserDict as DictMixin
    except ImportError:
        from collections.abc import Mapping as DictMixin
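
The abstract base classes such as Mapping now live only in collections.abc; the old aliases in collections were removed in Python 3.10. As a rough illustration of what a class built on the new import looks like, here is a minimal read-only mapping. The class name TickTable and its contents are hypothetical and are not taken from pyspider.

```python
from collections.abc import Mapping


class TickTable(Mapping):
    """Hypothetical read-only mapping, used only to illustrate the import."""

    def __init__(self, ticks):
        self._ticks = dict(ticks)

    # Mapping requires these three methods; get(), keys(), items(),
    # __contains__ and friends are then provided by the base class.
    def __getitem__(self, key):
        return self._ticks[key]

    def __iter__(self):
        return iter(self._ticks)

    def __len__(self):
        return len(self._ticks)


table = TickTable({"on_start": 60, "index_page": 3600})
print(sorted(table.keys()))
print(table.get("missing", 0))
```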

b). Fix the missing imp module:
In pyspider/processor/project_module.py, change the Python code at line 11 from:

    import imp

to:

    import importlib.util
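
Swapping the import only removes the ModuleNotFoundError raised at import time. If the file still calls helpers from imp, for example imp.new_module (an assumption about the rest of project_module.py; check your own copy), those calls need a replacement as well. A minimal sketch using types.ModuleType, which creates an empty module object just as imp.new_module did; the wrapper name new_module and the module name demo_project are hypothetical:

```python
import types


def new_module(name: str) -> types.ModuleType:
    """Create an empty module object, as imp.new_module(name) used to."""
    return types.ModuleType(name)


mod = new_module("demo_project")
# Execute some source text inside the new module's namespace.
exec("ANSWER = 6 * 7", mod.__dict__)
print(mod.__name__, mod.ANSWER)
```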

c). Fix the MutableMapping reference:
The tornado library fails when it references MutableMapping. Change the Python code at line 106 of tornado/httputil.py from:

    class HTTPHeaders(collections.MutableMapping):

to:

    class HTTPHeaders(collections.abc.MutableMapping):

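5.3 Compatibility issue 3: the fractions.gcd function no longer exists under Python 3.12 (it was removed in Python 3.9; the fractions module itself is still available), and pyspider has not been adapted to this change. Replace fractions.gcd with math.gcd as follows.
a). In pyspider/libs/base_handler.py, add the following import on the blank line 12:

    import math

b). In pyspider/libs/base_handler.py, change line 115 from:

    min_tick = fractions.gcd(min_tick, each.tick)

to:

    min_tick = math.gcd(min_tick, each.tick)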
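
For integer inputs math.gcd is a drop-in replacement. The sketch below folds it over a list of tick intervals, in the same spirit as the patched line; the function name min_tick and the sample values are illustrative, not taken from pyspider.

```python
import math
from functools import reduce


def min_tick(ticks):
    """Greatest common divisor of all ticks; gcd(0, n) == n, so 0 is a neutral start."""
    return reduce(math.gcd, ticks, 0)


# Three hypothetical cron-style intervals, in seconds.
print(min_tick([60, 90, 150]))
```

Note that math.gcd accepts integers only, so any float tick values would have to be converted first.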

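5.4 Compatibility issue 4: the installed Flask version is incompatible with pyspider. pyspider was developed against an old Flask release, and newer releases (Flask 2.3+) removed the before_first_request decorator. Make the following change.
In pyspider/webui/debug.py, change the Python code at line 64 from:

    @app.before_first_request
    def enable_projects_import():
        sys.meta_path.append(ProjectFinder(app.config['projectdb']))

to:

    @app.before_request
    def enable_projects_import():
        # Flask keeps its own _got_first_request attribute on the app,
        # so use a separate flag name for this run-once guard.
        if not getattr(app, '_projects_import_enabled', False):
            app._projects_import_enabled = True
            sys.meta_path.append(ProjectFinder(app.config['projectdb']))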
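
The pattern used here, a hook that fires on every request but does its work only once, can be shown without Flask. The following is a framework-free sketch; the RunOnce class and the registered list are hypothetical stand-ins for the Flask hook and for sys.meta_path.

```python
import threading


class RunOnce:
    """Call the wrapped function on the first invocation only."""

    def __init__(self, func):
        self._func = func
        self._lock = threading.Lock()
        self._done = False

    def __call__(self, *args, **kwargs):
        # The lock keeps two concurrent first calls from both running the function.
        with self._lock:
            if self._done:
                return None
            self._done = True
        return self._func(*args, **kwargs)


registered = []  # stands in for sys.meta_path


@RunOnce
def enable_projects_import():
    registered.append("ProjectFinder")


# Simulate three incoming requests; the body runs only for the first.
for _ in range(3):
    enable_projects_import()

print(registered)
```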

5.5 Compatibility issue 5: pyspider's WebDAV module has compatibility problems under Python 3.12.
a) The abstract base class (ABC) check fails in this environment because the ScriptProvider class does not implement all the abstract methods that its base class requires. Make the following change.
In pyspider/webui/webdav.py, change the Python code at line 165 from:

    class ScriptProvider(DAVProvider):
        def __init__(self, app):
            super(ScriptProvider, self).__init__()
            self.app = app

        def __repr__(self):
            return "pyspiderScriptProvider"

        def getResourceInst(self, path, environ):
            path = os.path.normpath(path).replace('\\', '/')
            if path in ('/', '.', ''):
                path = '/'
                return RootCollection(path, environ, self.app)
            else:
                return ScriptResource(path, environ, self.app)

to:

    class ScriptProvider(DAVProvider):
        def __init__(self, app):
            super(ScriptProvider, self).__init__()
            self.app = app

        def __repr__(self):
            return "pyspiderScriptProvider"

        def getResourceInst(self, path, environ):
            path = os.path.normpath(path).replace('\\', '/')
            if path in ('/', '.', ''):
                path = '/'
                return RootCollection(path, environ, self.app)
            else:
                return ScriptResource(path, environ, self.app)

        # Add the missing abstract method; it forwards to the existing
        # method so that root paths are still handled by RootCollection.
        def get_resource_inst(self, path, environ):
            return self.getResourceInst(path, environ)

b) The WsgiDAV configuration format has changed: the domaincontroller option is deprecated, and http_authenticator.domain_controller must be used instead. Make the following change.
In pyspider/webui/webdav.py, change the Python code at line 207 from:

    config = DEFAULT_CONFIG.copy()
    config.update({
        'mount_path': '/dav',
        'provider_mapping': {
            '/': ScriptProvider(app)
        },
        'domaincontroller': NeedAuthController(app),
        'verbose': 1 if app.debug else 0,
        'dir_browser': {'davmount': False,
                        'enable': True,
                        'msmount': False,
                        'response_trailer': ''},
    })
    dav_app = WsgiDAVApp(config)

to:

    config = DEFAULT_CONFIG.copy()
    config.update({
        'mount_path': '/dav',
        'provider_mapping': {
            '/': ScriptProvider(app)
        },
        # Updated authentication settings
        "http_authenticator": {
            "domain_controller": NeedAuthController,  # moved under http_authenticator
            "accept_basic": True,
            "accept_digest": False,
            "default_to_digest": False,
        },
        'verbose': 1 if app.debug else 0,
        'dir_browser': {'davmount': False,
                        'enable': True,
                        'msmount': False,
                        'response_trailer': ''},
    })
    dav_app = WsgiDAVApp(config)

c) The interface of WsgiDAV's authentication controller has changed, which causes a compatibility problem. Make the following change.
In pyspider/webui/webdav.py, change the Python code at line 186 from:

    class NeedAuthController(object):
        def __init__(self, app):
            self.app = app

        def getDomainRealm(self, inputRelativeURL, environ):
            return 'need auth'

        def requireAuthentication(self, realmname, environ):
            return self.app.config.get('need_auth', False)

        def isRealmUser(self, realmname, username, environ):
            return username == self.app.config.get('webui_username')

        def getRealmUserPassword(self, realmname, username, environ):
            return self.app.config.get('webui_password')

        def authDomainUser(self, realmname, username, password, environ):
            return username == self.app.config.get('webui_username') \
                and password == self.app.config.get('webui_password')

to:

    class NeedAuthController(object):
        def __init__(self, app, config=None):
            self.app = app
            # Accept the extra config argument so the class matches
            # the way WsgiDAV instantiates it
            if config is not None:
                self.config = config
            else:
                # If config is not provided, try to read it from app
                self.config = app.config.get("http_authenticator", {})

        def getDomainRealm(self, inputRelativeURL, environ):
            return 'need auth'

        def requireAuthentication(self, realmname, environ):
            return self.app.config.get('need_auth', False)

        def isRealmUser(self, realmname, username, environ):
            return username == self.app.config.get('webui_username')

        def getRealmUserPassword(self, realmname, username, environ):
            return self.app.config.get('webui_password')

        def authDomainUser(self, realmname, username, password, environ):
            return username == self.app.config.get('webui_username') \
                and password == self.app.config.get('webui_password')

        # Add the interface methods WsgiDAV expects, forwarding to the original ones
        def get_domain_realm(self, input_path, environ):
            return self.getDomainRealm(input_path, environ)

        def basic_auth_user(self, realm, user_name, password, environ):
            return self.authDomainUser(realm, user_name, password, environ)

        def supports_http_digest_auth(self):
            return False  # digest authentication is not supported

        def is_share_anonymous(self, share_path):
            """Check whether the given share path allows anonymous access"""
            # If authentication is not required, every share allows anonymous access
            return not self.app.config.get('need_auth', False)
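
The fix above is a forwarding adapter: the old camelCase methods stay in place and thin snake_case methods delegate to them. The same idea, reduced to a self-contained sketch without WsgiDAV; both class names and the credentials are hypothetical.

```python
class LegacyAuth:
    """Hypothetical controller written against an old camelCase interface."""

    def __init__(self, username, password):
        self._username = username
        self._password = password

    def getDomainRealm(self, path, environ):
        return "need auth"

    def authDomainUser(self, realm, username, password, environ):
        return username == self._username and password == self._password


class SnakeCaseAuth(LegacyAuth):
    """Adds the snake_case names a newer caller expects, forwarding to the old ones."""

    def get_domain_realm(self, path, environ):
        return self.getDomainRealm(path, environ)

    def basic_auth_user(self, realm, user_name, password, environ):
        return self.authDomainUser(realm, user_name, password, environ)


auth = SnakeCaseAuth("admin", "secret")
print(auth.get_domain_realm("/dav", {}))
print(auth.basic_auth_user("need auth", "admin", "secret", {}))
print(auth.basic_auth_user("need auth", "admin", "wrong", {}))
```

Forwarding keeps a single implementation of each check, so the old and new method names cannot drift apart.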

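5.6 Compatibility issue 6: the installed Werkzeug version is incompatible with pyspider. Werkzeug v2.3.0 and later no longer export DispatcherMiddleware from werkzeug.wsgi; it now lives in the separate werkzeug.middleware.dispatcher module, while pyspider still uses the old import. Make the following change.
In pyspider/webui/app.py, change the Python code at line 64 from:

    from werkzeug.wsgi import DispatcherMiddleware

to:

    from werkzeug.middleware.dispatcher import DispatcherMiddleware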
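
When the same code has to run against several library versions, an import fallback is a common alternative to editing the import in place. The helper below is a generic sketch; import_first is a hypothetical name, and the demonstration uses standard-library modules so that it runs without Werkzeug installed.

```python
import importlib


def import_first(candidates):
    """Return the first attribute importable from a list of (module, name) pairs."""
    last_error = None
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            last_error = exc
    raise ImportError(f"none of {candidates!r} could be imported") from last_error


# Demonstration with the standard library: the Python 2 location is tried
# first and skipped on Python 3, then the Python 3 location is used.
DictBase = import_first([
    ("UserDict", "DictMixin"),
    ("collections.abc", "Mapping"),
])
print(DictBase.__name__)
```

For this step the candidate list would be ("werkzeug.middleware.dispatcher", "DispatcherMiddleware") followed by ("werkzeug.wsgi", "DispatcherMiddleware").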

Step 6. Run the following command in the PyCharm Terminal to install pyspider.

    $ pip install pyspider

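Step 7. Run the following command in the PyCharm Terminal to verify that pyspider was installed successfully. The output shows that the installed version, the latest pyspider release, is 0.3.10.

    $ pyspider --version
    pyspider, version 0.3.10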

Step 8. Run pyspider in the PyCharm Terminal to start the pyspider web console. Output like the following means that pyspider started successfully and is listening on port 5000. Open http://localhost:5000/ in a browser to reach the pyspider web console and start crawling.

    $ pyspider
    phantomjs fetcher running on port 25555
    [I 250515 15:08:09 result_worker:49] result_worker starting...
    [I 250515 15:08:10 processor:211] processor starting...
    [I 250515 15:08:10 tornado_fetcher:638] fetcher starting...
    [I 250515 15:08:10 scheduler:647] scheduler starting...
    [I 250515 15:08:10 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
    [I 250515 15:08:10 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    [I 250515 15:08:10 app:76] webui running on 0.0.0.0:5000

Step 9. If running the pyspider command fails with Error: Could not create web server listening on port 25555, port 25555 is already in use. Free the port and then repeat Step 8.
Solution: use lsof -i :25555 to find the PID that holds the port, then run kill -9 <PID> to release it.

    $ lsof -i :25555
    COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
    phantomjs 16615 wanghu 7u IPv4 278700 0t0 TCP *:25555 (LISTEN)
    $ kill -9 16615
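
Before starting pyspider it can be convenient to probe the ports from Python. The sketch below only reports whether something is accepting connections; it does not identify or stop the owning process, which still requires lsof and kill. The function name port_in_use is hypothetical, and the ports are the two that appear in the startup log.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 25555 is the phantomjs fetcher port, 5000 the web UI port.
    for port in (25555, 5000):
        print(port, "in use" if port_in_use(port) else "free")
```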

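Part 2: pyspider pitfalls and solutions
1. Installing pyspider fails with the ConfigurationError: Could not run curl-config error shown below. The system lacks the curl-devel or libcurl development libraries, which pycurl needs in order to compile.
Solution: install the libcurl4-openssl-dev package (Ubuntu/Debian) with the following command. It provides the curl-config tool and the header files needed to build pycurl.

    sudo apt install libcurl4-openssl-dev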
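
Whether the build tools are visible can be checked before running pip. This sketch simply looks the executables up on PATH with shutil.which; the helper name find_tool is hypothetical.

```python
import shutil


def find_tool(name: str):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


# curl-config is needed to build pycurl; phantomjs is used by the pyspider fetcher.
for tool in ("curl-config", "phantomjs"):
    path = find_tool(tool)
    print(f"{tool}: {path if path else 'not found on PATH'}")
```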

Error details:
Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
 
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [33 lines of output]
      Traceback (most recent call last):
        File "<string>", line 230, in configure_unix
        File "/usr/local/lib/python3.12/subprocess.py", line 1026, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "/usr/local/lib/python3.12/subprocess.py", line 1950, in _execute_child
          raise child_exception_type(errno_num, err_msg, err_filename)
      FileNotFoundError: [Errno 2] No such file or directory: 'curl-config'
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
          main()
        File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
          json_out["return_val"] = hook(**hook_input["kwargs"])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 143, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 331, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 301, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 512, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 317, in run_setup
          exec(code, locals())
        File "<string>", line 1016, in <module>
        File "<string>", line 676, in get_extension
        File "<string>", line 93, in __init__
        File "<string>", line 235, in configure_unix
      ConfigurationError: Could not run curl-config: [Errno 2] No such file or directory: 'curl-config'
      [end of output]
 
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

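2. After pyspider is installed, running the pyspider command in the Terminal fails with the error below, because async is a reserved keyword in Python 3 (since Python 3.7).
Solution: rename async to another identifier. For example, rename async to mark_async at the following locations:

    pyspider/run.py: lines 231, 245 (two occurrences), 365
    pyspider/webui/app.py: line 95
    pyspider/fetcher/tornado_fetcher.py: lines 81, 89 (two occurrences), 95, 117

Error details: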
$ pyspider
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 5, in <module>
    from pyspider.run import main
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 231
    async=True, get_object=False, no_input=False):
    ^^^^^
SyntaxError: invalid syntax

3. Running the pyspider command again fails with the errors below, because pyspider has compatibility problems under Python 3.12: Python 3.12 has removed or changed several old modules, and pyspider has not yet been adapted to these changes.
Solution: patch the pyspider code by hand as follows.
a). Fix the UserDict and Mapping imports:
In pyspider/libs/counter.py, change the Python code at line 14 from:

    try:
        from UserDict import DictMixin
    except ImportError:
        from collections import Mapping as DictMixin

to:

    try:
        from collections import UserDict as DictMixin
    except ImportError:
        from collections.abc import Mapping as DictMixin

In pyspider/scheduler/task_queue.py, change the Python code at line 12 from:

    try:
        from UserDict import DictMixin
    except ImportError:
        from collections import Mapping as DictMixin

to:

    try:
        from collections import UserDict as DictMixin
    except ImportError:
        from collections.abc import Mapping as DictMixin

b). Fix the missing imp module:
In pyspider/processor/project_module.py, change the Python code at line 11 from:

    import imp

to:

    import importlib.util

c). Fix the MutableMapping reference:
The tornado library fails when it references MutableMapping. Change the Python code at line 106 of tornado/httputil.py from:

    class HTTPHeaders(collections.MutableMapping):

to:

    class HTTPHeaders(collections.abc.MutableMapping):

Error details:

$ pyspider
[W 250515 11:30:54 run:413] phantomjs not found, continue running without it.
[I 250515 11:30:56 result_worker:49] result_worker starting...
Process Process-5:
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/counter.py", line 14, in <module>
    from UserDict import DictMixin
ModuleNotFoundError: No module named 'UserDict'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 192, in scheduler
    Scheduler = load_cls(None, None, scheduler_cls)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/scheduler/__init__.py", line 1, in <module>
    from .scheduler import Scheduler, OneScheduler, ThreadBaseScheduler  # NOQA
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/scheduler/scheduler.py", line 19, in <module>
    from pyspider.libs import counter, utils
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/counter.py", line 16, in <module>
    from collections import Mapping as DictMixin
ImportError: cannot import name 'Mapping' from 'collections' (/usr/local/lib/python3.12/collections/__init__.py)
Process Process-4:
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 236, in fetcher
    Fetcher = load_cls(None, None, fetcher_cls)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
    from .tornado_fetcher import Fetcher
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 21, in <module>
    import tornado.httputil
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/httputil.py", line 106, in <module>
    class HTTPHeaders(collections.MutableMapping):
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'collections' has no attribute 'MutableMapping'
Process Process-3:
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 273, in processor
    Processor = load_cls(None, None, processor_cls)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/processor/__init__.py", line 1, in <module>
    from .processor import ProcessorResult, Processor
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/processor/processor.py", line 20, in <module>
    from .project_module import ProjectManager, ProjectFinder
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/processor/project_module.py", line 11, in <module>
    import imp
ModuleNotFoundError: No module named 'imp'
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui
    app = load_cls(None, None, webui_instance)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in <module>
    from . import app, index, debug, task, result, login
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 17, in <module>
    from pyspider.fetcher import tornado_fetcher
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
    from .tornado_fetcher import Fetcher
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 21, in <module>
    import tornado.httputil
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/httputil.py", line 106, in <module>
    class HTTPHeaders(collections.MutableMapping):
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'collections' has no attribute 'MutableMapping'

4. Running the pyspider command again fails with import backports.ssl_match_hostname ModuleNotFoundError: No module named 'backports'. The tornado library that pyspider depends on needs the backports.ssl_match_hostname module under Python 3.12, and pyspider has not been adapted to this change.
Solution:
Install the missing backports.ssl_match_hostname module by running the following command in the PyCharm Terminal:

    $ pip install backports.ssl_match_hostname

Note: Error: Could not create web server listening on port 25555 is a port-in-use problem; deal with it after the other problems have been resolved.

Error details:
$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 12:54:49 processor:211] processor starting...
Process Process-4:
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 236, in fetcher
    Fetcher = load_cls(None, None, fetcher_cls)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
    from .tornado_fetcher import Fetcher
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 31, in <module>
    from tornado.simple_httpclient import SimpleAsyncHTTPClient
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/simple_httpclient.py", line 8, in <module>
    from tornado.http1connection import HTTP1Connection, HTTP1ConnectionParameters
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/http1connection.py", line 30, in <module>
    from tornado import iostream
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/iostream.py", line 40, in <module>
    from tornado.netutil import ssl_wrap_socket, ssl_match_hostname, SSLCertificateError, _client_ssl_defaults, _server_ssl_defaults
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/netutil.py", line 56, in <module>
    import backports.ssl_match_hostname
ModuleNotFoundError: No module named 'backports'
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui
    app = load_cls(None, None, webui_instance)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in <module>
    from . import app, index, debug, task, result, login
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 17, in <module>
    from pyspider.fetcher import tornado_fetcher
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
    from .tornado_fetcher import Fetcher
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 31, in <module>
    from tornado.simple_httpclient import SimpleAsyncHTTPClient
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/simple_httpclient.py", line 8, in <module>
    from tornado.http1connection import HTTP1Connection, HTTP1ConnectionParameters
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/http1connection.py", line 30, in <module>
    from tornado import iostream
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/iostream.py", line 40, in <module>
    from tornado.netutil import ssl_wrap_socket, ssl_match_hostname, SSLCertificateError, _client_ssl_defaults, _server_ssl_defaults
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/netutil.py", line 56, in <module>
    import backports.ssl_match_hostname
ModuleNotFoundError: No module named 'backports'

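5. Running pyspider again fails with AttributeError: module 'fractions' has no attribute 'gcd'. The fractions module itself still exists; it is the function fractions.gcd() that is gone (deprecated since Python 3.5 and removed in Python 3.9 in favor of math.gcd()), and pyspider has not been updated for this change.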

Solution:
a). In pyspider/libs/base_handler.py, add the following import on the blank line 12:

    import math

b). In the same file, pyspider/libs/base_handler.py, change line 115 from

min_tick = fractions.gcd(min_tick, each.tick)

to:

min_tick = math.gcd(min_tick, each.tick)
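
As a rough illustration of why math.gcd works as a drop-in replacement here, the sketch below folds a few tick intervals together in the same way the loop in base_handler.py accumulates min_tick. The helper name min_tick_of and the interval values are assumptions made for this example; they are not pyspider code.

```python
import math
from functools import reduce


def min_tick_of(ticks):
    """Return the greatest common divisor of all tick intervals.

    Starting from 0 is safe because math.gcd(0, n) == n.
    """
    return reduce(math.gcd, ticks, 0)


# Hypothetical cron intervals in seconds
print(min_tick_of([60, 300, 90]))  # prints 30
```

Note that math.gcd only accepts integers, so this assumes the tick values are whole numbers of seconds.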

Error details:

$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui
    app = load_cls(None, None, webui_instance)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in <module>
    from . import app, index, debug, task, result, login
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/debug.py", line 22, in <module>
    from pyspider.libs import utils, sample_handler, dataurl
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/sample_handler.py", line 9, in <module>
    class Handler(BaseHandler):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/base_handler.py", line 115, in __new__
    min_tick = fractions.gcd(min_tick, each.tick)
               ^^^^^^^^^^^^^
AttributeError: module 'fractions' has no attribute 'gcd'

6. Running pyspider again fails with a Flask compatibility error (AttributeError: 'QuitableFlask' object has no attribute 'before_first_request'). pyspider was written against an older Flask release, and the before_first_request decorator was deprecated in Flask 2.2 and removed in Flask 2.3.

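Solution:
In pyspider/webui/debug.py, change the code at line 64 from

    @app.before_first_request
    def enable_projects_import():
        sys.meta_path.append(ProjectFinder(app.config['projectdb']))

to:

    @app.before_request
    def enable_projects_import():
        # Use a dedicated flag (the name is arbitrary). Flask already defines
        # app._got_first_request, as the error message below hints, so a
        # hasattr() check on that name would always be true and the
        # ProjectFinder would never be registered.
        if not getattr(app, '_projects_import_enabled', False):
            app._projects_import_enabled = True
            sys.meta_path.append(ProjectFinder(app.config['projectdb']))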
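
The same run-once behaviour can also be kept out of the Flask app object entirely. The following is a minimal, standard-library-only sketch of that idea; run_once and calls are hypothetical names, and the function body is only a stand-in for the real sys.meta_path registration. In debug.py the wrapped function would still sit under @app.before_request.

```python
import threading


def run_once(func):
    """Wrap func so that only the first call actually executes it."""
    lock = threading.Lock()
    state = {"done": False}

    def wrapper(*args, **kwargs):
        with lock:
            if state["done"]:
                return None
            state["done"] = True
            return func(*args, **kwargs)

    return wrapper


calls = []


@run_once
def enable_projects_import():
    # Stand-in for sys.meta_path.append(ProjectFinder(...))
    calls.append("registered")


for _ in range(3):  # simulate three incoming requests
    enable_projects_import()

print(calls)  # prints ['registered']
```

The lock only matters if requests are served by several threads; with a single-threaded server the flag alone would be enough.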

Error details:

$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 13:42:09 result_worker:49] result_worker starting...
Error: Could not create web server listening on port 25555
[I 250515 13:42:09 processor:211] processor starting...
[I 250515 13:42:09 tornado_fetcher:638] fetcher starting...
Error: Could not create web server listening on port 25555
[I 250515 13:42:09 scheduler:647] scheduler starting...
[I 250515 13:42:10 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
Error: Could not create web server listening on port 25555
[I 250515 13:42:10 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui
    app = load_cls(None, None, webui_instance)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in <module>
    from . import app, index, debug, task, result, login
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/debug.py", line 64, in <module>
    @app.before_first_request
     ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'QuitableFlask' object has no attribute 'before_first_request'. Did you mean: '_got_first_request'?
Error: Could not create web server listening on port 25555

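7. Running pyspider again fails with TypeError: Can't instantiate abstract class ScriptProvider without an implementation for abstract method 'get_resource_inst'. The cause lies in pyspider's WebDAV module: the installed WsgiDAV declares the abstract method under the snake_case name get_resource_inst, while pyspider's ScriptProvider only implements the old camelCase getResourceInst, so the class no longer implements every abstract method required by its base class DAVProvider. Python 3.12 only changed the wording of this error message; the abstract base class (ABC) check itself did not become stricter.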

Solution:
In pyspider/webui/webdav.py, change the class at line 165 from

    class ScriptProvider(DAVProvider):
        def __init__(self, app):
            super(ScriptProvider, self).__init__()
            self.app = app

        def __repr__(self):
            return "pyspiderScriptProvider"

        def getResourceInst(self, path, environ):
            path = os.path.normpath(path).replace('\\', '/')
            if path in ('/', '.', ''):
                path = '/'
                return RootCollection(path, environ, self.app)
            else:
                return ScriptResource(path, environ, self.app)

to:

    class ScriptProvider(DAVProvider):
        def __init__(self, app):
            super(ScriptProvider, self).__init__()
            self.app = app

        def __repr__(self):
            return "pyspiderScriptProvider"

        def getResourceInst(self, path, environ):
            path = os.path.normpath(path).replace('\\', '/')
            if path in ('/', '.', ''):
                path = '/'
                return RootCollection(path, environ, self.app)
            else:
                return ScriptResource(path, environ, self.app)

        # Add the missing abstract method. It delegates to the existing
        # camelCase method so that the root collection is still handled and
        # ScriptResource receives its arguments in the same order as above.
        def get_resource_inst(self, path, environ):
            return self.getResourceInst(path, environ)
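
To see the mechanism in isolation, here is a small sketch that uses only the standard abc module. Provider, OldStyleProvider and FixedProvider are made-up stand-ins, not WsgiDAV or pyspider classes; the point is only that implementing the legacy camelCase name does not satisfy an abstract method declared in snake_case, and that a thin delegating method does.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """Stand-in for a base class that declares a snake_case abstract method."""

    @abstractmethod
    def get_resource_inst(self, path, environ):
        ...


class OldStyleProvider(Provider):
    # Only the legacy camelCase name is implemented
    def getResourceInst(self, path, environ):
        return ("resource", path)


class FixedProvider(OldStyleProvider):
    # Satisfy the abstract method by delegating to the legacy one
    def get_resource_inst(self, path, environ):
        return self.getResourceInst(path, environ)


try:
    OldStyleProvider()
except TypeError as exc:
    print("cannot instantiate:", exc)

print(FixedProvider().get_resource_inst("/a.py", {}))  # prints ('resource', '/a.py')
```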

Error details:
$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 14:05:24 result_worker:49] result_worker starting...
Error: Could not create web server listening on port 25555
[I 250515 14:05:24 processor:211] processor starting...
Error: Could not create web server listening on port 25555
[I 250515 14:05:25 scheduler:647] scheduler starting...
[I 250515 14:05:25 tornado_fetcher:638] fetcher starting...
[I 250515 14:05:25 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 250515 14:05:25 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
Error: Could not create web server listening on port 25555
[I 250515 14:05:25 app:84] webui exiting...
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
Error: Could not create web server listening on port 25555
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui
    app.run(host=host, port=port)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 59, in run
    from .webdav import dav_app
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/webdav.py", line 207, in <module>
    '/': ScriptProvider(app)
         ^^^^^^^^^^^^^^^^^^^
TypeError: Can't instantiate abstract class ScriptProvider without an implementation for abstract method 'get_resource_inst'

8. Running pyspider again fails with ValueError: Invalid configuration: - Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead. The WsgiDAV configuration format has changed: the domaincontroller option is deprecated and has to be replaced by http_authenticator.domain_controller.

Solution:
In pyspider/webui/webdav.py, change the code at line 207 from

    config = DEFAULT_CONFIG.copy()
    config.update({
        'mount_path': '/dav',
        'provider_mapping': {
            '/': ScriptProvider(app)
        },
        'domaincontroller': NeedAuthController(app),
        'verbose': 1 if app.debug else 0,
        'dir_browser': {'davmount': False,
                        'enable': True,
                        'msmount': False,
                        'response_trailer': ''},
    })
    dav_app = WsgiDAVApp(config)

to:

    config = DEFAULT_CONFIG.copy()
    config.update({
        'mount_path': '/dav',
        'provider_mapping': {
            '/': ScriptProvider(app)
        },
        # Updated authentication settings
        "http_authenticator": {
            # Moved under http_authenticator; note that the class itself
            # is passed here, not an instance
            "domain_controller": NeedAuthController,
            "accept_basic": True,
            "accept_digest": False,
            "default_to_digest": False,
        },
        'verbose': 1 if app.debug else 0,
        'dir_browser': {'davmount': False,
                        'enable': True,
                        'msmount': False,
                        'response_trailer': ''},
    })
    dav_app = WsgiDAVApp(config)
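
The key move in this change can be sketched as a small dictionary transformation. The helper migrate_auth_config below is hypothetical and not part of WsgiDAV or pyspider; it only illustrates moving the flat 'domaincontroller' entry under 'http_authenticator', using a plain string where the real configuration holds the controller class.

```python
def migrate_auth_config(config):
    """Return a copy with 'domaincontroller' moved under 'http_authenticator'."""
    new_config = dict(config)
    if "domaincontroller" in new_config:
        controller = new_config.pop("domaincontroller")
        auth = dict(new_config.get("http_authenticator", {}))
        # Do not overwrite a controller that is already configured the new way
        auth.setdefault("domain_controller", controller)
        new_config["http_authenticator"] = auth
    return new_config


# A string stands in for the controller class in this example
old = {"mount_path": "/dav", "domaincontroller": "NeedAuthController"}
print(migrate_auth_config(old))
```

The function copies the dictionaries it touches, so the original configuration is left unchanged.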

Error details:
$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 14:14:01 result_worker:49] result_worker starting...
Error: Could not create web server listening on port 25555
[I 250515 14:14:01 processor:211] processor starting...
[I 250515 14:14:01 scheduler:647] scheduler starting...
Error: Could not create web server listening on port 25555
[I 250515 14:14:01 tornado_fetcher:638] fetcher starting...
[I 250515 14:14:01 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 250515 14:14:01 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 14:14:01 app:84] webui exiting...
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui
    app.run(host=host, port=port)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 59, in run
    from .webdav import dav_app
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/webdav.py", line 220, in <module>
    dav_app = WsgiDAVApp(config)
              ^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/wsgidav_app.py", line 155, in __init__
    _check_config(config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/wsgidav_app.py", line 129, in _check_config
    raise ValueError("Invalid configuration:\n  - " + "\n  - ".join(errors))
ValueError: Invalid configuration:
  - Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead.

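9. Running pyspider again fails with "TypeError: NeedAuthController.__init__() takes 2 positional arguments but 3 were given". The WsgiDAV domain controller interface has changed: WsgiDAV now instantiates the controller itself and passes two arguments (wsgidav_app, config), as the traceback below shows, which the old one-argument constructor cannot accept.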

Solution:
In pyspider/webui/webdav.py, change the class at line 186 from

    class NeedAuthController(object):
        def __init__(self, app):
            self.app = app

        def getDomainRealm(self, inputRelativeURL, environ):
            return 'need auth'

        def requireAuthentication(self, realmname, environ):
            return self.app.config.get('need_auth', False)

        def isRealmUser(self, realmname, username, environ):
            return username == self.app.config.get('webui_username')

        def getRealmUserPassword(self, realmname, username, environ):
            return self.app.config.get('webui_password')

        def authDomainUser(self, realmname, username, password, environ):
            return username == self.app.config.get('webui_username') \
                and password == self.app.config.get('webui_password')

to:

    class NeedAuthController(object):
        def __init__(self, app, config=None):
            self.app = app
            # Accept the extra config argument so that the class matches
            # the way WsgiDAV instantiates it
            if config is not None:
                self.config = config
            else:
                # If config is not provided, try to read it from app
                self.config = app.config.get("http_authenticator", {})

        def getDomainRealm(self, inputRelativeURL, environ):
            return 'need auth'

        def requireAuthentication(self, realmname, environ):
            return self.app.config.get('need_auth', False)

        def isRealmUser(self, realmname, username, environ):
            return username == self.app.config.get('webui_username')

        def getRealmUserPassword(self, realmname, username, environ):
            return self.app.config.get('webui_password')

        def authDomainUser(self, realmname, username, password, environ):
            return username == self.app.config.get('webui_username') \
                and password == self.app.config.get('webui_password')

        # Interface methods expected by WsgiDAV, forwarding to the
        # original methods
        def get_domain_realm(self, input_path, environ):
            return self.getDomainRealm(input_path, environ)

        def basic_auth_user(self, realm, user_name, password, environ):
            return self.authDomainUser(realm, user_name, password, environ)

        def supports_http_digest_auth(self):
            return False  # digest authentication is not supported

        def is_share_anonymous(self, share_path):
            """Check whether the given share path allows anonymous access."""
            # If authentication is not required, every share allows
            # anonymous access
            return not self.app.config.get('need_auth', False)
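
The signature mismatch behind this error can be reproduced without WsgiDAV. In the sketch below, make_controller mimics the call dc(wsgidav_app, config) that appears in the traceback; OldController, NewController and make_controller are illustrative names only, and plain objects stand in for the real application and configuration.

```python
class OldController:
    def __init__(self, app):
        self.app = app


class NewController:
    def __init__(self, app, config=None):
        self.app = app
        self.config = config if config is not None else {}


def make_controller(controller_cls, wsgidav_app, config):
    """Mimic the call shown in the traceback: dc(wsgidav_app, config)."""
    return controller_cls(wsgidav_app, config)


app, config = object(), {"http_authenticator": {}}

try:
    make_controller(OldController, app, config)
except TypeError as exc:
    print("old signature fails:", exc)

controller = make_controller(NewController, app, config)
print(controller.config is config)  # prints True
```

Giving config a default of None keeps the class usable for any code that still constructs it with a single argument.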

Error details:

$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 14:42:21 result_worker:49] result_worker starting...
Error: Could not create web server listening on port 25555
[I 250515 14:42:21 processor:211] processor starting...
[I 250515 14:42:21 tornado_fetcher:638] fetcher starting...
Error: Could not create web server listening on port 25555
[I 250515 14:42:21 scheduler:647] scheduler starting...
[I 250515 14:42:21 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 250515 14:42:21 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
Error: Could not create web server listening on port 25555
[I 250515 14:42:21 app:84] webui exiting...
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui
    app.run(host=host, port=port)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 59, in run
    from .webdav import dav_app
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/webdav.py", line 226, in <module>
    dav_app = WsgiDAVApp(config)
              ^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/wsgidav_app.py", line 257, in __init__
    app = mw(self, self.application, config)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/http_authenticator.py", line 140, in __init__
    dc = make_domain_controller(wsgidav_app, config)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/http_authenticator.py", line 111, in make_domain_controller
    dc = dc(wsgidav_app, config)
         ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: NeedAuthController.__init__() takes 2 positional arguments but 3 were given
Error: Could not create web server listening on port 25555

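10. Run the pyspider command again. It now fails with ImportError: cannot import name 'DispatcherMiddleware' from 'werkzeug.wsgi'. The cause is a Werkzeug version that pyspider does not support; the Python version is not the trigger. Werkzeug 0.15 moved DispatcherMiddleware into the separate werkzeug.middleware.dispatcher module, and Werkzeug 1.0 removed the old werkzeug.wsgi import path. pyspider still uses the old import.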

Solution:
In pyspider/webui/app.py, change line 64 from

 from werkzeug.wsgi import DispatcherMiddleware

to:

from werkzeug.middleware.dispatcher import DispatcherMiddleware
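
If the same patched source has to run against both old and new Werkzeug releases, the import can be resolved at runtime instead of hard-coded. The following is only a minimal sketch: the helper name import_first is hypothetical and is not part of pyspider or Werkzeug, and the pyspider call is left as a comment so the block runs even where Werkzeug is not installed.

```python
import importlib


def import_first(attr, module_names):
    """Return `attr` from the first module in `module_names` that provides it.

    Hypothetical helper for illustration; not part of pyspider or Werkzeug.
    """
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            # Module missing, or the name was moved or removed: try the next one.
            continue
    raise ImportError(
        "cannot import name %r from any of %r" % (attr, list(module_names))
    )


# Assumed usage in pyspider/webui/app.py, replacing the single import line:
# DispatcherMiddleware = import_first(
#     "DispatcherMiddleware",
#     ["werkzeug.middleware.dispatcher", "werkzeug.wsgi"],
# )
```

The new location is listed first so that current Werkzeug releases resolve without touching the removed path; the old path is only tried as a fallback.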

Error details:

$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 14:58:59 result_worker:49] result_worker starting...
Error: Could not create web server listening on port 25555
[I 250515 14:58:59 processor:211] processor starting...
Error: Could not create web server listening on port 25555
[I 250515 14:58:59 scheduler:647] scheduler starting...
[I 250515 14:58:59 tornado_fetcher:638] fetcher starting...
[I 250515 14:58:59 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 250515 14:58:59 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 14:59:00 app:84] webui exiting...
Traceback (most recent call last):
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui
    app.run(host=host, port=port)
  File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 64, in run
    from werkzeug.wsgi import DispatcherMiddleware
ImportError: cannot import name 'DispatcherMiddleware' from 'werkzeug.wsgi' (/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/werkzeug/wsgi.py)
Error: Could not create web server listening on port 25555

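11. Run the pyspider command again. The remaining error is Error: Could not create web server listening on port 25555. Port 25555 is already in use (in the run below, by a phantomjs process), so the port must be freed before running pyspider again.
Solution: run lsof -i :25555 to find the PID that holds the port, release it with kill -9 <PID>, then run pyspider again. Output like the following means pyspider started successfully and the web UI is listening on port 5000. Open http://localhost:5000/ in a browser to reach the pyspider web console and start crawling.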

    $ lsof -i :25555
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
phantomjs 16615 wanghu 7u IPv4 278700 0t0 TCP *:25555 (LISTEN)
(.venv) wanghu@td-1:~/PycharmProjects/PythonProject/getHarmonyVideoDatas$ kill -9 16615
(.venv) wanghu@td-1:~/PycharmProjects/PythonProject/getHarmonyVideoDatas$ pyspider
phantomjs fetcher running on port 25555
[I 250515 15:08:09 result_worker:49] result_worker starting...
[I 250515 15:08:10 processor:211] processor starting...
[I 250515 15:08:10 tornado_fetcher:638] fetcher starting...
[I 250515 15:08:10 scheduler:647] scheduler starting...
[I 250515 15:08:10 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 250515 15:08:10 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 250515 15:08:10 app:76] webui running on 0.0.0.0:5000
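
The port check can also be done from Python before launching pyspider. This is a rough sketch using only the standard library; the function name port_in_use is hypothetical, and it only tests whether something accepts TCP connections on the port. It does not report which process holds it, so lsof is still needed to find the PID.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; not part of pyspider.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0
```

Calling port_in_use(25555) before starting pyspider would indicate whether a stale phantomjs fetcher is still listening, and port_in_use(5000) afterwards would indicate whether the web UI came up.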

Error details:

$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 250515 15:06:48 result_worker:49] result_worker starting...
Error: Could not create web server listening on port 25555
[I 250515 15:06:49 processor:211] processor starting...
[I 250515 15:06:49 tornado_fetcher:638] fetcher starting...
[I 250515 15:06:49 scheduler:647] scheduler starting...
[I 250515 15:06:49 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
Error: Could not create web server listening on port 25555
[I 250515 15:06:49 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
Error: Could not create web server listening on port 25555
[I 250515 15:06:49 app:76] webui running on 0.0.0.0:5000
