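rasa train nlu in Detail: 1.2 - The _train_graph() Function

This article uses the example from "Implementing a Campus Recruitment FAQ Bot with ResponseSelector" and walks through the concrete values of the variables inside the _train_graph() function.

I. The _train_graph() function in rasa/model_training.py

The implementation of _train_graph() is shown below: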

def _train_graph(
    file_importer: TrainingDataImporter,
    training_type: TrainingType,
    output_path: Text,
    fixed_model_name: Text,
    model_to_finetune: Optional[Union[Text, Path]] = None,
    force_full_training: bool = False,
    dry_run: bool = False,
    **kwargs: Any,
) -> TrainingResult:
    if model_to_finetune:  # a model to finetune was requested
        model_to_finetune = rasa.model.get_model_for_finetuning(model_to_finetune)  # resolve the model to finetune
        if not model_to_finetune:  # no finetunable model was found
            rasa.shared.utils.cli.print_error_and_exit(  # print the error and exit
                f"No model for finetuning found. Please make sure to either "
                f"specify a path to a previous model or to have a finetunable "
                f"model within the directory '{output_path}'."
            )
        rasa.shared.utils.common.mark_as_experimental_feature(  # mark as an experimental feature
            "Incremental Training feature"
        )
    is_finetuning = model_to_finetune is not None  # whether this run is finetuning
    config = file_importer.get_config()  # load the configuration
    recipe = Recipe.recipe_for_name(config.get("recipe"))  # look up the recipe
    config, _missing_keys, _configured_keys = recipe.auto_configure(  # auto-configure
        file_importer.get_config_file_for_auto_config(),  # config file used for auto-configuration
        config,  # configuration
        training_type,  # training type
    )
    model_configuration = recipe.graph_config_for_recipe(  # graph configuration for the recipe
        config,  # configuration
        kwargs,  # keyword arguments
        training_type=training_type,  # training type
        is_finetuning=is_finetuning,  # whether this run is finetuning
    )
    rasa.engine.validation.validate(model_configuration)  # validate the model configuration
    tempdir_name = rasa.utils.common.get_temp_dir_name()  # get the temporary directory name
    # Use `TempDirectoryPath` instead of `tempfile.TemporaryDirectory` as this
    # leads to errors on Windows when the context manager tries to delete an
    # already deleted temporary directory (e.g. https://bugs.python.org/issue29982)
    with rasa.utils.common.TempDirectoryPath(tempdir_name) as temp_model_dir:  # temporary model directory
        model_storage = _create_model_storage(  # create the model storage
            is_finetuning, model_to_finetune, Path(temp_model_dir)
        )
        cache = LocalTrainingCache()  # local training cache
        trainer = GraphTrainer(model_storage, cache, DaskGraphRunner)  # graph trainer
        if dry_run:  # dry run: only fingerprint, do not train
            fingerprint_status = trainer.fingerprint(  # fingerprint status
                model_configuration.train_schema, file_importer  # train schema, file importer
            )
            return _dry_run_result(fingerprint_status, force_full_training)  # return the dry-run result
        model_name = _determine_model_name(fixed_model_name, training_type)  # determine the model name
        full_model_path = Path(output_path, model_name)  # full path of the model archive
        with telemetry.track_model_training(  # track model training
            file_importer, model_type=training_type.model_type  # file importer, model type
        ):
            trainer.train(  # train
                model_configuration,  # model configuration
                file_importer,  # file importer
                full_model_path,  # full path of the model archive
                force_retraining=force_full_training,  # force retraining
                is_finetuning=is_finetuning,  # whether this run is finetuning
            )
            rasa.shared.utils.cli.print_success(  # print the success message
                f"Your Rasa model is trained and saved at '{full_model_path}'."
            )
        return TrainingResult(str(full_model_path), 0)  # training result

1. Argument values passed in

2. Composition of the _train_graph() function

The function is built mainly around three calls, shown below:

model_configuration = recipe.graph_config_for_recipe(*)
trainer = GraphTrainer(model_storage, cache, DaskGraphRunner)
trainer.train(model_configuration, file_importer, full_model_path, force_retraining, is_finetuning)
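The three calls above can be sketched without Rasa installed. The following is a minimal illustration of the control flow only; `FakeRecipe`, `FakeGraphTrainer`, `FakeModelConfiguration` and `train_graph_sketch` are hypothetical stand-ins invented for this sketch, not Rasa classes, and the way node names are derived here is an assumption modelled on the node names listed later in this article.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List, Optional


@dataclass
class FakeModelConfiguration:
    """Hypothetical stand-in for the result of graph_config_for_recipe()."""

    train_schema: List[str]
    predict_schema: List[str]


class FakeRecipe:
    """Hypothetical recipe: turns a config dict into train/predict node names."""

    def graph_config_for_recipe(
        self, config: Dict[str, Any], is_finetuning: bool
    ) -> FakeModelConfiguration:
        names = [step["name"] for step in config.get("pipeline", [])]
        return FakeModelConfiguration(
            train_schema=[f"train_{n}{i}" for i, n in enumerate(names)],
            predict_schema=[f"run_{n}{i}" for i, n in enumerate(names)],
        )


class FakeGraphTrainer:
    """Hypothetical trainer: only records which nodes it would execute."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def train(self, model_configuration: FakeModelConfiguration, path: Path) -> Path:
        self.executed.extend(model_configuration.train_schema)
        return path


def train_graph_sketch(
    config: Dict[str, Any],
    output_dir: str,
    model_name: str,
    model_to_finetune: Optional[str] = None,
) -> str:
    is_finetuning = model_to_finetune is not None
    # Step 1: build the model configuration from the recipe.
    model_configuration = FakeRecipe().graph_config_for_recipe(config, is_finetuning)
    # Step 2: create the trainer.
    trainer = FakeGraphTrainer()
    # Step 3: train and return the path of the (pretend) model archive.
    full_model_path = Path(output_dir, model_name)
    trainer.train(model_configuration, full_model_path)
    return str(full_model_path)


result = train_graph_sketch(
    {"pipeline": [{"name": "JiebaTokenizer"}]}, "models", "nlu-sketch.tar.gz"
)
```

The real function additionally handles finetuning, auto-configuration, validation, the temporary model storage and the dry-run branch, all of which are left out here.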

II. Methods called in _train_graph()

1. file_importer.get_config()

This call converts the config.yml file into a dict.
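The concrete dict from the original run is not reproduced here. As a rough illustration only, the sketch below assumes a pipeline made of the four components whose node names appear in the schemas later in this article. The `language` value, the absence of per-component parameters, and the helper `component_names` are assumptions for this sketch, not values taken from the original config.yml.

```python
from typing import Any, Dict, List

# Assumed shape of the dict returned by file_importer.get_config().
# Component names come from the schema node names in this article;
# everything else is a guess made for illustration.
config: Dict[str, Any] = {
    "recipe": "default.v1",
    "language": "zh",  # assumption
    "pipeline": [
        {"name": "JiebaTokenizer"},
        {"name": "LanguageModelFeaturizer"},
        {"name": "DIETClassifier"},
        {"name": "ResponseSelector"},
    ],
}


def component_names(cfg: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the pipeline component names in order."""
    return [step["name"] for step in cfg.get("pipeline") or []]


recipe_name = config.get("recipe")  # the value later passed to recipe_for_name()
names = component_names(config)
```

The `config.get("recipe")` lookup is the same expression that _train_graph() passes on to Recipe.recipe_for_name().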

2. Recipe.recipe_for_name(config.get("recipe"))

The returned recipe object exposes the following component types:

(1) ENTITY_EXTRACTOR = ComponentType.ENTITY_EXTRACTOR

Entity extractor.

(2) INTENT_CLASSIFIER = ComponentType.INTENT_CLASSIFIER

Intent classifier.

(3) MESSAGE_FEATURIZER = ComponentType.MESSAGE_FEATURIZER

Message featurizer.

(4) MESSAGE_TOKENIZER = ComponentType.MESSAGE_TOKENIZER

Message tokenizer.

(5) MODEL_LOADER = ComponentType.MODEL_LOADER

Model loader.

(6) POLICY_WITHOUT_END_TO_END_SUPPORT = ComponentType.POLICY_WITHOUT_END_TO_END_SUPPORT

Policy without end-to-end support.

(7) POLICY_WITH_END_TO_END_SUPPORT = ComponentType.POLICY_WITH_END_TO_END_SUPPORT

Policy with end-to-end support.
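Looking up a recipe by name and exposing component types as class attributes can be sketched with a small registry. Everything below (`SketchRecipe`, `SketchDefaultV1Recipe`, the `register` decorator, the enum values, and the fallback to "default.v1" when no name is given) is a hypothetical illustration of the pattern, not Rasa's actual implementation.

```python
import enum
from typing import Dict, Optional, Type


class ComponentType(enum.Enum):
    """Illustrative enum; the numeric values here are arbitrary."""

    MESSAGE_TOKENIZER = enum.auto()
    MESSAGE_FEATURIZER = enum.auto()
    INTENT_CLASSIFIER = enum.auto()
    ENTITY_EXTRACTOR = enum.auto()
    POLICY_WITHOUT_END_TO_END_SUPPORT = enum.auto()
    POLICY_WITH_END_TO_END_SUPPORT = enum.auto()
    MODEL_LOADER = enum.auto()


class SketchRecipe:
    """Hypothetical base class holding a name -> recipe class registry."""

    name = ""
    _registry: Dict[str, Type["SketchRecipe"]] = {}

    @classmethod
    def register(cls, recipe_cls: Type["SketchRecipe"]) -> Type["SketchRecipe"]:
        cls._registry[recipe_cls.name] = recipe_cls
        return recipe_cls

    @classmethod
    def recipe_for_name(cls, name: Optional[str]) -> "SketchRecipe":
        if name is None:
            name = "default.v1"  # assumption: fall back to the default recipe
        try:
            return cls._registry[name]()
        except KeyError:
            raise ValueError(f"No recipe registered under the name '{name}'.") from None


@SketchRecipe.register
class SketchDefaultV1Recipe(SketchRecipe):
    name = "default.v1"
    # Component types re-exported as class attributes, as in the list above.
    ENTITY_EXTRACTOR = ComponentType.ENTITY_EXTRACTOR
    INTENT_CLASSIFIER = ComponentType.INTENT_CLASSIFIER
    MESSAGE_FEATURIZER = ComponentType.MESSAGE_FEATURIZER
    MESSAGE_TOKENIZER = ComponentType.MESSAGE_TOKENIZER
    MODEL_LOADER = ComponentType.MODEL_LOADER


recipe = SketchRecipe.recipe_for_name("default.v1")
```

A registry keyed by name keeps the lookup independent of the concrete recipe classes, which is why the config only needs to carry a string.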

3. model_configuration = recipe.graph_config_for_recipe(*)

Both model_configuration.train_schema and model_configuration.predict_schema are GraphSchema objects. They describe the SchemaNodes needed at training time and at prediction time respectively, together with the dependencies between those SchemaNodes within the GraphSchema.

(1) model_configuration.train_schema

schema_validator: the validate method of the rasa.graph_components.validators.default_recipe_validator.DefaultV1RecipeValidator class
finetuning_validator: the validate method of the rasa.graph_components.validators.finetuning_validator.FinetuningValidator class
nlu_training_data_provider: the provide method of the rasa.graph_components.providers.nlu_training_data_provider.NLUTrainingDataProvider class
train_JiebaTokenizer0: the train method of the rasa.nlu.tokenizers.jieba_tokenizer.JiebaTokenizer class
run_JiebaTokenizer0: the process_training_data method of the rasa.nlu.tokenizers.jieba_tokenizer.JiebaTokenizer class
run_LanguageModelFeaturizer1: the process_training_data method of the rasa.nlu.featurizers.dense_featurizer.lm_featurizer.LanguageModelFeaturizer class
train_DIETClassifier2: the train method of the rasa.nlu.classifiers.diet_classifier.DIETClassifier class
train_ResponseSelector3: the train method of the rasa.nlu.selectors.response_selector.ResponseSelector class

Note: the ResponseSelector class inherits from the DIETClassifier class.

(2) model_configuration.predict_schema

nlu_message_converter: the convert_user_message method of the rasa.graph_components.converters.nlu_message_converter.NLUMessageConverter class
run_JiebaTokenizer0: the process method of the rasa.nlu.tokenizers.jieba_tokenizer.JiebaTokenizer class
run_LanguageModelFeaturizer1: the process method of the rasa.nlu.featurizers.dense_featurizer.lm_featurizer.LanguageModelFeaturizer class
run_DIETClassifier2: the process method of the rasa.nlu.classifiers.diet_classifier.DIETClassifier class
run_ResponseSelector3: the process method of the rasa.nlu.selectors.response_selector.ResponseSelector class
run_RegexMessageHandler: the process method of the rasa.nlu.classifiers.regex_message_handler.RegexMessageHandler class
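One way to picture "nodes plus dependencies" is a mapping from each node name to the nodes it needs, which can then be ordered topologically. The sketch below uses the train-schema node names listed above, but the dependency edges in `train_needs` are assumptions made for illustration (the article does not list them), and `graphlib` from the Python 3.9+ standard library stands in for Rasa's own graph runner.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# node name -> names of the nodes whose output it needs (assumed edges)
train_needs: Dict[str, List[str]] = {
    "schema_validator": [],
    "finetuning_validator": ["schema_validator"],
    "nlu_training_data_provider": ["finetuning_validator"],
    "train_JiebaTokenizer0": ["nlu_training_data_provider"],
    "run_JiebaTokenizer0": ["train_JiebaTokenizer0"],
    "run_LanguageModelFeaturizer1": ["run_JiebaTokenizer0"],
    "train_DIETClassifier2": ["run_LanguageModelFeaturizer1"],
    "train_ResponseSelector3": ["run_LanguageModelFeaturizer1"],
}

# static_order() yields every node after all of the nodes it depends on.
execution_order = list(TopologicalSorter(train_needs).static_order())

for position, node in enumerate(execution_order):
    print(position, node)
```

Under these assumed edges, the two classifier training nodes both depend only on the featurizer output, so their relative order is not fixed by the graph.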

4. tempdir_name

'C:\Users\ADMINI~1\AppData\Local\Temp\tmpg0v179ea'
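The comment in _train_graph() explains that the temporary directory must tolerate having already been deleted by the time the context manager exits. A minimal sketch of that idea follows; `TempDirSketch` is a hypothetical class written for this article, not the source of rasa.utils.common.TempDirectoryPath, and `tempfile.mkdtemp()` is used here simply to obtain a directory.

```python
import os
import shutil
import tempfile


class TempDirSketch(str):
    """A path string that removes its directory on exit, if it still exists."""

    def __enter__(self) -> "TempDirSketch":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        # Checking first avoids an error when the directory is already gone.
        if os.path.exists(self):
            shutil.rmtree(self)
        return False  # do not swallow exceptions raised inside the block


with TempDirSketch(tempfile.mkdtemp()) as temp_model_dir:
    existed_inside = os.path.isdir(temp_model_dir)
    # Simulate something else deleting the directory before the block ends.
    shutil.rmtree(temp_model_dir)

existed_after = os.path.exists(temp_model_dir)
```

Because the object is itself a string, it can be passed directly to `Path(...)`, as _train_graph() does with `Path(temp_model_dir)`.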

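5. trainer = GraphTrainer(*) and trainer.train(*)

The code executed here is the train() method of the GraphTrainer class in rasa/engine/training/graph_trainer.py. It trains and packages the model, then returns the runner for the prediction graph.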
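_train_graph() hands the trainer a cache and, in the dry-run branch, asks it for a fingerprint status. The general idea of fingerprint-based caching can be sketched as follows: hash a node's name, config and input fingerprint, and skip the work when that hash has been seen before. This is only an illustration of the concept; `fingerprint`, `run_node` and the plain dict used as a cache are hypothetical, and Rasa's LocalTrainingCache works differently in its details.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


def fingerprint(node_name: str, config: Dict[str, Any], input_fp: str) -> str:
    """Stable hash of everything that could change a node's output."""
    payload = json.dumps(
        {"node": node_name, "config": config, "input": input_fp}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_node(
    cache: Dict[str, Any],
    node_name: str,
    config: Dict[str, Any],
    input_fp: str,
    fn: Callable[[], Any],
    executed: List[str],
) -> Any:
    """Run fn() only if no cached result exists for this fingerprint."""
    key = fingerprint(node_name, config, input_fp)
    if key in cache:
        return cache[key]
    executed.append(node_name)  # record that real work happened
    cache[key] = fn()
    return cache[key]


cache: Dict[str, Any] = {}
executed: List[str] = []

run_node(cache, "train_DIETClassifier2", {"epochs": 100}, "data-v1", lambda: "model-a", executed)
# Same node, config and input: served from the cache, nothing is re-run.
run_node(cache, "train_DIETClassifier2", {"epochs": 100}, "data-v1", lambda: "model-a", executed)
# Changed config: the fingerprint differs, so the node runs again.
run_node(cache, "train_DIETClassifier2", {"epochs": 200}, "data-v1", lambda: "model-b", executed)
```

In this sketch a dry run would amount to computing the fingerprints and reporting which keys are missing from the cache, without calling `fn()`.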

6. Subclasses of GraphComponent in Rasa

