As mentioned in the previous article, we found some valuable explanatory comments in rte.h:

Let's go through them block by block. The first block concerns the process name objects:

 * (a) Process name objects and operations											// process name objects
* 1. Definitions for integral types ompi_jobid_t and ompi_vpid_t.
* The jobid must be unique for a given MPI_COMM_WORLD capable of
* connecting to another OMPI_COMM_WORLD and the vpid will be the
* process's rank in MPI_COMM_WORLD.
* 2. ompi_process_name_t - a struct that must contain at least two integer-typed fields:
* a. ompi_jobid_t jobid // job ID; this essentially identifies an MPI_COMM_WORLD, so that different WORLDs can connect to each other
* b. ompi_vpid_t vpid // the process's rank within MPI_COMM_WORLD
* Note that the structure can contain any number of fields beyond these
* two, so the process name struct for any particular RTE can be whatever
* is desired.
* 3. OMPI_NAME_PRINT - a macro that prints a process name when given // macro for printing a process name
* a pointer to ompi_process_name_t. The output format is to be
* a single string representing the name. This function should
* be thread-safe for multiple threads to call simultaneously.
* 4. OMPI_PROC_MY_NAME - a pointer to a global variable containing
* the ompi_process_name_t for this process. Typically, this is
* stored as a field in the ompi_process_info_t struct, but that
* is not a requirement.
* 5. OMPI_NAME_WILDCARD - a wildcard name.
* 6. ompi_rte_compare_name_fields - a function used to compare fields
* in the ompi_process_name_t struct. The function prototype must be
* of the form:
* int ompi_rte_compare_name_fields(ompi_rte_cmp_bitmask_t mask,
* ompi_process_name_t *name1,
* ompi_process_name_t *name2);
* The bitmask must be defined to indicate the fields to be used
* in the comparison. Fields not included in the mask must be ignored.
* Supported bitmask values must include:
* b. OMPI_RTE_CMP_JOBID
* c. OMPI_RTE_CMP_VPID
* d. OMPI_RTE_CMP_ALL
* 7. uint64_t ompi_rte_hash_name(name) - return a string hash uniquely
* representing the ompi_process_name passed in.
* 8. OMPI_NAME - an Opal DSS constant for a handler already registered // DSS (OPAL's data support subsystem) handles the serialization/deserialization
* to serialize/deserialize an ompi_process_name_t structure.
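
To make the description above concrete, here is a minimal sketch of what such a name type and comparison function could look like. Everything except the names quoted in the header (ompi_jobid_t, ompi_vpid_t, ompi_process_name_t, the OMPI_RTE_CMP_* masks) is an illustrative assumption; the real definitions are supplied by whichever RTE component is in use.

```c
#include <stdint.h>

/* Hypothetical sketch -- the actual typedefs come from the RTE component. */
typedef uint32_t ompi_jobid_t;   /* identifies one MPI_COMM_WORLD */
typedef uint32_t ompi_vpid_t;    /* rank within that MPI_COMM_WORLD */

typedef struct {
    ompi_jobid_t jobid;
    ompi_vpid_t  vpid;
    /* an RTE may append any extra fields it needs */
} ompi_process_name_t;

/* Assumed bitmask values, for illustration only. */
typedef uint8_t ompi_rte_cmp_bitmask_t;
#define OMPI_RTE_CMP_JOBID 0x01
#define OMPI_RTE_CMP_VPID  0x02
#define OMPI_RTE_CMP_ALL   (OMPI_RTE_CMP_JOBID | OMPI_RTE_CMP_VPID)

/* Compare only the fields selected by the mask; other fields are ignored. */
int ompi_rte_compare_name_fields(ompi_rte_cmp_bitmask_t mask,
                                 ompi_process_name_t *n1,
                                 ompi_process_name_t *n2)
{
    if ((mask & OMPI_RTE_CMP_JOBID) && n1->jobid != n2->jobid) {
        return n1->jobid < n2->jobid ? -1 : 1;
    }
    if ((mask & OMPI_RTE_CMP_VPID) && n1->vpid != n2->vpid) {
        return n1->vpid < n2->vpid ? -1 : 1;
    }
    return 0;   /* equal in all compared fields */
}
```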

  

The second block is about collective objects and the exchange of information:

  * (b) Collective objects and operations													// collective objects
* 1. ompi_rte_collective_t - an OPAL object used during RTE collective operations // the modex requires each process to "discover" the interconnect contact information of all other processes in the job
* such as modex and barrier. It must be an opal_list_item_t and contain the
* following fields:
* a. id (ORTE type: int32_t)
* b. bool active
* flag that user can poll on to know when collective
* has completed - set to false just prior to
* calling user callback function, if provided
* 2. ompi_rte_modex - a function that performs an exchange of endpoint information // exchange of endpoint info among processes across nodes
* to wireup the MPI transports. The function prototype must be of the form:
* int ompi_rte_modex(ompi_rte_collective_t *coll);
* At the completion of the modex operation, the coll->active flag must be set
* to false, and the endpoint information must be stored in the modex database. // the modex information is stored in a database
* This function must have barrier semantics across the MPI_COMM_WORLD of the
* calling process.
* 3. ompi_rte_barrier - a function that performs a barrier operation within the
* RTE. The function prototype must be of the form:
* int ompi_rte_barrier(ompi_rte_collective_t *coll);
* At the completion of the barrier operation, the coll->active flag must be set
* to false
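
The active flag described above suggests a simple usage pattern inside ompi_mpi_init: kick off the collective, then let the runtime make progress until the RTE flips the flag to false. A hedged sketch of that pattern follows; the collective id and ompi_rte_barrier come from the header comment, while OBJ_NEW/OBJ_RELEASE and opal_progress() are assumed to be the usual OPAL object and progress helpers.

```c
/* Hedged sketch of the polling pattern during MPI_Init. */
static int wait_on_init_barrier(void)
{
    ompi_rte_collective_t *coll = OBJ_NEW(ompi_rte_collective_t);
    coll->id = ompi_process_info.peer_init_barrier;

    int rc = ompi_rte_barrier(coll);
    if (OMPI_SUCCESS != rc) {
        OBJ_RELEASE(coll);
        return rc;
    }

    /* the RTE flips active to false once the barrier has completed */
    while (coll->active) {
        opal_progress();   /* let the runtime make progress */
    }

    OBJ_RELEASE(coll);
    return OMPI_SUCCESS;
}
```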

  For more details on the modex operation, the only reference I could find is https://github.com/open-mpi/ompi/wiki/ModexlessLaunch

Launching an MPI job typically requires not only spawning the individual processes on their nodes, but also having each process "discover" the interconnect contact information of every other process in the job. The default launch mechanism that accomplishes this latter step is called the "modex" and consists of several steps:

1. At startup, each process opens every interface driver to query the interfaces available on the local node.
2. Drivers that do have interfaces register a modex entry containing their interface contact information.
3. The processes in the job then perform a collective operation to exchange their individual contact information. This is a blocking operation that takes place during MPI_Init.

In other words, MPI_Init performs an exchange of information among all processes, and this exchange comes with synchronization guarantees.
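
Mapping those three steps onto the interface quoted above gives roughly the sequence below. Only the function and field names are taken from the header comment; the key string "btl.tcp.addrs", the endpoint buffer, and the OPAL_BYTE_OBJECT type are illustrative assumptions.

```c
/* Hedged sketch of the modex sequence; everything not named in the
 * header comment is illustrative. */
static void modex_sketch(const void *my_endpoint_info,
                         struct ompi_proc_t *some_peer)
{
    /* Step 2: publish this process's interface contact information
     * (the key string and data type are hypothetical). */
    ompi_rte_db_store(OMPI_PROC_MY_NAME, "btl.tcp.addrs",
                      my_endpoint_info, OPAL_BYTE_OBJECT);

    /* Step 3: the blocking, job-wide exchange performed inside MPI_Init. */
    ompi_rte_collective_t *coll = OBJ_NEW(ompi_rte_collective_t);
    coll->id = ompi_process_info.peer_modex;
    ompi_rte_modex(coll);
    while (coll->active) {          /* barrier semantics across MPI_COMM_WORLD */
        opal_progress();
    }
    OBJ_RELEASE(coll);

    /* Afterwards any peer's endpoint can be read from the modex database. */
    void *peer_info = NULL;
    ompi_rte_db_fetch(some_peer, "btl.tcp.addrs",
                      &peer_info, OPAL_BYTE_OBJECT);
}
```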

The third block: the process info struct

Each process has one of these structs; it records, among other things, the process's rank on its local node and how many peers share that node.

 * (c) Process info struct
* 1. ompi_process_info_t - a struct containing info about the current process. // information about the current process
* The struct must contain at least the following fields:
* a. app_num -
* b. pid - this process's pid. Should be same as getpid().
* c. num_procs - Number of processes in this job (ie, MCW) // number of processes in the job; a job may span several nodes, each running several processes
* d. my_node_rank - relative rank on local node to other peers this run-time // rank on the local node
* instance knows about. If doing dynamics, this may be something
* different than my_local_rank, but will be my_local_rank in a
* static job.
* d. my_local_rank - relative rank on local node with other peers in this job (ie, MCW) // process rank on the local node
* e. num_local_peers - Number of local peers (peers in MCW on your node) // number of processes on the local node
* f. my_hnp_uri -
* g. peer_modex - a collective id for the modex operation // modex = the endpoint-information exchange described in block (b)
* h. peer_init_barrier - a collective id for the barrier during MPI_Init
* i. peer_fini_barrier - a collective id for the barrier during MPI_Finalize
* j. job_session_dir -
* k. proc_session_dir -
* l. nodename - a string representation for the name of the node this
* process is on
* m. cpuset -
* 2. ompi_process_info - a global instance of the ompi_process_t structure.
* 3. ompi_rte_proc_is_bound - global boolean that will be true if the runtime bound
* the process to a particular core or set of cores and is false otherwise.
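
As a quick illustration, these fields can be read straight off the global struct once the runtime is up. This is a hedged sketch that assumes the field names are exactly those listed above and that it runs after ompi_rte_init():

```c
#include <stdio.h>

/* Hedged sketch: assumes the global ompi_process_info has already been
 * populated by the RTE and that the fields are integer/string typed as
 * the header comment suggests. */
static void print_process_info(void)
{
    printf("job size %u, local rank %u, %u local peers, node %s\n",
           (unsigned) ompi_process_info.num_procs,
           (unsigned) ompi_process_info.my_local_rank,
           (unsigned) ompi_process_info.num_local_peers,
           ompi_process_info.nodename);

    if (ompi_rte_proc_is_bound) {
        printf("runtime bound this process to cpuset %s\n",
               ompi_process_info.cpuset);
    }
}
```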

  

The fourth block: init and finalize operations (the header labels it (e); block (d), which covers error handling, is not discussed here):

  * (e) Init and finalize objects and operations
* 1. ompi_rte_init - a function to initialize the RTE. The function
* prototype must be of the form:
* int ompi_rte_init(int *argc, char ***argv);
* 2. ompi_rte_finalize - a function to finalize the RTE. The function
* prototype must be of the form:
* int ompi_rte_finalize(void);
* 3. void ompi_rte_wait_for_debugger(void) - Called during MPI_Init, this
* function is used to wait for debuggers to do their pre-MPI attach.
* If there is no attached debugger, this function will not block.
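
In other words, ompi_mpi_init and ompi_mpi_finalize are expected to bracket their work with these calls, roughly as in the hedged sketch below (error handling and all of the real initialization work are omitted, and the *_sketch function names are made up):

```c
/* Hedged sketch of how the RTE init/finalize hooks would be used. */
int ompi_mpi_init_sketch(int *argc, char ***argv)
{
    /* 1. bring up the runtime environment */
    if (OMPI_SUCCESS != ompi_rte_init(argc, argv)) {
        return OMPI_ERROR;
    }

    /* 3. give an attached debugger a chance to set itself up;
     *    returns immediately when no debugger is present */
    ompi_rte_wait_for_debugger();

    /* ... modex, barriers, transport setup, etc. would happen here ... */
    return OMPI_SUCCESS;
}

int ompi_mpi_finalize_sketch(void)
{
    /* 2. tear the runtime environment back down */
    return ompi_rte_finalize();
}
```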

  

The fifth block: database operations:

  * (f) Database operations
* 1. ompi_rte_db_store - a function to store modex and other data in // modex entries get inserted into the database
* a local database. The function is primarily used for storing modex
* data, but can be used for general purposes. The prototype must be
* of the form:
* int ompi_rte_db_store(const ompi_process_name_t *proc,
* const char *key, const void *data,
* opal_data_type_t type);
* The implementation of this function must store a COPY of the data
* provided - the data is NOT guaranteed to be valid after return
* from the call.
* 3. ompi_rte_db_fetch -
* NOTE: Fetch accepts an 'ompi_proc_t'.
* int ompi_rte_db_fetch(const struct ompi_proc_t *proc,
* const char *key,
* void **data,
* opal_data_type_t type);
* 4. ompi_rte_db_fetch_pointer -
* NOTE: Fetch accepts an 'ompi_proc_t'.
* int ompi_rte_db_fetch_pointer(const struct ompi_proc_t *proc,
* const char *key,
* void **data,
* opal_data_type_t type);
* 5. Pre-defined db keys (with associated values after rte_init)
* a. OMPI_DB_HOSTNAME
* b. OMPI_DB_LOCALITY
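
One practical consequence of the "must store a COPY" requirement: the caller may hand ompi_rte_db_store a temporary buffer and reuse it immediately, while ompi_rte_db_fetch_pointer presumably only lends out the database's own storage. A hedged fragment using one of the pre-defined keys (peer_proc, the error handling, and the ownership of the fetched string are assumptions):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* gethostname */

/* Hedged sketch: OMPI_DB_HOSTNAME and the function names come from the
 * header comment; everything else is illustrative. */
static void publish_and_lookup_hostname(struct ompi_proc_t *peer_proc)
{
    /* Store: the database must keep a COPY, so a stack buffer is fine. */
    char hostname[256];
    gethostname(hostname, sizeof(hostname));
    ompi_rte_db_store(OMPI_PROC_MY_NAME, OMPI_DB_HOSTNAME,
                      hostname, OPAL_STRING);
    /* hostname can be reused or discarded immediately */

    /* Fetch: note that the fetch side takes an ompi_proc_t*, not a name. */
    char *peer_host = NULL;
    if (OMPI_SUCCESS == ompi_rte_db_fetch(peer_proc, OMPI_DB_HOSTNAME,
                                          (void **)&peer_host, OPAL_STRING)) {
        printf("peer lives on node %s\n", peer_host);
        free(peer_host);   /* assuming the fetched copy is owned by the caller */
    }
}
```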

  

Honestly, this article does not provide much substantive information; it only roughly points out some directions to explore.

Recall that when we analyzed MPI_Init earlier, we never actually stepped into the real ompi_mpi_init function; next time we will try to dig into it.

The simplest MPI program consists of just two calls: MPI_Init() and MPI_Finalize() (a complete minimal example is sketched below).
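
For reference, the complete minimal program looks like this (plain standard MPI, nothing OpenMPI-specific):

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);   /* everything discussed above happens in here */
    MPI_Finalize();
    return 0;
}
```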

The initialization must be doing a great deal of work, including initializing process information, exchanging information between processes, and so on; we will explore it step by step.
