Parallel NetCDF API

All C functions carry the prefix ncmpi, and all Fortran functions carry the prefix nfmpi.
Every function returns an integer NetCDF status value.

1. Variable and Parameter Types

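The functions use the MPI_Offset type for size arguments. Unlike size_t, which is only 32 bits wide on 32-bit platforms, MPI_Offset is typically a 64-bit integer, so data sizes are practically unlimited.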

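Scalars and vectors that describe a variable's starting indices (start), the length along each dimension (count), and the stride between elements (stride) must all be declared as MPI_Offset.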
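To make the role of start and count concrete, the sketch below splits a one-dimensional global array as evenly as possible across processes. It is plain C so that it compiles without MPI: `offset64` is a hypothetical stand-in for MPI_Offset, and `block_partition` is an illustrative helper, not part of PnetCDF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for MPI_Offset so the sketch builds without MPI. */
typedef int64_t offset64;

/* Split global_len elements over nprocs ranks as evenly as possible.
 * The first (global_len % nprocs) ranks receive one extra element. */
static void block_partition(offset64 global_len, int nprocs, int rank,
                            offset64 *start, offset64 *count)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    offset64 base  = global_len / nprocs;  /* minimum share per rank */
    offset64 extra = global_len % nprocs;  /* leftover elements */

    *count = base + (rank < extra ? 1 : 0);
    *start = rank * base + (rank < extra ? rank : extra);
}
```

In real code the results would be declared as MPI_Offset and passed as the start and count arguments of a call such as ncmpi_put_vara_int_all. With a global length of 5,000,000,000 elements the offsets already exceed what a 32-bit integer can hold, which is the case the 64-bit type is meant to cover.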

2. Dataset Functions

Besides the MPI communicator, ncmpi_create and ncmpi_open take an additional MPI_Info argument, which is mainly used to pass hints. Passing MPI_INFO_NULL skips this feature.

int ncmpi_create(MPI_Comm comm,
                 const char *path,
                 int cmode,
                 MPI_Info info,
                 int *ncidp);

int ncmpi_open(MPI_Comm comm,
               const char *path,
               int omode,
               MPI_Info info,
               int *ncidp);

3. Define Mode Functions

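All processes must call these functions with the same values. When the definitions are complete, the definitions made by all processes are checked and compared; if they differ, ncmpi_enddef returns an error code.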

4. Inquiry Functions

Inquiry functions can be called in either define mode or data mode.

5. Attribute Functions

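Attributes store scalar or vector values in a NetCDF file, mainly to describe variables.

In the original interface, attribute functions can be called in either define mode or data mode. However, modifying an attribute's value in data mode may fail, mainly because the space the file requires may change.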

6. Data Mode Functions

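Data mode has two states: collective mode and independent mode. After ncmpi_enddef or ncmpi_open is called, the file automatically enters collective mode.

In collective mode, all processes must call the same function at the same point in the code, although arguments such as start, count, and stride may differ between processes. In independent mode, processes do not have to make API calls together.

Independent mode cannot be entered from define mode. The program must first call ncmpi_enddef to leave define mode and enter data mode.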

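Data mode functions fall into two categories. The first mimics the traditional NetCDF functions and makes it simple to migrate from the traditional NetCDF interface to the parallel NetCDF interface. We call it the high level data mode interface.

The second category uses more MPI functionality to describe in-memory data more flexibly and to expose the capabilities of MPI-IO to the application more fully. All functions of the first category are implemented in terms of this second category. We call it the flexible data mode interface.

Both categories provide independent and collective operations. The names of collective functions end in _all, and all processes must call such a function together.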

6.1. High Level Data Mode Interface

Each individual function closely resembles its counterpart in the NetCDF data mode interface. The main change is that MPI_Offset replaces size_t.

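ncmpi_put_var_<type> writes all values of a variable to the NetCDF file;
ncmpi_put_vara_<type> writes a subarray whose starting position is given by the start vector and whose length along each dimension is given by count;
ncmpi_put_vars_<type> writes a strided subarray given by start and count, with stride specifying the interval along each dimension;
ncmpi_put_varm_<type> writes a mapped subarray: in addition to start, count, and stride, an imap vector describes how the values are laid out in memory.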

6.2. Flexible Data Mode Interface

6.3. Mapping Between NetCDF and MPI Types

7. Q & A

For more details, please refer to Parallel netCDF Q&A

Q: How do I use the buffered nonblocking write APIs?

A: Buffered nonblocking write APIs copy the contents of user buffers into an internally allocated buffer, so the user buffers can be reused immediately after the calls return. A typical way to use these APIs is described below.

First, tell PnetCDF how much space can be allocated to be used by the APIs.
Make calls to the buffered put APIs.
Make calls to the (collective) wait APIs.
Free the space allocated by the internal buffer.

For further information about the buffered nonblocking APIs, readers are referred to this page.

Q: What is the difference between collective and independent APIs?

A: Collective APIs require all MPI processes to participate in the call. This requirement allows MPI-IO and PnetCDF to coordinate the requesting processes and rearrange their requests into a form that achieves the best performance from the underlying file system. In contrast, independent APIs (also referred to as non-collective) have no such requirement. All PnetCDF collective APIs (except create, open, and close) have the suffix _all, which distinguishes them from their independent counterparts. To switch from collective data mode to independent mode, users must call ncmpi_begin_indep_data. Calling ncmpi_end_indep_data exits independent mode.

Q: Should I use collective APIs or independent APIs?

A: Users are encouraged to use collective APIs whenever possible. Collective API calls require the participation of all MPI processes that opened the shared file. This requirement allows MPI-IO and PnetCDF to coordinate the requesting processes and rearrange their requests into a form that achieves the best performance from the underlying file system. If the nature of the user's I/O does not permit collective calls (for example, the number of requests differs among processes or is determined only at run time), we recommend the following.

Force all processes to participate in the collective calls. When a process has nothing to request, it can still call a collective API with a zero-length request, which is achieved by setting the contents of the count argument to zero.
Use nonblocking APIs. Individual processes can make any number of calls to nonblocking APIs independently of other processes. At the end, a collective wait call, ncmpi_wait_all, is recommended so that all nonblocking requests are committed to the file system.

Summary: prefer the collective APIs, and try to use them even when the access pattern does not fit them naturally.

8. Example

/*********************************************************************
 *
 *  Copyright (C) 2012, Northwestern University and Argonne National Laboratory
 *  See COPYRIGHT notice in top-level directory.
 *
 *********************************************************************/
/* $Id$ */

/* simple demonstration of pnetcdf
 * text attribute on dataset
 * write out rank into 1-d array collectively.
 * The most basic way to do parallel i/o with pnetcdf */

/* This program creates a file, say named output.nc, with the following
   contents, shown by running ncmpidump command .

    % mpiexec -n 4 pnetcdf-write-standard /orangefs/wkliao/output.nc
    % ncmpidump /orangefs/wkliao/output.nc
    netcdf output {
    // file format: CDF-2 (large file)
    dimensions:
            d1 = 4 ;
            time = UNLIMITED ; // (2 currently)
    variables:
            int v1(time, d1) ;
            int v2(d1) ;

    // global attributes:
                :string = "Hello World\n",
        "" ;
    data:

         v1 =
            0, 1, 2, 3,
            1, 2, 3, 4 ;

         v2 = 0, 1, 2, 3 ;
    }
*/

#include <stdlib.h>
#include <mpi.h>
#include <pnetcdf.h>
#include <stdio.h>

static void handle_error(int status, int lineno)
{
    fprintf(stderr, "Error at line %d: %s\n", lineno, ncmpi_strerror(status));
    MPI_Abort(MPI_COMM_WORLD, 1);
}

int main(int argc, char **argv) {
    int ret, ncfile, nprocs, rank, dimid1, dimid2, varid1, varid2, ndims;
    MPI_Offset start, count=1;
    int t, i;
    int v1_dimid[2];
    MPI_Offset v1_start[2], v1_count[2];
    int v1_data[4];
    char buf[13] = "Hello World\n";
    int data;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (argc != 2) {
        if (rank == 0) printf("Usage: %s filename\n", argv[0]);
        MPI_Finalize();
        exit(-1);
    }

    ret = ncmpi_create(MPI_COMM_WORLD, argv[1],
                       NC_CLOBBER, MPI_INFO_NULL, &ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ret = ncmpi_def_dim(ncfile, "d1", nprocs, &dimid1);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ret = ncmpi_def_dim(ncfile, "time", NC_UNLIMITED, &dimid2);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    v1_dimid[0] = dimid2;
    v1_dimid[1] = dimid1;
    ndims = 2;
    ret = ncmpi_def_var(ncfile, "v1", NC_INT, ndims, v1_dimid, &varid1);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ndims = 1;
    ret = ncmpi_def_var(ncfile, "v2", NC_INT, ndims, &dimid1, &varid2);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ret = ncmpi_put_att_text(ncfile, NC_GLOBAL, "string", 13, buf);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* all processors defined the dimensions, attributes, and variables,
     * but here in ncmpi_enddef is the one place where metadata I/O
     * happens.  Behind the scenes, rank 0 takes the information and writes
     * the netcdf header.  All processes communicate to ensure they have
     * the same (cached) view of the dataset */
    ret = ncmpi_enddef(ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    start=rank, count=1, data=rank;
    ret = ncmpi_put_vara_int_all(ncfile, varid2, &start, &count, &data);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    for (t = 0; t<2; t++){
        v1_start[0] = t, v1_start[1] = rank;
        v1_count[0] = 1, v1_count[1] = 1;
        for (i = 0; i<4; i++){
            v1_data[i] = rank+t;
        }

        /* in this simple example every process writes its rank to two 1d variables */
        ret = ncmpi_put_vara_int_all(ncfile, varid1, v1_start, v1_count, v1_data);
        if (ret != NC_NOERR) handle_error(ret, __LINE__);
    }

    ret = ncmpi_close(ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    MPI_Finalize();
    return 0;
}
