Call Paralution Solver from Fortran

Abstract: Paralution is an open source library for sparse iterative methods with special focus on multi-core and accelerator technology such as GPUs. It has a simple fortran interface and not well designed for multiple right-hand-sides. Here we defined a user friendly interface based on ISO_C_BINDING for Fortran FEA programmes to call Paralution solver.

keywords: Paralution, Fortran, sparse iterative method, iso_c_binding, FEA

　　Paralution系由瑞典科研工作者开发的基于多种稀疏矩阵迭代求解方法的C++并行计算程序包。它支持包括OpenMP、GPU/CUDA、OpenCL、OpenMP/MIC等多种后台并行库，也允许通过MPI进行多节点并行加速计算。它是多许可证的开源软件，方便易用，且不依赖特定的硬件平台和软件库，支持主流的Linux、Windows和MacOS操作平台。具体内容可以参考其官方主页：www.paralution.com

　　Paralution附带的算例中有Fortran调用的例子，只是对于有限元程序而言通常求解线性方程组AX=B的时候，右端项B不是向量形式而是矩阵形式。其次，在动力计算中总刚的稀疏结构通常而言并不发生改变，所以迭代求解器和预处理只需进行一次，这也是软件包附带算例没有考虑的。

　　在有限元分析中调用Paralution计算线性稀疏方程组AX=B的一般步骤是：

1、初始化paralution环境；

2、设置求解器并导入总刚；

3、分析求解X；

4、退出并清空paralution环境

具体的介绍详见代码和算例

CPP code:

 #include <paralution.hpp>

 namespace fortran_module{

     // Fortran interface via iso_c_binding

     /*

     Author: T.Q

     Date: 2014.11

     Version: 1.1

     License: GPL v3

     */

     extern "C"

     {

         // Initialize paralution

         void paralution_f_init();

         // Print info of paralution

         void paralution_f_info();

         // Build solver and preconditioner of paralution

         void paralution_f_build(const int[], const int[], const double[], int ,int);

         // Solve Ax=b

         void paralution_f_solve(const double[], double[], int , int&, double&, int&);

         // Stop paralution

         void paralution_f_stop();

         // Subspace method for compute general eigenpairs

         //void paralution_f_subspace();

     }

     int *row = NULL;// Row index

     int *col = NULL;// Column index

     int *row_offset = NULL;// Row offset index for CSR

     double *val = NULL;// Value of sparse matrix

     double *in_x = NULL;// Internal x vector

     double *in_rhs = NULL;// Internal rhs vector

     bool var_clear = false;// Flag of variables clearance

     bool ls_build = false;// Flag of solver exist

     using namespace paralution;

     LocalMatrix<double> *mat_A = NULL;// Matrix A i.e. Stiffness matrix in FEM

     LocalMatrix<double> *mat_B = NULL;// Matrix B i.e. Mass matrix in FEM

     // Iterative Linear Solver and Preconditioner

     IterativeLinearSolver<LocalMatrix<double>, LocalVector<double>, double> *ls = NULL;

     Preconditioner<LocalMatrix<double>, LocalVector<double>, double> *pr = NULL;

     void paralution_f_clear()

     {

         // Clear variables

         if(ls!=NULL){ ls->Clear(); delete ls;}

         if(pr!=NULL){ pr->Clear(); delete pr;}

         if(row!=NULL)free_host(&row);

         if(col!=NULL)free_host(&col);

         if(row_offset!=NULL)free_host(&row_offset);

         if(val!=NULL)free_host(&val);

         if(in_x!=NULL)free_host(&in_x);

         if(in_rhs!=NULL)free_host(&in_rhs);

         if(mat_A!=NULL){ mat_A->Clear(); delete mat_A;}

         if(mat_B!=NULL){ mat_B->Clear(); delete mat_B;}

         var_clear = true;

         ls_build = false;

     }

     void paralution_f_init()

     {

         paralution_f_clear();

         init_paralution();

     }

     void paralution_f_info()

     {

         info_paralution();

     }

     void paralution_f_stop()

     {

         paralution_f_clear();

         stop_paralution();

     }

     void paralution_f_build(const int for_row[], const int for_col[],

         const double for_val[], int n, int nnz)

     {

         int i;

         if(var_clear==false)paralution_f_clear();

         // Allocate arrays

         allocate_host(nnz,&row);

         allocate_host(nnz,&col);

         allocate_host(nnz,&val);

         // Copy sparse matrix data F2C

         for(i=;i<nnz;i++){

             row[i] = for_row[i] - ;// Fortran starts with 1 default

             col[i] = for_col[i] - ;

             val[i] = for_val[i];}

         // Load data

         mat_A = new LocalMatrix<double>;

         mat_A->SetDataPtrCOO(&row,&col,&val,"Imported Fortran COO Matrix", nnz, n, n);

         mat_A->ConvertToCSR();// Manual suggest CSR format

           mat_A->info();

         // Creat Solver and Preconditioner

         ls = new CG<LocalMatrix<double>, LocalVector<double>, double>;

         pr = new Jacobi<LocalMatrix<double>, LocalVector<double>, double>;

         ls->Clear();

         ls->SetOperator(*mat_A);

         ls->SetPreconditioner(*pr);

         ls->Build();

         ls_build = true;

     }

     void paralution_f_solve(const double for_rhs[], double for_x[], int n,

         int &iter, double &resnorm, int &err)

     {

         int i;

         LocalVector<double> x;// Solution vector

         LocalVector<double> rhs;// Right-hand-side vector

         assert(ls_build==true);

         if(in_rhs!=NULL)free_host(&in_rhs);

         if(in_x!=NULL)free_host(&in_x);        

         allocate_host(n,&in_rhs);

         allocate_host(n,&in_x);

         // Copy and set rhs vector

         for(i=;i<n;i++){ in_rhs[i] = for_rhs[i];}

         rhs.SetDataPtr(&in_rhs,"vector",n);

         // Set solution to zero

         x.Allocate("vector",n);

         x.Zeros();

         // Solve

         ls->Solve(rhs,&x);

         // Get solution

         x.LeaveDataPtr(&in_x);

         // Copy to array

         for(i=;i<n;i++){ for_x[i] = in_x[i];}

         // Clear

         x.Clear();

         rhs.Clear();

         // Get information

         iter = ls->GetIterationCount();// iteration times

         resnorm = ls->GetCurrentResidual();// residual

           err = ls->GetSolverStatus();// error code

     }

     // TODO

     // Subspace method for solve general eigenpairs for symmetric real positive

     // defined matrix

     // A*x=lambda*M*x

     // A: matrix N*N i.e. stiffness matrix

     // B: matrix N*N i.e. mass matrix

     //

     void paralution_f_subspace(const int for_row[], const int for_col[],

         const int option[], const double for_stif[], const double for_mass[],

         double eigval[], double eigvec[], int n, int nnz)

     {

     }

 }

Fortran code:

 module paralution

 use,intrinsic::iso_c_binding

 implicit none

 !*******************************************************************************

 !        Fortran module for call Paralution package

 !-------------------------------------------------------------------------------

 !            Author: T.Q.

 !            Date: 2014.11

 !           Version: 1.1

 !*******************************************************************************

 ! License: GPL v3

 !-------------------------------------------------------------------------------

 ! usage:

 !-------------------------------------------------------------------------------

 ! ) call paralution_init

 ! ) call paralution_info (optional)

 ! ) call paralution_build

 ! ) call paralution_solve (support multi-rhs)

 ! ) call paralution_stop

 !*******************************************************************************

 interface

     subroutine paralution_init()bind(c,name='paralution_f_init')

     end subroutine

     subroutine paralution_info()bind(c,name='paralution_f_info')

     end subroutine

     subroutine paralution_stop()bind(c,name='paralution_f_stop')

     end subroutine

     subroutine paralution_build(row,col,val,n,nnz)bind(c,name='paralution_f_build')

     import

     implicit none

     integer(c_int),intent(in),value::n,nnz

     integer(c_int),intent(in)::row(nnz),col(nnz)

     real(c_double),intent(in)::val(nnz)

     end subroutine

     subroutine paralution_solve_vec(rhs,x,n,iter,resnorm,err2)bind(c,name='paralution_f_solve')

     import

     implicit none

     integer(c_int),intent(in),value::n

     real(c_double),intent(in)::rhs(n)

     real(c_double)::x(n)

     integer(c_int)::iter,err2

     real(c_double)::resnorm

     end subroutine

 end interface

 contains

     subroutine paralution_solve(rhs,x,mat_rhs,mat_x,n,iter,resnorm,err2)

     implicit none

     integer(c_int)::n,iter,err2

     real(c_double),intent(in),optional::rhs(:),mat_rhs(:,:)

     real(c_double),optional::x(:),mat_x(:,:)

     real(c_double)::resnorm

     logical::isVec,isMat

     integer::i

     isVec = present(rhs).and.present(x)

     isMat = present(mat_rhs).and.present(mat_x)

     if(isVec.and.(.not.isMat))then

         ! rhs and x both vector

         call paralution_solve_vec(rhs,x,n,iter,resnorm,err2)

     elseif(isMat)then

         ! rhs and x both matrix

         do i=,size(mat_rhs,)

             call paralution_solve_vec(mat_rhs(:,i),mat_x(:,i),n,iter,resnorm,err2)

         enddo

     else

         write(*,*)"Error too many input variables"

     endif

     return

     end subroutine

 end module

 program test

 use paralution

 implicit none

 real(kind=)::a(,),b(,),x(,),val(),res2

 integer::i,j,k,row(),col(),err2

 a = .D0

 a(,[,]) = [.7D0, .13D0]

 a(,[,,]) = [.D0, .02D0, .01D0]

 a(,) = .5D0

 a(,) = .1D0

 a(,::) = [.6D0,.D0,.16D0,.09D0,.52D0,.53D0]

 a(,) = .2D0

 a(,[,]) = [.3D0, .56D0]

 a(,:) = [.6D0, .11D0]

 a(,) = .4D0

 a(,) = .1D0

 write(*,*)"Data initialize"

 do i=,size(a,)

     do j=min(i+,size(a,)),size(a,)

         a(j,i) = a(i,j)

     enddo

 enddo

 k=

 do i=,size(a,)

     do j=,size(a,)

         if(a(i,j)>.D-)then

             row(k) = i

             col(k) = j

             val(k) = a(i,j)

             write(*,*)i,j,val(k)

             k = k +

         endif

     enddo

 enddo

 b(:,) = [.287D0, .22D0, .45D0, .44D0, .486D0, .72D0, .55D0, .424D0, .621D0, .759D0]

 b(:,) = .5D0*b(:,)

 x = .D0

 i =

 j =

 k =

 call paralution_init()

 call paralution_info()

 call paralution_build(row,col,val,i,j)

 do k=,

     call paralution_solve(rhs=b(:,k),x=x(:,k),n=i,iter=j,resnorm=res2,err2=err2)

     write(*,*)"Iter times:",j," relative error:",res2," error code:",err2

     write(*,*)x(:,k)

 enddo

 call paralution_solve(mat_rhs=b,mat_x=x,n=i,iter=j,resnorm=res2,err2=err2)

 do k=,

     write(*,*)"Iter times:",j," relative error:",res2," error code:",err2

     write(*,*)x(:,k)

 enddo

 call paralution_stop()

 end program

result:

  Data initialize

            1           1   1.7000000000000000

            1           9  0.13000000000000000

            2           2   1.0000000000000000

            2           5   2.0000000000000000E-002

            2          10   1.0000000000000000E-002

            3           3   1.5000000000000000

            4           4   1.1000000000000001

            5           2   2.0000000000000000E-002

            5           5   2.6000000000000001

            5           7  0.16000000000000000

            5           8   8.9999999999999997E-002

            5           9  0.52000000000000002

            5          10  0.53000000000000003

            6           6   1.2000000000000000

            7           5  0.16000000000000000

            7           7   1.3000000000000000

            7          10  0.56000000000000005

            8           5   8.9999999999999997E-002

            8           8   1.6000000000000001

            8           9  0.11000000000000000

            9           1  0.13000000000000000

            9           5  0.52000000000000002

            9           8  0.11000000000000000

            9           9   1.3999999999999999

           10           2   1.0000000000000000E-002

           10           5  0.53000000000000003

           10           7  0.56000000000000005

           10          10   3.1000000000000001

 This version of PARALUTION is released under GPL.

 By downloading this package you fully agree with the GPL license.

 Number of CPU cores: 2

 Host thread affinity policy - thread mapping on every core

 PARALUTION ver B0.8.0

 PARALUTION platform is initialized

 Accelerator backend: None

 OpenMP threads:2

 LocalMatrix name=Imported Fortran COO Matrix; rows=10; cols=10; nnz=28; prec=64bit; asm=no; format=CSR; host backend={CPU(OpenMP)}; accelerator backend={None}; current=CPU(OpenMP)

 PCG solver starts, with preconditioner:

 Jacobi preconditioner

 IterationControl criteria: abs tol=1e-15; rel tol=1e-06; div tol=1e+08; max iter=1000000

 IterationControl initial residual = 5.33043

 IterationControl RELATIVE criteria has been reached: res norm=6.83208e-07; rel val=1.28171e-07; iter=6

 PCG ends

  Iter times:           6  relative error:   6.8320832955793363E-007  error code:           2

   0.10000029331886898       0.19999984686691041       0.30000008316594051       0.40000011088792048       0.49999997425190351       0.60000016633188091       0.70000002413346363       0.79999978910736969       0.90000002416832581        1.0000000083185150

 PCG solver starts, with preconditioner:

 Jacobi preconditioner

 IterationControl criteria: abs tol=1e-15; rel tol=1e-06; div tol=1e+08; max iter=1000000

 IterationControl initial residual = 7.99564

 IterationControl RELATIVE criteria has been reached: res norm=1.02481e-06; rel val=1.28171e-07; iter=6

 PCG ends

  Iter times:           6  relative error:   1.0248124943384603E-006  error code:           2

   0.15000043997830351       0.29999977030036568       0.45000012474891066       0.60000016633188091       0.74999996137785507       0.90000024949782143        1.0500000362001956        1.1999996836610549        1.3500000362524884        1.5000000124777721

 PCG solver starts, with preconditioner:

 Jacobi preconditioner

 IterationControl criteria: abs tol=1e-15; rel tol=1e-06; div tol=1e+08; max iter=1000000

 IterationControl initial residual = 5.33043

 IterationControl RELATIVE criteria has been reached: res norm=6.83208e-07; rel val=1.28171e-07; iter=6

 PCG ends

 PCG solver starts, with preconditioner:

 Jacobi preconditioner

 IterationControl criteria: abs tol=1e-15; rel tol=1e-06; div tol=1e+08; max iter=1000000

 IterationControl initial residual = 7.99564

 IterationControl RELATIVE criteria has been reached: res norm=1.02481e-06; rel val=1.28171e-07; iter=6

 PCG ends

  Iter times:           6  relative error:   1.0248124943384603E-006  error code:           2

   0.10000029331886898       0.19999984686691041       0.30000008316594051       0.40000011088792048       0.49999997425190351       0.60000016633188091       0.70000002413346363       0.79999978910736969       0.90000002416832581        1.0000000083185150

  Iter times:           6  relative error:   1.0248124943384603E-006  error code:           2

   0.15000043997830351       0.29999977030036568       0.45000012474891066       0.60000016633188091       0.74999996137785507       0.90000024949782143        1.0500000362001956        1.1999996836610549        1.3500000362524884        1.5000000124777721

Call Paralution Solver from Fortran的更多相关文章

[转载]John Burkardt搜集的FORTRAN源代码
Over the years, I have collected, modified, adapted, adopted or created a number of software package ...
求解轨道力学二体意义下的Lambert方程（兰伯特方程）的Fortran程序
轨道力学中二体问题下求解兰伯特方程. 老外写的Matlab程序,我把它转成了Fortran程序. !************************************************** ...
Leetcode 笔记 36 - Sudoku Solver
题目链接:Sudoku Solver | LeetCode OJ Write a program to solve a Sudoku puzzle by filling the empty cells ...
py-faster-rcnn之从solver文件创建solver对象，建立pythonlayer
faster-rcnn在训练阶段,根据一个solver的prototxt文件创建相应的网络.仅凭一个prototxt就创建网络?其实还涉及到自定义的PythonLayer. 比如lib/rpn/anc ...
【转】Caffe初试（九）solver及其设置
solver算是caffe的核心的核心,它协调着整个模型的运作.caffe程序运行必带的一个参数就是solver配置文件.运行代码一般为 #caffe train --solver=*_solver. ...
Fortran 基础语法（一）
GE是大于等于号(>=),GT是大于号(>),LE是小于等于号(<=),LT是小于号(<) IF基本用法 If(逻辑判断式) then …… End if If(逻辑判断式 ...
Fortran版MPI_barrier出现错误可能情况
在Fortran中的MPI_开头函数都常有一个整数型的错误变量用以函数返回错误信息.如: call MPI_Barrier(MPI_COMM_WORLD,ierr) 在没有ierr参数时,程序可能会出 ...
Pegasos: Primal Estimated sub-GrAdient Solver for SVM
Abstract We describe and analyze a simple and effective iterative algorithm for solving the optimiza ...
Caffe 源碼閱讀（五） Solver.cpp
1.Solver类两个构造函数 Solver(const SolverParameter& param) Solver(const string& param_file) 初始化两个类 ...

随机推荐

iOS——百度统计
百度移动统计(http://mtj.baidu.com)是一款专业的移动应用统计分析工具,支持ios和android平台.开发者可以方便地通过嵌入统计SDK,对移动应用进行全方位监测,实时监控产品表现 ...
IOS开发之——CocoaPods安装和使用 OC和swift通吃
网址一:步骤 http://blog.csdn.net/showhilllee/article/details/38398119 网址二:为什么这么做(原因) http://blog.csdn.net ...
count(*) vs count(1)--social.msdn.microsoft.com
Clever response Dave, but insufficient. I'll admit I've suggested this myself for certain questions ...
[wikioi 1307][poj 2054]欧少堆（乱搞）
题目:http://www.wikioi.com/problem/1307/ 题意:给你一个树,上面有n个节点,每个节点都有一个价值p,求一个n个节点的排列顺序,是的Σi*p[i]最小(要求父节点一定 ...
[wikioi 1519]过路费（最小生成树+树链剖分）
题目:http://www.wikioi.com/problem/1519/ 题意:给你一个连通的无向图,每条边都有权值,给你若干个询问(x,y),要输出从x到y的路径上边的最大值的最小值分析:首先 ...
Pell方程及其一般形式
一.Pell方程形如x^2-dy^2=1的不定方程叫做Pell方程,其中d为正整数,则易得当d是完全平方数的时候这方程无正整数解,所以下面讨论d不是完全平方数的情况. 设Pell方程的最小正整数解为 ...
JavaScript基础---AJAX
内容提纲: 1.XMLHttpRequest 2.GET与POST 3.封装Ajax 发文不易,转载请注明链接出处,谢谢! 2005年Jesse James Garrett发表了一篇文章,标题为:“ ...
每天一个linux命令（37）：free 命令
free命令可以显示Linux系统中空闲的.已用的物理内存及swap内存,及被内核使用的buffer.在Linux系统监控的工具中,free命令是最经常使用的命令之一. 1．命令格式: free [参 ...
iOS7(iPhone4)button不能改变button的title
最近整了个高端的iPhone4测试机,系统是iOS7.1,测出一个问题,两个button,第二个的enable为NO,点击第一个button,第二个的title改变,然而在iPhone4上并不能运行, ...
Freemarker-数字默认格式化问题
freemarker在解析数据格式的时候,默认将数字按3位来分割例如1000被格式化为1,000 这样做看似美观,但在实际操作时候会带来问题.例如我一个页面有一个元素,该元素的值由后台绑定且超过10 ...

Call Paralution Solver from Fortran

Call Paralution Solver from Fortran的更多相关文章

随机推荐

热门专题