How to Use TVM Pass Infra

As the number of optimization passes in Relay/tir grows, executing them and manually maintaining their dependencies becomes intractable. Therefore, an infrastructure has been introduced to manage the optimization passes and make it applicable to different layers of IR in the TVM stack.

The optimizations of a Relay/tir program can be applied at various granularities, namely function-level and module-level, using tvm.relay.transform.FunctionPass/tvm.tir.transform.PrimFuncPass and tvm.transform.ModulePass respectively. Users can also rely on tvm.transform.Sequential to apply a sequence of passes on a Relay/tir program, where the dependencies between passes can be resolved by the pass infra. A minimal sketch of a module-level pass follows.
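As a quick illustration of the module-level granularity, here is a minimal sketch using the tvm.transform.module_pass decorator. The name identity_module_pass is hypothetical and its body is a placeholder that returns the module unchanged; a real pass would rewrite the functions contained in the module.

import tvm

# A minimal sketch of a module-level pass (identity transformation).
# `identity_module_pass` is an illustrative name; a real pass would
# rewrite the functions contained in `mod` before returning it.
@tvm.transform.module_pass(opt_level=2)
def identity_module_pass(mod, ctx):
    return mod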

This article mainly demonstrates how developers can use the pass infra to perform certain optimizations and create an optimization pipeline for a Relay program. The same approach can be used for tir as well.

import numpy as np

import tvm
from tvm import te
import tvm.relay as relay

Create an Example Relay Program

First of all, we create a simple Relay program. The program will be used by the various examples of optimizations in this article. Users can likewise write a tir primitive function and apply tir passes.

def example():
    shape = (1, 64, 54, 54)
    c_data = np.empty(shape).astype("float32")
    c = relay.const(c_data)
    weight = relay.var("weight", shape=(64, 64, 3, 3))
    x = relay.var("x", relay.TensorType((1, 64, 56, 56), "float32"))
    conv = relay.nn.conv2d(x, weight)
    y = relay.add(c, c)
    y = relay.multiply(y, relay.const(2, "float32"))
    y = relay.add(conv, y)
    z = relay.add(y, c)
    z1 = relay.add(y, c)
    z2 = relay.add(z, z1)
    return relay.Function([x, weight], z2)

We register a layout alteration for the conv2d op so that the alter-layout pass can be applied in the example. How the alter layout pass works is out of the scope of this article.

@relay.op.register_alter_op_layout("nn.conv2d", level=101)
def alter_conv2d(attrs, inputs, tinfos, out_type):
    data, weight = inputs
    new_attrs = dict(attrs)
    new_attrs["data_layout"] = "NCHW16c"
    return relay.nn.conv2d(data, weight, **new_attrs)

Optimize the Program

Now we want to optimize the program. Relay features a host of optimizations. We will select some of them to apply to this example program.

There are multiple ways to optimize a Relay program. An example is provided for each of them below.

Manually Apply Optimization Passes

# Let's first create a relay Module which contains one or multiple Relay
# functions for optimization.
f = example()
mod = tvm.IRModule.from_expr(f)

# Now we can apply constant folding on the module.
# fold_const here is a callback that doesn't take any parameters.
fold_const = relay.transform.FoldConstant()

# Then, we can invoke the pass on the given module. Note that the constant
# folding pass works at the function-level. That being said, each function in
# the module will be applied with the optimization. Users don't need to iterate
# through individual functions manually to apply this pass.
mod = fold_const(mod)

# We can see from the updated program that the constants are folded.
print(mod)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

More optimizations can be applied in a similar manner. For instance, we can eliminate the common expressions used by z and z1.

mod = relay.transform.EliminateCommonSubexpr()(mod)
print(mod)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Some optimizations, such as fusion, are parametric as well. For example, opt level 0 will not allow operators to be fused together. Users can pass fuse_opt_level to enable this.

mod = relay.transform.FuseOps(fuse_opt_level=0)(mod)

# We can observe that the optimized module contains functions that only have
# a single primitive op.
print(mod)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %1 = %0(%x, %weight) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = fn (%p01: Tensor[(1, 64, 54, 54), float32], %p11: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p01, %p11) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3 = %2(%1, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %4 = fn (%p02: Tensor[(1, 64, 54, 54), float32], %p12: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p02, %p12) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %5 = %4(%3, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %6 = fn (%p03: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p03, %p03) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %6(%5) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Use Sequential to Apply a Sequence of Passes

Applying passes as above is actually tedious, and it may require users to have a better understanding of the dependencies between them. For example, fusion currently does not work well with let bindings: if relay.transform.ToANormalForm() is applied before fusion, the operators can no longer be fused together, since this pass generates let bindings for each expression to canonicalize a Relay program, as the sketch below illustrates.
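To make this ordering issue concrete, here is a minimal sketch, assuming the example() function defined earlier; the names bad_order and mod_bad are illustrative only.

# A minimal sketch of the ordering pitfall described above, reusing the
# example() function defined earlier. ToANormalForm wraps every expression
# in a let binding, so the FuseOps pass that runs afterwards has nothing
# it can group into fused primitive functions.
bad_order = tvm.transform.Sequential(
    [
        relay.transform.ToANormalForm(),
        relay.transform.FuseOps(fuse_opt_level=2),
    ]
)
mod_bad = bad_order(tvm.IRModule.from_expr(example()))
print(mod_bad)  # the operators remain unfused because of the let bindings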

To relieve developers from handling these issues explicitly, the pass infra introduces tvm.transform.Sequential, which lets a sequence of passes be specified and packaged as a whole. For example, the same passes can now be applied using the sequential style below. tvm.transform.Sequential is similar to torch.nn.Sequential and mxnet.gluon.Block. For instance, torch.nn.Sequential contains a sequence of PyTorch modules that are added together to build a network; it focuses on the network layers. In contrast, the tvm.transform.Sequential in the pass infra works on optimization passes.

# Now let's execute some passes through :py:class:`tvm.transform.Sequential`
f = example()
mod = tvm.IRModule.from_expr(f)
# Glob the interested passes.
seq = tvm.transform.Sequential(
    [
        relay.transform.FoldConstant(),
        relay.transform.EliminateCommonSubexpr(),
        relay.transform.FuseOps(fuse_opt_level=2),
    ]
)
mod1 = seq(mod)
print(mod1)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

From the transformed Relay program, we can see that there are still two identical addition operations. This is because EliminateCommonSubexpr was not actually performed: only passes with optimization level less than or equal to 2 are executed by default under tvm.transform.Sequential. The pass infra, however, provides a configuration interface for users to customize the optimization level they want to execute.

with tvm.transform.PassContext(opt_level=3):
    mod2 = seq(mod)
print(mod2)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Now we can see that only one of the two identical additions is kept.

Users can selectively disable some passes using the disabled_pass configuration, which is similar to the -fno-xxx options used by general-purpose compilers such as Clang and GCC. For example, EliminateCommonSubexpr can be disabled as follows. The printed module will again show two identical addition operations.

with tvm.transform.PassContext(opt_level=3, disabled_pass=["EliminateCommonSubexpr"]):
    mod3 = seq(mod)
print(mod3)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

The passes applied so far are target-independent. The pass infra also provides a means to make passes target-aware. For example, the layout alteration pass falls into this category.

with tvm.transform.PassContext(opt_level=3):
    mod4 = seq(mod)
print(mod4)

seq1 = tvm.transform.Sequential([relay.transform.AlterOpLayout()])
with tvm.transform.PassContext(opt_level=3):
    with tvm.target.Target("llvm"):
        mod5 = seq1(mod)
print(mod5)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = layout_transform(%x, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;
  %1 = nn.conv2d(%0, %weight, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %2 = add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = multiply(%2, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %4 = layout_transform(%3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %5 = add(%1, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %6 = layout_transform(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %7 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %8 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %9 = add(%7, %8) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  layout_transform(%9, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Implement a Pass Using a Python Decorator

The next example illustrates how to orchestrate a customized optimization pipeline through the pass infra using Python decorators. This functionality greatly eases the implementation of passes. For example, users can simply define a decorated class to perform function-level optimizations, as the following example shows. transform_function wraps a class to replace all constants with a multiple of c. Later on, when the customized pass is invoked, each function in the given module will be visited and each constant in the function will be replaced.

@relay.transform.function_pass(opt_level=1)
class CustomPipeline:
    """Simple test function to replace one argument to another."""

    def __init__(self, multiplier):
        self.multiplier = multiplier

    # This function can define a pass.
    def transform_function(self, func, mod, ctx):
        obj = self

        class ReplaceConstant(tvm.relay.ExprMutator):
            def visit_constant(self, c):
                return relay.multiply(obj.multiplier, c)

        return ReplaceConstant().visit(func)

f = example()
mod = tvm.IRModule.from_expr(f)
custom_pass = CustomPipeline(multiplier=relay.const(3, "float32"))
assert custom_pass.info.name == "CustomPipeline"
mod3 = custom_pass(mod)
print(mod3)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = multiply(3f /* ty=float32 */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = multiply(3f /* ty=float32 */, 2f /* ty=float32 */) /* ty=float32 */;
  %4 = multiply(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %5 = add(%0, %4) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %6 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %7 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%6, %7) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Debug a Pass

TVM provides users a plug-and-play style debugging pass: a special pass (PrintIR) dumps the IR of the whole module after a given pass is done. A slightly modified version of the sequential pass example, like the following, enables IR dumping for the FoldConstant optimization.

f = example()
mod = tvm.IRModule.from_expr(f)
seq = tvm.transform.Sequential(
    [
        relay.transform.FoldConstant(),
        tvm.transform.PrintIR(),
        relay.transform.EliminateCommonSubexpr(),
        relay.transform.FuseOps(),
        relay.transform.AlterOpLayout(),
    ]
)

# By inserting the ``PrintIR`` pass after ``FoldConstant``, the pass infra will
# dump out the module IR when ``FoldConstant`` is done. Users can plug in this
# pass after any pass they want to debug for viewing the optimization effect.
#
# There is a more flexible debugging mechanism also exposed by the build configuration
# object. One can pass a tracing function which can be used to execute arbitrary code
# before and/or after each pass. A tracing function will receive a :py:class:`tvm.IRModule`,
# a :py:class:`tvm.transform.PassInfo` object,
# and a boolean indicating whether you are executing before, or after a pass.
# An example is below.


def print_ir(mod, info, is_before):
    """Print the name of the pass and the IR, only before passes execute."""
    if is_before:
        print("Running pass: {}", info)
        print(mod)


with tvm.transform.PassContext(opt_level=3, trace=print_ir):
    with tvm.target.Target("llvm"):
        # Perform the optimizations.
        mod = seq(mod)
print(mod)

print("done")

Output:

Running pass: {} The meta data of the pass: pass name: FoldConstant, opt_level: 2, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);
  %1 = add(meta[relay.Constant][0], meta[relay.Constant][0]);
  %2 = multiply(%1, 2f);
  %3 = add(%0, %2);
  %4 = add(%3, meta[relay.Constant][0]);
  %5 = add(%3, meta[relay.Constant][0]);
  add(%4, %5)
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main() {
  add(meta[relay.Constant][0], meta[relay.Constant][0])
}

Running pass: {} The meta data of the pass: pass name: FuseOps, opt_level: 1, required passes: [InferType]

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p0, %p0)
  };
  %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */)
}

Running pass: {} The meta data of the pass: pass name: ToANormalForm, opt_level: 1, required passes: []

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;
  let %x1 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  let %x2 = %x1(%x);
  %x2
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main() {
  multiply(meta[relay.Constant][0], 2f)
}

Running pass: {} The meta data of the pass: pass name: FuseOps, opt_level: 1, required passes: [InferType]

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  multiply(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    multiply(%p0, %p1)
  };
  %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */)
}

Running pass: {} The meta data of the pass: pass name: ToANormalForm, opt_level: 1, required passes: []

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main() -> Tensor[(1, 64, 54, 54), float32] {
  let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;
  let %x1 = 2f /* ty=float32 */;
  let %x2 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  let %x3 = %x2(%x, %x1);
  %x3
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);
  %1 = add(%0, meta[relay.Constant][0]);
  %2 = add(%1, meta[relay.Constant][1]);
  %3 = add(%1, meta[relay.Constant][1]);
  add(%2, %3)
}

Running pass: {} The meta data of the pass: pass name: PrintIR, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: EliminateCommonSubexpr, opt_level: 3, required passes: [InferType]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %2)
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: FuseOps, opt_level: 1, required passes: [InferType]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]);
    %1 = add(%0, %p2);
    %2 = add(%1, %p3);
    add(%2, %2)
  };
  %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: AlterOpLayout, opt_level: 3, required passes: [InferType]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

Running pass: {} The meta data of the pass: pass name: InferType, opt_level: 0, required passes: []

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %7 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c");
    %1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c");
    %2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c");
    %3 = add(%1, %2);
    %4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c");
    %5 = add(%3, %4);
    %6 = add(%5, %5);
    layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW")
  };
  %7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)
}

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %7 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;
    %1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
    %2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
    %3 = add(%1, %2) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
    %4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
    %5 = add(%3, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
    %6 = add(%5, %5) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
    layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

done

Summary

This article has introduced how we can write and invoke passes in TVM more conveniently using the pass infra. Different ways of invoking a pass were discussed. Using tvm.transform.Sequential can largely help users ease the work of handling multiple optimization passes and their dependencies. An example was also provided to illustrate how to debug a pass using PrintIR and tracing.
