Baidu Paddle Quick Reference: MNIST Training and Prediction on CPU/GPU, Model Export, Model Reload and Re-prediction, ONNX Export and Inference
Purpose
This article was written to help graduate students and "AI alchemist" algorithm engineers get started with Baidu PaddlePaddle quickly. It assumes the reader already understands basic deep-learning and dataset concepts.
Environment
python 3.7.4
paddlepaddle-gpu 2.2.2
paddle2onnx 0.9.1
onnx 1.9.0
onnxruntime-gpu 1.9.0
Data preparation
The MNIST dataset csv file is a 42000x785 matrix:
42000 is the number of images.
Of the 785 columns, the first is the image's class label (0, 1, 2, ..., 9), and the second through last columns are the image data vector (a 28x28 image flattened into a 784-dimensional vector). The dataset looks like this:
1 0 0 0 0 0 0 0 0 0 ..
0 0 0 0 0 0 0 0 0 0 ..
1 0 0 0 0 0 0 0 0 0 ..
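The label/pixel split described above can be sketched in plain NumPy before touching Paddle; the 3x785 toy matrix here is made up for illustration:

```python
import numpy as np

# A made-up 3x785 toy matrix standing in for the real 42000x785 csv:
# column 0 is the label, columns 1..784 are the flattened 28x28 pixels.
toy = np.zeros((3, 785), dtype=np.int64)
toy[:, 0] = [1, 0, 1]                # labels, as in the sample rows above

labels = toy[:, 0]                   # shape (3,)
pixels = toy[:, 1:]                  # shape (3, 784)
images = pixels.reshape(-1, 28, 28)  # back to 28x28 images

print(labels.shape, pixels.shape, images.shape)
```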
1. Import the required packages
import os
import onnx
import paddle
import numpy as np
import pandas as pd
import onnxruntime as ort
import paddle.nn.functional as F
from paddle.metric import Accuracy
from paddle.static import InputSpec
from sklearn.metrics import accuracy_score
2. Parameter setup
N_EPOCH = 2
N_BATCH = 64
N_BATCH_NUM = 250
S_DATA_PATH = r"mnist_train.csv"
S_PADDLE_MODEL_PATH = r"cnn_model"
S_ONNX_MODEL_PATH = r"cnn_model_batch%d.onnx" % N_BATCH  # note: paddle.onnx.export appends another ".onnx" when saving
S_DEVICE, N_DEVICE_ID, S_DEVICE_FULL = "gpu", 0, "gpu:0"
# S_DEVICE, N_DEVICE_ID, S_DEVICE_FULL = "cpu", 0, "cpu"
paddle.set_device(S_DEVICE_FULL)
Output:
CUDAPlace(0)
3. Load the data
df = pd.read_csv(S_DATA_PATH, header=None)
print(df.shape)
np_mat = np.array(df)
print(np_mat.shape)
X = np_mat[:, 1:]
Y = np_mat[:, 0]
X = X.astype(np.float32) / 255
X_train = X[:N_BATCH * N_BATCH_NUM]
X_test = X[N_BATCH * N_BATCH_NUM:]
Y_train = Y[:N_BATCH * N_BATCH_NUM]
Y_test = Y[N_BATCH * N_BATCH_NUM:]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
print(X_train.shape)
print(Y_train.shape)
print(X_test.shape)
print(Y_test.shape)
class MnistDataSet(paddle.io.Dataset):
    def __init__(self, X, Y):
        self.l_data, self.l_label = [], []
        for i in range(X.shape[0]):
            self.l_data.append(X[i, :, :, :])
            self.l_label.append(Y[i])

    def __getitem__(self, index):
        return self.l_data[index], self.l_label[index]

    def __len__(self):
        return len(self.l_data)
train_loader = paddle.io.DataLoader(MnistDataSet(X_train, Y_train), batch_size=N_BATCH, shuffle=True)
test_loader = paddle.io.DataLoader(MnistDataSet(X_test, Y_test), batch_size=N_BATCH, shuffle=False)
Output
(42000, 785)
(42000, 785)
(16000, 1, 28, 28)
(16000,)
(26000, 1, 28, 28)
(26000,)
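The loader sizes above also fix the step counts seen later in the training log: 16000 training samples at batch 64 give 250 steps per epoch, and the 26000 held-out samples give 407 evaluation steps (the last batch is partial). A quick sketch of the arithmetic, no Paddle needed:

```python
import math

N_BATCH = 64
n_train = 64 * 250        # 16000 training samples, as printed above
n_test = 42000 - n_train  # 26000 held-out samples

train_steps = math.ceil(n_train / N_BATCH)   # 250 full batches
eval_steps = math.ceil(n_test / N_BATCH)     # 407 batches, last one partial

print(train_steps, eval_steps)  # → 250 407
```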
4. Model definition
class Net(paddle.nn.Layer):
    def __init__(self):
        super(Net, self).__init__()
        self.encoder = paddle.nn.Sequential(paddle.nn.Conv2D(1, 16, 3, 1),
                                            paddle.nn.MaxPool2D(2),
                                            paddle.nn.Flatten(1),
                                            paddle.nn.Linear(2704, 128),
                                            paddle.nn.ReLU(),
                                            paddle.nn.Linear(128, 10))

    def forward(self, x):
        out = self.encoder(x)
        return out
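The 2704 in the first Linear layer is not arbitrary: Conv2D(1, 16, 3, 1) with no padding maps 28x28 to 26x26, MaxPool2D(2) halves that to 13x13, and flattening 16 channels of 13x13 gives 16 * 13 * 13 = 2704. The arithmetic in plain Python:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Standard conv/pool output-size formula."""
    return (size + 2 * padding - kernel) // stride + 1

h = conv_out(28, kernel=3)           # Conv2D(1, 16, 3, 1): 28 -> 26
h = conv_out(h, kernel=2, stride=2)  # MaxPool2D(2):        26 -> 13
flat = 16 * h * h                    # 16 channels, flattened

print(flat)  # → 2704, the in_features of the first Linear layer
```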
5. Train and save the model
print("model train")
model = paddle.Model(Net(), InputSpec([None, 1, 28, 28], 'float32', 'x'), InputSpec([None, 1], 'int64', 'label'))
model.prepare(paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters()), paddle.nn.CrossEntropyLoss(), Accuracy())
model.fit(train_loader,
          test_loader,
          epochs=N_EPOCH,
          batch_size=N_BATCH,
          save_dir=S_PADDLE_MODEL_PATH + "_iter",
          verbose=1)
model.save(S_PADDLE_MODEL_PATH + "_final_model")
print()
# model.save(S_PADDLE_MODEL_PATH) # Model save
Output
model train
The loss value printed in the log is the current step, and the metric is the average value of previous steps.
Epoch 1/2
step 250/250 [==============================] - loss: 0.3151 - acc: 0.9073 - 4ms/step
save checkpoint at D:\Document\_Code_Py\ai_fast_handbook\cnn_model_iter\0
Eval begin...
step 407/407 [==============================] - loss: 0.0230 - acc: 0.9330 - 2ms/step
Eval samples: 26000
Epoch 2/2
step 250/250 [==============================] - loss: 0.0744 - acc: 0.9642 - 3ms/step
save checkpoint at D:\Document\_Code_Py\ai_fast_handbook\cnn_model_iter\1
Eval begin...
step 407/407 [==============================] - loss: 0.0614 - acc: 0.9575 - 2ms/step
Eval samples: 26000
save checkpoint at D:\Document\_Code_Py\ai_fast_handbook\cnn_model_iter\final
6. Model evaluation
print("model pred")
model.evaluate(test_loader, batch_size=N_BATCH, verbose=1)
print()
Output
model pred
Eval begin...
step 407/407 [==============================] - loss: 0.0614 - acc: 0.9575 - 2ms/step
Eval samples: 26000
7. Load the saved model and predict with it
print("load model and pred test data")
model_load = paddle.Model(Net(), InputSpec([None, 1, 28, 28], 'float32', 'x'), InputSpec([None, 1], 'int64', 'label'))
# model_load.load(S_PADDLE_MODEL_PATH + "_iter/final")
model_load.load(S_PADDLE_MODEL_PATH + "_final_model")
model_load.prepare(paddle.optimizer.Adam(learning_rate=0.001, parameters=model_load.parameters()), paddle.nn.CrossEntropyLoss(), Accuracy())
model_load.evaluate(test_loader, batch_size=N_BATCH, verbose=1)
print()
Output
load model and pred test data
Eval begin...
step 407/407 [==============================] - loss: 0.0614 - acc: 0.9575 - 2ms/step
Eval samples: 26000
8. Export to ONNX
x_spec = InputSpec([None, 1, 28, 28], 'float32', 'x')
paddle.onnx.export(model.network, S_ONNX_MODEL_PATH, input_spec=[x_spec])  # export the trained network, not a freshly initialized Net()
Output
2022-03-24 08:08:21 [INFO] ONNX model saved in cnn_model_batch64.onnx.onnx
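Note the doubled extension in the log: paddle.onnx.export appends ".onnx" to the path it is given, and S_ONNX_MODEL_PATH already ends in ".onnx", so the file on disk is cnn_model_batch64.onnx.onnx (which the loading code below matches by also appending ".onnx"). A small illustrative helper, hypothetical rather than part of any Paddle API, that avoids the doubled suffix:

```python
import os

def with_onnx_suffix(path: str) -> str:
    """Append ".onnx" only if the path does not already end with it."""
    _root, ext = os.path.splitext(path)
    return path if ext == ".onnx" else path + ".onnx"

print(with_onnx_suffix("cnn_model_batch64"))       # → cnn_model_batch64.onnx
print(with_onnx_suffix("cnn_model_batch64.onnx"))  # → cnn_model_batch64.onnx
```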
9. Load and run the ONNX model
S_DEVICE = "cuda" if S_DEVICE == "gpu" else S_DEVICE
model = onnx.load(S_ONNX_MODEL_PATH + ".onnx")
print(onnx.checker.check_model(model)) # check that the model is well formed; raises on error, returns None (hence the "None" in the output)
print(onnx.helper.printable_graph(model.graph)) # Print a human readable representation of the graph
ls_input_name, ls_output_name = [input.name for input in model.graph.input], [output.name for output in model.graph.output]
print("input name ", ls_input_name)
print("output name ", ls_output_name)
s_input_name = ls_input_name[0]
x_input = X_train[:N_BATCH * 2, :, :, :].astype(np.float32)
ort_val = ort.OrtValue.ortvalue_from_numpy(x_input, S_DEVICE, N_DEVICE_ID)
print("val device ", ort_val.device_name())
print("val shape ", ort_val.shape())
print("val data type ", ort_val.data_type())
print("is_tensor ", ort_val.is_tensor())
print("array_equal ", np.array_equal(ort_val.numpy(), x_input))
providers = 'CUDAExecutionProvider' if S_DEVICE == "cuda" else 'CPUExecutionProvider'
print("providers ", providers)
ort_session = ort.InferenceSession(S_ONNX_MODEL_PATH + ".onnx", providers=[providers]) # run on GPU when available
ort_session.set_providers([providers]) # redundant (already passed to the constructor), but harmless
outputs = ort_session.run(None, {s_input_name: ort_val})
print("sess env ", ort_session.get_providers())
print(type(outputs))
print(outputs[0])
'''
For example ['CUDAExecutionProvider', 'CPUExecutionProvider']
means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.
'''
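The raw outputs below are logits, one row per image and one column per class. To turn them into predicted labels and an accuracy number, argmax over the class axis and compare with the ground truth. A NumPy sketch on made-up logits (the real ones come from ort_session.run above):

```python
import numpy as np

# Made-up logits for 4 images over 10 classes; a single large value
# per row marks the class the "model" is most confident about.
logits = np.zeros((4, 10), dtype=np.float32)
logits[np.arange(4), [3, 1, 4, 1]] = 5.0

y_true = np.array([3, 1, 4, 1])
y_pred = np.argmax(logits, axis=1)          # predicted class per row
accuracy = float(np.mean(y_pred == y_true))

print(y_pred, accuracy)  # → [3 1 4 1] 1.0
```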
Output
None
graph paddle-onnx (
%x[FLOAT, -1x1x28x28]
) {
%conv2d_2.w_0 = Constant[value = <Tensor>]()
%conv2d_2.b_0 = Constant[value = <Tensor>]()
%linear_4.w_0 = Constant[value = <Tensor>]()
%linear_4.b_0 = Constant[value = <Tensor>]()
%linear_5.w_0 = Constant[value = <Tensor>]()
%linear_5.b_0 = Constant[value = <Tensor>]()
%conv2d_3.tmp_0 = Conv[dilations = [1, 1], group = 1, kernel_shape = [3, 3], pads = [0, 0, 0, 0], strides = [1, 1]](%x, %conv2d_2.w_0)
%Constant_0 = Constant[value = <Tensor>]()
%Reshape_0 = Reshape(%conv2d_2.b_0, %Constant_0)
%conv2d_3.tmp_1 = Add(%conv2d_3.tmp_0, %Reshape_0)
%pool2d_0.tmp_0 = MaxPool[kernel_shape = [2, 2], pads = [0, 0, 0, 0], strides = [2, 2]](%conv2d_3.tmp_1)
%Shape_0 = Shape(%pool2d_0.tmp_0)
%Slice_0 = Slice[axes = [0], ends = [1], starts = [0]](%Shape_0)
%Constant_1 = Constant[value = <Tensor>]()
%Concat_0 = Concat[axis = 0](%Slice_0, %Constant_1)
%flatten_3.tmp_0 = Reshape(%pool2d_0.tmp_0, %Concat_0)
%linear_6.tmp_0 = MatMul(%flatten_3.tmp_0, %linear_4.w_0)
%linear_6.tmp_1 = Add(%linear_6.tmp_0, %linear_4.b_0)
%relu_0.tmp_0 = Relu(%linear_6.tmp_1)
%linear_7.tmp_0 = MatMul(%relu_0.tmp_0, %linear_5.w_0)
%linear_7.tmp_1 = Add(%linear_7.tmp_0, %linear_5.b_0)
return %linear_7.tmp_1
}
input name ['x']
output name ['linear_7.tmp_1']
val device cuda
val shape [128, 1, 28, 28]
val data type tensor(float)
is_tensor True
array_equal True
providers CUDAExecutionProvider
sess env ['CUDAExecutionProvider', 'CPUExecutionProvider']
<class 'list'>
[[ 0.763783 -0.16668957 -0.16518936 ... 0.07235195 -0.01643395
0.06049304]
[ 1.8068395 -0.74552214 0.3836273 ... 0.75880224 -0.88902843
0.32921085]
[ 0.2381373 -0.14879732 -0.21634206 ... -0.06579521 -0.461351
0.15305203]
...
[ 0.97004616 0.07693841 0.05774391 ... 0.21991295 0.07179791
-0.22383693]
[ 0.5787286 -0.34370935 -0.12914304 ... -0.03083546 -0.01817408
-0.5147962 ]
[ 0.60808766 -0.12549599 -0.32095248 ... -0.32175955 -0.03176413
-0.06790417]]
My GitHub (which you won't even Star)