Guide

multi_gpu_model

import numpy as np
import tensorflow as tf
from keras.applications import Xception
from keras.utils import multi_gpu_model

G = 8                                # number of GPUs
batch_size_per_gpu = 32
batch_size = batch_size_per_gpu * G  # global batch size: 256
num_samples = 1000
height = 224
width = 224
num_classes = 1000

# Instantiate the base model (or "template" model).
# We recommend doing this under a CPU device scope,
# so that the model's weights are hosted in CPU memory.
# Otherwise they may end up hosted on a GPU, which would
# complicate weight sharing.
with tf.device('/cpu:0'):
    model = Xception(weights=None,
                     input_shape=(height, width, 3),
                     classes=num_classes)

# Replicate the model on 8 GPUs.
# This assumes that your machine has 8 available GPUs.
parallel_model = multi_gpu_model(model, gpus=G)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer='rmsprop')

# Generate dummy data.
x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))

# This `fit` call will be distributed across 8 GPUs.
# Since the batch size is 256, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=20, batch_size=batch_size)

# Save the model via the template model (which shares the same weights):
model.save('my_model.h5')
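
To make the "256 samples in, 32 per GPU" arithmetic concrete, here is a minimal pure-Python sketch of how a batch can be cut into one contiguous slice per GPU. The helper `split_batch` is hypothetical and is not part of Keras. Giving any leftover samples to the last slice is an assumption made for illustration.

```python
def split_batch(batch_size, num_gpus):
    """Return one (start, end) index pair per GPU, covering the whole batch.

    Hypothetical helper for illustration only; not a Keras API.
    Assumption: leftover samples go to the last slice.
    """
    step = batch_size // num_gpus
    slices = []
    for gpu in range(num_gpus):
        start = gpu * step
        # The last GPU takes whatever is left of the batch.
        end = batch_size if gpu == num_gpus - 1 else start + step
        slices.append((start, end))
    return slices


G = 8
batch_size = 32 * G   # 256, as in the example above
num_samples = 1000

# A full batch: every GPU receives 32 samples.
full_batch = split_batch(batch_size, G)

# The last batch of an epoch is smaller: 1000 - 3 * 256 = 232 samples.
last_batch_size = num_samples % batch_size
last_batch = split_batch(last_batch_size, G)

for gpu, (start, end) in enumerate(full_batch):
    print(f"GPU {gpu}: samples {start}..{end - 1} ({end - start} samples)")
```

With 1000 samples and a global batch size of 256, the final batch of each epoch holds 232 samples, so each of the 8 slices covers 29 samples rather than 32.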

Results

The results below are from "Multi-GPU training with Keras, Python, and deep learning" on Onepanel.io.

To validate this, we trained MiniGoogLeNet on the CIFAR-10 dataset with 4 V100 GPUs.

Using a single GPU, we obtained 63-second epochs with a total training time of 74m10s.

With multi-GPU training in Keras and Python, epochs dropped to 16 seconds and total training time to 19m3s.

That is roughly a 4x speedup!
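
The speedup can be checked directly from the timings quoted above. The sketch below only does the arithmetic. The names `to_seconds` and `scaling_efficiency` are introduced here for illustration and do not come from the original benchmark.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' duration to seconds."""
    return minutes * 60 + seconds


num_gpus = 4

# Timings reported above.
single_gpu_epoch = 63                 # seconds per epoch on 1 GPU
multi_gpu_epoch = 16                  # seconds per epoch on 4 GPUs
single_gpu_total = to_seconds(74, 10)  # 74m10s
multi_gpu_total = to_seconds(19, 3)    # 19m3s

epoch_speedup = single_gpu_epoch / multi_gpu_epoch
total_speedup = single_gpu_total / multi_gpu_total

# Fraction of the ideal (linear) speedup actually achieved.
scaling_efficiency = total_speedup / num_gpus

print(f"per-epoch speedup:  {epoch_speedup:.2f}x")
print(f"total-time speedup: {total_speedup:.2f}x")
print(f"scaling efficiency: {scaling_efficiency:.0%}")
```

Both ratios come out slightly under 4, so the reported figure is close to linear scaling on 4 GPUs.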

Reference

History

20190910: created.

Copyright

Post author: kezunlin
Post link: https://kezunlin.me/post/95370db7/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 3.0 unless otherwise stated.
