运行以下类似代码:

while True:
inputs, outputs = get_AlexNet()
model = tf.keras.Model(inputs=inputs, outputs=outputs) model.summary() adam_opt = tf.keras.optimizers.Adam(learning_rate)
# The compile step specifies the training configuration.
model.compile(optimizer=adam_opt, loss='categorical_crossentropy', metrics=['accuracy']) # load weights from h5 file
model.load_weights('alexnet_weights.h5')

最后会报错:

OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[9216,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

解决办法:

from keras import backend as K
K.clear_session()

如:

from keras import backend as K

while True:
# 清空之前model占用的内存,防止OOM
K.clear_session() inputs, outputs = get_AlexNet()
model = tf.keras.Model(inputs=inputs, outputs=outputs) model.summary() adam_opt = tf.keras.optimizers.Adam(learning_rate)
# The compile step specifies the training configuration.
model.compile(optimizer=adam_opt, loss='categorical_crossentropy', metrics=['accuracy']) # load weights from h5 file
model.load_weights('alexnet_weights.h5')

详细报错如下:

2019-06-03 21:54:24.789150: W T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:275] Allocator (GPU_0_bfc) ran out of memory trying to allocate 144.00MiB.  Current allocation summary follows.
2019-06-03 21:54:24.804684: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (256): Total Chunks: 243, Chunks in use: 243. 60.8KiB allocated for chunks. 60.8KiB in use in bin. 6.6KiB client-requested in use in bin.
2019-06-03 21:54:24.813190: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (512): Total Chunks: 19, Chunks in use: 19. 14.3KiB allocated for chunks. 14.3KiB in use in bin. 14.3KiB client-requested in use in bin.
2019-06-03 21:54:24.841197: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (1024): Total Chunks: 52, Chunks in use: 52. 62.5KiB allocated for chunks. 62.5KiB in use in bin. 60.6KiB client-requested in use in bin.
2019-06-03 21:54:24.843308: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (2048): Total Chunks: 2, Chunks in use: 2. 5.0KiB allocated for chunks. 5.0KiB in use in bin. 3.0KiB client-requested in use in bin.
2019-06-03 21:54:24.844847: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.846267: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (8192): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.848125: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (16384): Total Chunks: 31, Chunks in use: 31. 511.0KiB allocated for chunks. 511.0KiB in use in bin. 496.0KiB client-requested in use in bin.
2019-06-03 21:54:24.849356: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (32768): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.850511: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (65536): Total Chunks: 16, Chunks in use: 16. 1.43MiB allocated for chunks. 1.43MiB in use in bin. 1.42MiB client-requested in use in bin.
2019-06-03 21:54:24.852015: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (131072): Total Chunks: 23, Chunks in use: 23. 3.72MiB allocated for chunks. 3.72MiB in use in bin. 3.46MiB client-requested in use in bin.
2019-06-03 21:54:24.863147: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (262144): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.864633: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (524288): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.865992: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (1048576): Total Chunks: 17, Chunks in use: 17. 21.15MiB allocated for chunks. 21.15MiB in use in bin. 19.92MiB client-requested in use in bin.
2019-06-03 21:54:24.867384: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (2097152): Total Chunks: 52, Chunks in use: 52. 144.75MiB allocated for chunks. 144.75MiB in use in bin. 137.86MiB client-requested in use in bin.
2019-06-03 21:54:24.868803: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (4194304): Total Chunks: 3, Chunks in use: 3. 17.16MiB allocated for chunks. 17.16MiB in use in bin. 10.13MiB client-requested in use in bin.
2019-06-03 21:54:24.870144: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (8388608): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.871061: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (16777216): Total Chunks: 3, Chunks in use: 2. 62.97MiB allocated for chunks. 42.20MiB in use in bin. 37.19MiB client-requested in use in bin.
2019-06-03 21:54:24.871849: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (33554432): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.874994: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (67108864): Total Chunks: 21, Chunks in use: 21. 1.40GiB allocated for chunks. 1.40GiB in use in bin. 1.31GiB client-requested in use in bin.
2019-06-03 21:54:24.875718: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (134217728): Total Chunks: 20, Chunks in use: 20. 2.98GiB allocated for chunks. 2.98GiB in use in bin. 2.81GiB client-requested in use in bin.
2019-06-03 21:54:24.876800: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (268435456): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-06-03 21:54:24.877455: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:646] Bin for 144.00MiB was 128.00MiB, Chunk State:
2019-06-03 21:54:24.877906: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000B03E00000 of size 1280
2019-06-03 21:54:24.878316: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000B03E00500 of size 256
2019-06-03 21:54:24.879415: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000B03E00600 of size 256
2019-06-03 21:54:24.879816: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000B03E00700 of size 256
...
2019-06-03 21:54:24.998647: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:674] 1 Chunks of size 256733696 totalling 244.84MiB
2019-06-03 21:54:24.998857: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:678] Sum Total of in-use chunks: 4.60GiB
2019-06-03 21:54:24.999076: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:680] Stats:
Limit: 4965636505
InUse: 4943860224
MaxInUse: 4943860224
NumAllocs: 2362778
MaxAllocSize: 516972544 2019-06-03 21:54:24.999520: W T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:279] ********x************************************************************************x*****************x
2019-06-03 21:54:25.001526: W T:\src\github\tensorflow\tensorflow\core\framework\op_kernel.cc:1275] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[9216,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
2019-06-03 21:54:25.108672: W T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 372.96MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-06-03 21:54:25.129713: W T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 482.40MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-06-03 21:54:25.145367: W T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 331.52MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
Traceback (most recent call last):
File "E:/PycharmProjects/ActiveLearning/AlexNet_AL.py", line 156, in <module>
validation_data=(x_val, y_val))
File "C:\taotao\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1363, in fit
validation_steps=validation_steps)
File "C:\taotao\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py", line 264, in fit_loop
outs = f(ins_batch)
File "C:\taotao\Python\Python36\lib\site-packages\tensorflow\python\keras\backend.py", line 2914, in __call__
fetched = self._callable_fn(*array_vals)
File "C:\taotao\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1382, in __call__
run_metadata_ptr)
File "C:\taotao\Python\Python36\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[9216,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: training_4/Adam/gradients/dense/kernel/Regularizer/Square_grad/Mul_1 = Mul[T=DT_FLOAT, _class=["loc:@training_4/Adam/gradients/AddN_5"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense/kernel/Regularizer/Square/ReadVariableOp, training_4/Adam/gradients/dense/kernel/Regularizer/Square_grad/Mul)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. [[Node: metrics_4/acc/Mean/_1023 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1423_metrics_4/acc/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

References

Keras解决OOM超内存问题 -- silent56_th

Keras 循环训练模型跑数据时内存泄漏的问题解决办法 -- jemmie_w

【tf.keras】Resource exhausted: OOM when allocating tensor with shape [9216,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc的更多相关文章

  1. tensorflow报错 tensorflow Resource exhausted: OOM when allocating tensor with shape

    在使用tensorflow的object detection时,出现以下报错 tensorflow Resource exhausted: OOM when allocating tensor wit ...

  2. Resource exhausted: OOM when allocating tensor with shape[3,3,384,384] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0。。。。。

    报错信息: OP_REQUIRES failed at assign_op.h:111 : Resource exhausted: OOM when allocating tensor with sh ...

  3. OP_REQUIRES failed at conv_ops.cc:386 : Resource exhausted: OOM when allocating tensor with shape..

    tensorflow-gpu验证准确率是报错如上: 解决办法: 1. 加入os.environ['CUDA_VISIBLE_DEVICES']='2' 强制使用CPU验证-----慢 2.'batch ...

  4. 【tf.keras】使用手册

    目录 0. 简介 1. 安装 1.1 安装 CUDA 和 cuDNN 2. 数据集 2.1 使用 tensorflow_datasets 导入公共数据集 2.2 数据集过大导致内存溢出 2.3 加载 ...

  5. 【tf.keras】tf.keras使用tensorflow中定义的optimizer

    Update:2019/09/21 使用 tf.keras 时,请使用 tf.keras.optimizers 里面的优化器,不要使用 tf.train 里面的优化器,不然学习率衰减会出现问题. 使用 ...

  6. 显存不够----ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[4096]

    ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[4096] 类似问题 h ...

  7. 【tf.keras】在 cifar 上训练 AlexNet,数据集过大导致 OOM

    cifar-10 每张图片的大小为 32×32,而 AlexNet 要求图片的输入是 224×224(也有说 227×227 的,这是 224×224 的图片进行大小为 2 的 zero paddin ...

  8. 【tf.keras】tf.keras加载AlexNet预训练模型

    目录 从 PyTorch 中导出模型参数 第 0 步:配置环境 第 1 步:安装 MMdnn 第 2 步:得到 PyTorch 保存完整结构和参数的模型(pth 文件) 第 3 步:导出 PyTorc ...

  9. 【tf.keras】实现 F1 score、precision、recall 等 metric

    tf.keras.metric 里面竟然没有实现 F1 score.recall.precision 等指标,一开始觉得真不可思议.但这是有原因的,这些指标在 batch-wise 上计算都没有意义, ...

随机推荐

  1. 【BZOJ 2138】stone

    Problem Description 话说 \(Nan\) 在海边等人,预计还要等上 \(M\) 分钟.为了打发时间,他玩起了石子. \(Nan\) 搬来了 \(N\) 堆石子,编号为 \(1\) ...

  2. Mac Electron 应用的签名(signature)和公证(notarization)

    背景 在MacOS 10.15之前,应用如果没有签名,那么首次打开时就会弹出这种“恶意软件”的提示框. 这时只要应用签名了,就不会弹这个框. 但在MacOS 10.14.5之后,应用如果没有公证(简单 ...

  3. Vue 中的keep-alive 什么用处?

    keep-alive keep-alive是Vue提供的一个抽象组件,用来对组件进行缓存,从而节省性能,由于是一个抽象组件,所以在v页面渲染完毕后不会被渲染成一个DOM元素 <keep-aliv ...

  4. python快速导出sql语句(mssql)的查询结果到Excel,解决SSMS无法加载大字段的问题

    遇到一个尴尬的问题,SSMS的GridView对于大字段的(varchar(max),text之类的),支持不太友好的,超过8000个长度之外的字符,SSMS的表格是显示不出来的(当然也就看不到了), ...

  5. linux 启动jar包 指定yml配置文件和输入日志文件

    命令为: nohup java -jar project.jar  --spring.config.location=/home/project-conf/application.yml >  ...

  6. dependencyManagement

    maven中的继承是在子类工程中加入父类的groupId,artifactId,version并用parent标签囊括 depenentManagement标签作用: 当父类的pom.xml中没有de ...

  7. JS高程中的垃圾回收机制与常见内存泄露的解决方法

    起因是因为想了解闭包的内存泄露机制,然后想起<js高级程序设计>中有关于垃圾回收机制的解析,之前没有很懂,过一年回头再看就懂了,写篇博客与大家分享一下. #内存的生命周期: 分配你所需要的 ...

  8. [专题总结]初探插头dp

    彻彻底底写到自闭的一个专题. 就是大型分类讨论,压行+宏定义很有优势. 常用滚动数组+哈希表+位运算.当然还有轮廓线. Formula 1: 经过所有格子的哈密顿回路数. 每个非障碍点必须有且仅有2个 ...

  9. 【洛谷5369】[PKUSC2018] 最大前缀和(状压DP)

    点此看题面 大致题意: 对于一个序列,求全排列下最大前缀和之和. 状压\(DP\) 考虑如果单纯按照题目中对于最大前缀和的定义,则一个序列它的最大前缀和是不唯一的. 为了方便统计,我们姑且规定,如果一 ...

  10. Googleplaystore数据分析

    本次所用到的数据分析工具:numpy.pandas.matplotlib.seaborn 一.分析目的 假如接下来需要开发一款APP,想了解开发什么类型的APP会更受欢迎,此次分析可以对下一步计划进行 ...