Selecting GPUs in TensorFlow on a Multi-GPU Machine

Published: 2023/11/28


Command to monitor GPU usage continuously (refreshing every 10 seconds):

$ watch -n 10 nvidia-smi
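
If you would rather log the same numbers from a script than watch a terminal, nvidia-smi can also print CSV. The following is a minimal sketch, assuming nvidia-smi is on the PATH; the helper names parse_gpu_usage and query_gpu_usage are made up for this example, and the sample string is hand-written rather than real output.

```python
import subprocess

# Query flags that ask nvidia-smi for machine-readable output (values in MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_usage(csv_text):
    """Turn 'index, used, total' CSV rows into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        rows.append({"index": index, "used_mib": used, "total_mib": total})
    return rows


def query_gpu_usage():
    """Run nvidia-smi once and parse its output (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_usage(result.stdout)


# Parse a hand-written sample so the sketch runs on a machine without a GPU.
sample = "0, 1024, 11441\n1, 0, 11441\n"
print(parse_gpu_usage(sample))
```

On a GPU machine, calling query_gpu_usage() in a loop with time.sleep(10) would approximate the watch command above.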
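Selecting a specific GPU
If a machine has several GPUs, TensorFlow by default grabs all the available memory on every one of them. When several people in a lab share one server, you will usually want to restrict your job to one particular GPU.
To do so, add the following at the top of your script, before TensorFlow initializes the GPU (most simply, before importing it):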

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # use the second GPU (numbering starts at 0)

You can also select several GPUs:

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0,2"  # use the first and third GPUs

To disable the GPU entirely:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
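
The three snippets above can be folded into one helper. This is a sketch only: restrict_visible_gpus is a hypothetical name, not part of TensorFlow or CUDA. It also returns the renumbering that results, since the GPUs left visible are numbered from 0 again inside the process.

```python
import os


def restrict_visible_gpus(physical_ids):
    """Expose only the given physical GPUs to CUDA.

    Call this before TensorFlow is imported. An empty list hides every GPU.
    """
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    if physical_ids:
        value = ",".join(str(i) for i in physical_ids)
    else:
        value = "-1"
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    # Inside the process the visible GPUs are renumbered starting at 0.
    return {"/device:GPU:%d" % n: pid for n, pid in enumerate(physical_ids)}


mapping = restrict_visible_gpus([0, 2])
print(os.environ["CUDA_VISIBLE_DEVICES"])
print(mapping)
# import tensorflow as tf  # import only after the variables are set
```

With physical GPUs 0 and 2 selected, the third card is addressed as "/device:GPU:1" in TensorFlow code, which is easy to overlook when combining this setting with tf.device.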

Supported devices
A typical system contains multiple computing devices. TensorFlow supports two device types, CPU and GPU, each identified by a string. For example:

"/cpu:0": the machine's CPU.
"/device:GPU:0": the machine's GPU, if it has one.
"/device:GPU:1": the machine's second GPU, and so on.

If a TensorFlow operation has both CPU and GPU implementations, the GPU device takes priority when the operation is assigned to a device. For example, matmul has both CPU and GPU kernels, so on a system with devices cpu:0 and gpu:0, gpu:0 is selected to run matmul.
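
The priority rule can be pictured with a few lines of plain Python. This is a toy model of the rule as stated above, not TensorFlow's placement code; device_type and pick_device are names invented for the illustration.

```python
def device_type(name):
    """Extract the device type: '/cpu:0' -> 'CPU', '/device:GPU:1' -> 'GPU'."""
    return name.split(":")[-2].lstrip("/").upper()


def pick_device(kernel_types, available):
    """Pick a device for an op, trying GPU before CPU.

    kernel_types: set of device types the op has an implementation for.
    available: device strings present on the machine, in ID order.
    """
    for wanted in ("GPU", "CPU"):  # GPU takes priority
        if wanted not in kernel_types:
            continue
        for name in available:
            if device_type(name) == wanted:
                return name
    raise ValueError("no available device has a kernel for this op")


devices = ["/cpu:0", "/device:GPU:0"]
print(pick_device({"CPU", "GPU"}, devices))  # op with both kernels
print(pick_device({"CPU"}, devices))         # op with a CPU kernel only
```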

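Logging device placement
To find out which devices your operations and tensors are assigned to, create the session with the log_device_placement configuration option set to True. The examples in this article use the TensorFlow 1.x API (tf.Session and tf.ConfigProto).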

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

You should see output like the following:

Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/device:GPU:0
a: /job:localhost/replica:0/task:0/device:GPU:0
MatMul: /job:localhost/replica:0/task:0/device:GPU:0
[[ 22. 28.]
[ 49. 64.]]
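
The matrix at the end of the log is simply the product of the two constants. A plain-Python check of that arithmetic, with no TensorFlow involved (the matmul helper here is written only for this illustration):

```python
def matmul(a, b):
    """Matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# The same six values, reshaped to [2, 3] and [3, 2] as in the graph above.
a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
b = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```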

Manual device placement
If you want a particular operation to run on a device of your choice rather than the one selected automatically, use with tf.device to create a device context; every operation created inside that context is assigned to the same device.

import tensorflow as tf

# Creates a graph. Only a and b are pinned to the CPU; c is left unassigned.
with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

You will see that a and b are now assigned to cpu:0. Because no device was explicitly specified for the MatMul operation, the TensorFlow runtime chooses one based on the operation and the available devices (gpu:0 in this example) and automatically copies tensors between devices as needed.

Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/device:GPU:0
[[ 22. 28.]
[ 49. 64.]]

Allowing GPU memory growth
By default, TensorFlow maps nearly all of the GPU memory of every GPU visible to the process (subject to CUDA_VISIBLE_DEVICES). This is done to reduce memory fragmentation and so use the relatively scarce GPU memory on the devices more efficiently.

In some cases it is preferable for the process to allocate only a subset of the available memory, or to grow its memory usage only as the process needs it. TensorFlow provides two Config options on the Session to control this.

The first is the allow_growth option, which attempts to allocate only as much GPU memory as runtime allocations require: it starts by allocating very little memory, and as the Session runs and needs more GPU memory, the GPU memory region used by the TensorFlow process is extended. Note that memory is never released, since releasing it could make fragmentation even worse. To turn this option on, set it in the ConfigProto as follows:

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)

The second is the per_process_gpu_memory_fraction option, which determines the fraction of total memory that should be allocated on each visible GPU. For example, you can tell TensorFlow to allocate only 40% of the total memory of each GPU as follows:

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, ...)

This is useful when you want to put a hard upper bound on the amount of GPU memory available to the TensorFlow process.
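
Because the option is a fraction rather than an absolute size, it helps to work out the numbers for your own card. A small sketch of the arithmetic; the helper names and the memory sizes are illustrative assumptions, not values from TensorFlow.

```python
def reserved_mib(fraction, total_mib):
    """Memory TensorFlow would map on one GPU for a given fraction."""
    return fraction * total_mib


def fraction_for_limit(limit_mib, total_mib):
    """Fraction to use so that at most limit_mib is mapped on a GPU."""
    if not 0 < limit_mib <= total_mib:
        raise ValueError("limit must be positive and no larger than total")
    return limit_mib / total_mib


# Example sizes only: a 12288 MiB card at 50%, and a 4096 MiB cap on 16384 MiB.
print(reserved_mib(0.5, 12288))         # 6144.0
print(fraction_for_limit(4096, 16384))  # 0.25
```

The fraction applies to every visible GPU separately, so on a machine with cards of different sizes the same setting yields a different cap in MiB on each card.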

Using a single GPU on a multi-GPU system
If your system has more than one GPU, the GPU with the lowest ID is selected by default. To run on a different GPU, you need to state the preference explicitly:

import tensorflow as tf

# Creates a graph with every op pinned to the third GPU.
with tf.device('/device:GPU:2'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

If the device you specify does not exist, you will get an InvalidArgumentError:

InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/device:GPU:2'
[[Node: b = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [3,2]
values: 1 2 3...>, _device="/device:GPU:2"]()]]

If you would like TensorFlow to automatically choose an existing, supported device to run the operations when the specified one does not exist, set allow_soft_placement to True in the configuration option when creating the session.

import tensorflow as tf

# Creates a graph with every op requested on the third GPU.
with tf.device('/device:GPU:2'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))
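
The difference between the two behaviours can be pictured with a small stand-alone model. It only illustrates the rule described above and is not TensorFlow's actual placer; resolve_device and its fallback order (first GPU, otherwise the first device listed) are assumptions made for the example.

```python
def resolve_device(requested, available, allow_soft_placement=False):
    """Return the device an op would run on in this toy model."""
    if requested in available:
        return requested
    if not allow_soft_placement:
        # Corresponds to the InvalidArgumentError case: the request fails.
        raise ValueError(
            "Could not satisfy explicit device specification %r" % requested)
    # Soft placement: fall back to a device that exists (assumed: GPU first).
    gpus = [name for name in available if ":GPU:" in name.upper()]
    return gpus[0] if gpus else available[0]


available = ["/cpu:0", "/device:GPU:0"]
print(resolve_device("/device:GPU:2", available, allow_soft_placement=True))
```

With allow_soft_placement left at False, the same call raises instead of falling back, which is usually the safer default while you are still checking that ops land where you expect.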

Using multiple GPUs
If you want to run TensorFlow on multiple GPUs, you can construct your model in a multi-tower fashion, with each tower assigned to a different GPU. For example:

import tensorflow as tf

# Creates a graph with one tower per GPU.
c = []
for d in ['/device:GPU:2', '/device:GPU:3']:
    with tf.device(d):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
        c.append(tf.matmul(a, b))
# The per-tower results are added together on the CPU.
with tf.device('/cpu:0'):
    sum = tf.add_n(c)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(sum))

You will see the following output:

Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20m, pci bus id: 0000:02:00.0
/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: Tesla K20m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/device:GPU:2 -> device: 2, name: Tesla K20m, pci bus id: 0000:83:00.0
/job:localhost/replica:0/task:0/device:GPU:3 -> device: 3, name: Tesla K20m, pci bus id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/device:GPU:3
Const_2: /job:localhost/replica:0/task:0/device:GPU:3
MatMul_1: /job:localhost/replica:0/task:0/device:GPU:3
Const_1: /job:localhost/replica:0/task:0/device:GPU:2
Const: /job:localhost/replica:0/task:0/device:GPU:2
MatMul: /job:localhost/replica:0/task:0/device:GPU:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[ 44. 56.]
[ 98. 128.]]
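
In the example both towers multiply the same constants, which is why the result is exactly twice the single-GPU matrix. In data-parallel training each tower would normally receive its own slice of the input batch. A hedged sketch of that splitting step in plain Python; split_batch is a hypothetical helper, not a TensorFlow function.

```python
def split_batch(batch, devices):
    """Assign contiguous, near-equal slices of batch to each device name."""
    base, extra = divmod(len(batch), len(devices))
    shards, start = {}, 0
    for i, device in enumerate(devices):
        # The first `extra` devices take one additional example each.
        size = base + (1 if i < extra else 0)
        shards[device] = batch[start:start + size]
        start += size
    return shards


towers = ["/device:GPU:2", "/device:GPU:3"]
print(split_batch(list(range(5)), towers))
```

Each shard would then be fed to the ops built under the matching with tf.device(d) block, and the per-tower results combined on the CPU as tf.add_n does above.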

The cifar10 tutorial is a good example of how to train with multiple GPUs.
See the official tutorial: https://www.tensorflow.org/programmers_guide/using_gpu?hl=zh-cn
---------------------
Author: 永興呵呵噠
Source: CSDN
Original article: https://blog.csdn.net/u014106566/article/details/83821669
Copyright notice: This is an original article by the blogger; please include a link to the original post when reposting.
