Discovering Compute Devices in TensorFlow

from tensorflow.python.client import device_lib  # note: not part of the public API

devices = device_lib.list_local_devices()

Each item is a DeviceAttributes object, which we can use to find out the device types and names. Somehow this isn't well documented in TensorFlow's documentation. Of these, the attributes name, device_type and memory_limit should be the most useful.
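
For example, a quick loop over the list prints the three attributes discussed below:

for d in devices:
    print(d.name, d.device_type, d.memory_limit)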

Attributes

name

A string that can be used to specify the compute device in TensorFlow, e.g. tf.device(devices[0].name), to explicitly state device placement.
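
As a minimal sketch (TF 1.x graph mode; the constant is just a placeholder computation), pinning ops to the first listed device looks like this:

import tensorflow as tf

# Build this part of the graph explicitly on the first device found.
with tf.device(devices[0].name):
    x = tf.constant([1.0, 2.0, 3.0])
    y = x * 2.0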

device_type

A string specifying the type of device. It can take the following values:
- CPU for the CPUs
- GPU for GPUs that are visible to TensorFlow
- TPU for Google's own TPUs, which we will probably never see (or maybe we will? Google AIY Edge TPUs)

Also, more recently, there are the XLA (Accelerated Linear Algebra) devices XLA_CPU and XLA_GPU.
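
To see which of these types are actually present on a machine, a small sketch that groups the device names by type:

from collections import defaultdict

by_type = defaultdict(list)
for d in devices:
    by_type[d.device_type].append(d.name)

# e.g. {'CPU': [...], 'GPU': [...], 'XLA_CPU': [...], 'XLA_GPU': [...]}
print(dict(by_type))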

memory_limit

Total memory of the device, in bytes. Perhaps this can be used to compute the batch size to use?
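
As a very rough sketch of that idea, with a made-up per-sample memory cost and half of the memory kept as headroom for weights, gradients and workspace (assuming at least one GPU is present):

# Hypothetical memory cost of one sample's activations, in bytes.
bytes_per_sample = 32 * 1024 * 1024

gpu = next(d for d in devices if d.device_type == 'GPU')

# Use at most half of the device's memory for the batch itself.
batch_size = (gpu.memory_limit // 2) // bytes_per_sample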

Other Attributes

There are also the incarnation and locality attributes. locality isn't meaningful here as we are referring to local devices, so it's always empty.

As for incarnation, it seems to be a unique number the runtime assigns to a device each time it is initialized, presumably to detect restarts.
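
Printing a device dumps all of these fields at once (the values below are illustrative):

print(devices[0])
# name: "/device:CPU:0"
# device_type: "CPU"
# memory_limit: 268435456
# locality {
# }
# incarnation: 5480768357425257862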

Practical example: using multiple GPUs with Estimators

import tensorflow as tf

# Mirror the model across every GPU visible to TensorFlow.
local_gpus = [d.name for d in devices if d.device_type == 'GPU']
strat = tf.distribute.MirroredStrategy(devices=local_gpus)

# Apply the same strategy to both training and evaluation.
runconfig = tf.estimator.RunConfig(train_distribute=strat,
                                   eval_distribute=strat)

est = tf.estimator.Estimator(..., config=runconfig)
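
Note that MirroredStrategy defaults to all GPUs it can see when no device list is passed, so listing devices explicitly is mostly useful for restricting training to a subset of them.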

Obtaining device lists with tf.Session()

Another way of obtaining compute devices is with:

import tensorflow as tf

with tf.Session() as sess:
  devices = sess.list_devices()

This however is slightly different: the objects returned are of type session._DeviceAttributes, and the name string now also includes where the device lives (the /job:.../replica:.../task:... prefix). It can also reference remote devices if a server address is passed into tf.Session, say a TPU worker maybe.
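
For instance, pointing the session at a (made-up) remote worker address would list that worker's devices as well:

import tensorflow as tf

# 'grpc://10.0.0.1:8470' is a hypothetical address, for illustration only.
with tf.Session('grpc://10.0.0.1:8470') as sess:
    for d in sess.list_devices():
        print(d.name, d.device_type)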
