I would suggest concatenating the image with its labels and randomly cropping them together, so both receive exactly the same crop:
import tensorflow as tf


def random_crop_and_pad_image_and_labels(image, labels, size):
    """Randomly crops `image` together with `labels`.

    Args:
      image: A Tensor with shape [height, width, N].
      labels: A Tensor with shape [height, width, M] and the same dtype
        as `image` (required by `tf.concat`).
      size: A Tensor with shape [2] indicating the crop height and width.

    Returns:
      A tuple of (cropped_image, cropped_labels).
    """
    # Stack image and labels along the channel axis so one crop covers both.
    combined = tf.concat([image, labels], axis=2)
    image_shape = tf.shape(image)
    # Zero-pad at the bottom/right if the input is smaller than the crop size.
    combined_pad = tf.image.pad_to_bounding_box(
        combined, 0, 0,
        tf.maximum(size[0], image_shape[0]),
        tf.maximum(size[1], image_shape[1]))
    last_label_dim = tf.shape(labels)[-1]
    last_image_dim = tf.shape(image)[-1]
    # TF 1.x API; in TF 2.x the equivalent op is tf.image.random_crop.
    combined_crop = tf.random_crop(
        combined_pad,
        size=tf.concat([size, [last_label_dim + last_image_dim]], axis=0))
    # Split the channels back into image and labels.
    return (combined_crop[:, :, :last_image_dim],
            combined_crop[:, :, last_image_dim:])
As an example:
cropped_image, cropped_labels = random_crop_and_pad_image_and_labels(
    image=tf.reshape(tf.range(4*4*3), [4, 4, 3]),
    labels=tf.reshape(tf.range(4*4), [4, 4, 1]),
    size=[2, 2])

with tf.Session() as session:
    print(session.run([cropped_image, cropped_labels]))
Prints something like:
[array([[[30, 31, 32],
         [33, 34, 35]],

        [[42, 43, 44],
         [45, 46, 47]]], dtype=int32),
 array([[[10],
         [11]],

        [[14],
         [15]]], dtype=int32)]
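One way to confirm that the crop kept pixels and labels aligned is to exploit how these synthetic inputs were built: the pixel at flat index `i` holds `[3*i, 3*i + 1, 3*i + 2]` and its label is `i`. The helper below is a hypothetical check (the name `crops_are_aligned` is not part of the original answer) and only makes sense for inputs constructed this way. It uses plain Python lists, with the values copied from the output above.

```python
def crops_are_aligned(image_crop, label_crop, channels=3):
    """Return True if every cropped pixel still matches its own label.

    Assumes the synthetic inputs from the example: the pixel at flat
    index i starts with the value channels * i and its label is i.
    """
    for image_row, label_row in zip(image_crop, label_crop):
        for pixel, label in zip(image_row, label_row):
            if pixel[0] // channels != label[0]:
                return False
    return True


# Values copied from the printed output of the first example.
image_crop = [[[30, 31, 32], [33, 34, 35]],
              [[42, 43, 44], [45, 46, 47]]]
label_crop = [[[10], [11]],
              [[14], [15]]]

print(crops_are_aligned(image_crop, label_crop))  # True
```

If the image and labels had been cropped independently, this check would fail for most random offsets.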
A second example, this time with an image that is smaller than the crop size, so it gets zero-padded first:
cropped_image, cropped_labels = random_crop_and_pad_image_and_labels(
    image=tf.reshape(tf.range(4*1*3), [4, 1, 3]),
    labels=tf.reshape(tf.range(4*1), [4, 1, 1]),
    size=[2, 2])

with tf.Session() as session:
    print(session.run([cropped_image, cropped_labels]))
This prints something like:
[array([[[3, 4, 5],
         [0, 0, 0]],

        [[6, 7, 8],
         [0, 0, 0]]], dtype=int32),
 array([[[1],
         [0]],

        [[2],
         [0]]], dtype=int32)]
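The four steps the function performs (concatenate, pad, crop, split) do not depend on TensorFlow, so they can be sketched in NumPy for quick experimentation outside a graph or session. This is a rough equivalent, not a drop-in replacement: the name `random_crop_and_pad_np` and the `rng` parameter are assumptions made for illustration, and it assumes rank-3 `[height, width, channels]` arrays.

```python
import numpy as np


def random_crop_and_pad_np(image, labels, size, rng=None):
    """NumPy sketch: crop `image` [H, W, N] and `labels` [H, W, M] together."""
    rng = np.random.default_rng() if rng is None else rng
    n_image_channels = image.shape[-1]

    # 1. Stack image and labels along the channel axis so they share one crop.
    combined = np.concatenate([image, labels], axis=2)

    # 2. Zero-pad at the bottom/right if the input is smaller than the crop.
    pad_h = max(size[0] - combined.shape[0], 0)
    pad_w = max(size[1] - combined.shape[1], 0)
    combined = np.pad(combined, [(0, pad_h), (0, pad_w), (0, 0)])

    # 3. Draw a single random offset and slice the stacked array once.
    top = rng.integers(0, combined.shape[0] - size[0] + 1)
    left = rng.integers(0, combined.shape[1] - size[1] + 1)
    crop = combined[top:top + size[0], left:left + size[1]]

    # 4. Split the channels back into image and labels.
    return crop[:, :, :n_image_channels], crop[:, :, n_image_channels:]


image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
labels = np.arange(4 * 4).reshape(4, 4, 1)
cropped_image, cropped_labels = random_crop_and_pad_np(image, labels, (2, 2))
print(cropped_image.shape, cropped_labels.shape)  # (2, 2, 3) (2, 2, 1)
```

Because the offset is drawn once for the stacked array, the image and label crops cannot drift apart. Note that padded label positions are filled with 0, which is indistinguishable from a real label of 0.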