How to read images with different sizes in a TFRecord file

Question

I created a dataset and saved it in a TFRecord file. The images have different sizes, so I also want to store each image's size alongside it. I used TFRecordWriter and built each example like this:

    example = tf.train.Example(features=tf.train.Features(feature={
        'rows': _int64_feature(image.shape[0]),
        'cols': _int64_feature(image.shape[1]),
        'image_raw': _bytes_feature(image_raw)}))

I expected to be able to read and decode the images with TFRecordReader, but I cannot get the values of rows and cols from the file, because they come back as tensors. How can I read the size dynamically and resize the image accordingly? Thanks.

python deep-learning tensorflow

Tong shen Jan 27 '16 at 3:12

2 answers

bgshi · Answer 1 · 2016-01-27T03:59:09+0000

You can call tf.reshape with a dynamic shape argument, that is, a shape built from tensors rather than Python integers.

    image_rows = tf.cast(features['rows'], tf.int32)
    image_cols = tf.cast(features['cols'], tf.int32)
    image_data = tf.decode_raw(features['image_raw'], tf.uint8)
    image = tf.reshape(image_data, tf.pack([image_rows, image_cols, 3]))
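For readers on a current TensorFlow release, here is a minimal round-trip sketch of the same idea. It assumes TensorFlow 2.x, where tf.pack is called tf.stack and tf.decode_raw lives at tf.io.decode_raw. The helper names make_example, parse_example and round_trip are made up for this illustration and do not come from the question; the 3-channel uint8 layout is also an assumption.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def make_example(image):
    """Serialize one HxWx3 uint8 array together with its height and width."""
    feature = {
        "rows": tf.train.Feature(int64_list=tf.train.Int64List(value=[image.shape[0]])),
        "cols": tf.train.Feature(int64_list=tf.train.Int64List(value=[image.shape[1]])),
        "image_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image.tobytes()])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def parse_example(serialized):
    """Decode one record; rows and cols are tensors, so the shape is dynamic."""
    features = tf.io.parse_single_example(
        serialized,
        {
            "rows": tf.io.FixedLenFeature([], tf.int64),
            "cols": tf.io.FixedLenFeature([], tf.int64),
            "image_raw": tf.io.FixedLenFeature([], tf.string),
        },
    )
    rows = tf.cast(features["rows"], tf.int32)
    cols = tf.cast(features["cols"], tf.int32)
    flat = tf.io.decode_raw(features["image_raw"], tf.uint8)
    # The shape passed to reshape is itself a tensor built from the record.
    return tf.reshape(flat, tf.stack([rows, cols, 3]))


def round_trip(images):
    """Write images to a temporary TFRecord file and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "images.tfrecord")
        with tf.io.TFRecordWriter(path) as writer:
            for image in images:
                writer.write(make_example(image).SerializeToString())
        dataset = tf.data.TFRecordDataset(path).map(parse_example)
        return [decoded.numpy() for decoded in dataset]
```

Each element of the dataset keeps its own height and width, so the images still have to be cropped, resized or padded to a common size before they can be batched.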

dga · Answer 2 · 2016-01-28T00:28:06+0000

I suggest a workflow such as:

    TARGET_HEIGHT = 500
    TARGET_WIDTH = 500
    image = tf.image.decode_jpeg(image_buffer, channels=3)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    # Choose your bbox here.
    bbox_begin = ...  # should be (h_start, w_start, 0)
    bbox_size = tf.constant((TARGET_HEIGHT, TARGET_WIDTH, 3), dtype=tf.int32)
    cropped_image = tf.slice(image, bbox_begin, bbox_size)

cropped_image has a constant tensor size and can then be fed into a shuffling batch queue (for example tf.train.shuffle_batch).

You can get the size of the decoded image dynamically with tf.shape(image). You can do arithmetic on the resulting elements and then pack them back together with something like bbox_begin = tf.pack([bbox_h_start, bbox_w_start, 0]). You just need to add your own logic for choosing the starting point of the crop and for handling an image that starts out smaller than your pipeline expects.

If you want to upscale only when the image is smaller than your target size, you need tf.control_flow_ops.cond or an equivalent. Alternatively, you can use min and max operations to set the size of the cropping window so that it returns the full image when the image is smaller than the requested size, and then unconditionally resize to 500x500. When the crop is already 500x500, the resize is effectively a no-op.
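The window arithmetic described above can be sketched in plain Python. The function name crop_window and the choice of a centered crop are assumptions made for this illustration; the answer leaves the starting point up to you.

```python
def crop_window(height, width, target_h=500, target_w=500):
    """Return ((h_start, w_start, 0), (crop_h, crop_w, 3)) for a centered crop.

    Centering is an assumption; any start point that keeps the window
    inside the image works equally well.
    """
    # Clamp the window so it never exceeds the image in either dimension.
    crop_h = min(height, target_h)
    crop_w = min(width, target_w)
    # Center the window inside the image.
    h_start = (height - crop_h) // 2
    w_start = (width - crop_w) // 2
    return (h_start, w_start, 0), (crop_h, crop_w, 3)
```

Inside a TensorFlow graph the same computation would start from tf.shape(image) and use tf.minimum in place of min, with the results packed into the begin and size tensors for tf.slice. An image smaller than the target comes back whole and still needs the unconditional resize to reach 500x500.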
