Here is a quick example that I could come up with:
    import tensorflow as tf
    import numpy as np

    components = np.arange(100).astype(np.int64)
    dataset = tf.contrib.data.Dataset.from_tensor_slices(components)
    # Group elements by parity, then batch each group in tens.
    dataset = dataset.group_by_window(
        key_func=lambda x: x % 2,
        reduce_func=lambda _, els: els.batch(10),
        window_size=100)
    iterator = dataset.make_one_shot_iterator()
    features = iterator.get_next()

    sess = tf.Session()
    sess.run(features)
The first argument, key_func, maps each element in the dataset to a key.
window_size determines the size of the bucket that is passed to reduce_func.
In reduce_func you receive a block of at most window_size elements that share the same key. You can shuffle, batch, or pad it however you want.
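To make the roles of key_func, window_size and reduce_func concrete, here is a minimal pure-Python sketch of the grouping behaviour. It is a simplified model, not TensorFlow code: the names group_by_window and batch below are hypothetical helpers, and the sketch assumes that windows which never fill up are flushed once the input is exhausted.

```python
from collections import OrderedDict


def group_by_window(items, key_func, reduce_func, window_size):
    """Simplified pure-Python model of the grouping (not TensorFlow code)."""
    windows = OrderedDict()  # key -> elements collected so far
    for item in items:
        key = key_func(item)
        window = windows.setdefault(key, [])
        window.append(item)
        if len(window) == window_size:
            # A full window is handed to reduce_func right away.
            for out in reduce_func(key, windows.pop(key)):
                yield out
    # Assumption: partially filled windows are flushed at end of input.
    for key, window in windows.items():
        for out in reduce_func(key, window):
            yield out


def batch(elements, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    return [elements[i:i + batch_size]
            for i in range(0, len(elements), batch_size)]


batches = list(group_by_window(
    range(100),
    key_func=lambda x: x % 2,
    reduce_func=lambda _, els: batch(els, 10),
    window_size=100,
))
first_batch = batches[0]
```

In this model each key only ever collects 50 elements, so no window reaches window_size=100 and everything is emitted at the end: five batches of even numbers followed by five batches of odd numbers. first_batch holds the first ten even numbers.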
EDIT: for dynamic padding and bucketing using the group_by_window function, more details here:
If you have a tf.contrib.data.Dataset that holds (sequence, sequence_length, label) tuples, where sequence is a tf.int64 tensor:
    def bucketing_fn(sequence_length, buckets):
        """Given a sequence_length returns a bucket id"""
        t = tf.clip_by_value(buckets, 0, sequence_length)
        return tf.argmax(t)

    def reduc_fn(key, elements, window_size):
        """Receives `window_size` elements"""
        return elements.shuffle(window_size, seed=0)
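To see what bucketing_fn computes, here is a small pure-Python mirror of the clip-then-argmax trick. The name bucket_id and the bucket boundaries are made up for illustration. The sketch assumes buckets is sorted in ascending order and that ties in the argmax resolve to the lowest index, which the TensorFlow documentation does not guarantee for tf.argmax.

```python
def bucket_id(sequence_length, buckets):
    """Pure-Python mirror of bucketing_fn: clip, then take the first maximum."""
    # Clip every boundary into the range [0, sequence_length].
    clipped = [min(max(b, 0), sequence_length) for b in buckets]
    # Assumption: ties resolve to the lowest index, as list.index does here.
    return clipped.index(max(clipped))


buckets = [5, 10, 20]  # hypothetical ascending upper bounds
ids = [bucket_id(n, buckets) for n in (3, 5, 7, 10, 15, 25)]
# ids == [0, 0, 1, 1, 2, 2]
```

Under these assumptions each sequence lands in the first bucket whose boundary is at least its length, and anything longer than the largest boundary falls into the last bucket.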