TensorFlow: create a TFRecords file from CSV

Question


I am trying to write a CSV file (all columns are floats) to a TFRecords file and then read it back. All the examples I've seen pack the CSV columns and pass them directly to sess.run(), but I can't figure out how to write the feature columns and the label column to a TFRecord instead. How can I do this?

+11

python tensorflow

Nitro Dec 30 '16 at 20:30


3 answers

    import os
    import tensorflow as tf

    def convert_to():
        # wdir (the output directory) is assumed to be defined elsewhere.
        filename = os.path.join(wdir, 'ml-100k' + '.tfrecords')
        print('Writing', filename)
        # tf.python_io is the TF 1.x name; TF 2.x exposes the same class as tf.io.TFRecordWriter.
        with tf.python_io.TFRecordWriter(filename) as writer:
            with open("/Users/shishir/Documents/botconnect_Playground/tfRecords/ml-100k.train.rating", "r") as f:
                for line in f:
                    if not line.strip():
                        continue  # skip blank lines
                    # Each line is tab-separated: user, item, label.
                    arr = line.split("\t")
                    u, i, l = int(arr[0]), int(arr[1]), int(arr[2])
                    example = tf.train.Example()
                    example.features.feature["user"].int64_list.value.append(u)
                    example.features.feature["item"].int64_list.value.append(i)
                    example.features.feature["label"].int64_list.value.append(l)
                    writer.write(example.SerializeToString())

So this is my solution and it works! Hope this helps

Greetings.

0

Shishir narayan Feb 01 '18 at 10:21


The above solution did not work in my case. Another way to read the csv file and create tfRecord is shown below:

The feature set has the following columns: Sl.No, Time, Height, Width, Mean, Standard deviation, Variance, Non-homogeneity, PixelCount, contourCount, Class.

An example of the features we get from dataset.csv:

Features = [5, 'D', 268, 497, 13.706, 863.4939, 29.385, 0.0427, 39675, 10]

Label: medium

    import pandas as pd
    import tensorflow as tf

    def create_tf_example(features, label):
        # features[0] is the serial number (Sl.No) and is not stored.
        # int()/float() casts guard against NumPy scalar types coming out of pandas.
        tf_example = tf.train.Example(features=tf.train.Features(feature={
            'Time': tf.train.Feature(bytes_list=tf.train.BytesList(value=[features[1].encode('utf-8')])),
            'Height': tf.train.Feature(int64_list=tf.train.Int64List(value=[int(features[2])])),
            'Width': tf.train.Feature(int64_list=tf.train.Int64List(value=[int(features[3])])),
            'Mean': tf.train.Feature(float_list=tf.train.FloatList(value=[float(features[4])])),
            'Std': tf.train.Feature(float_list=tf.train.FloatList(value=[float(features[5])])),
            'Variance': tf.train.Feature(float_list=tf.train.FloatList(value=[float(features[6])])),
            'Non-homogeneity': tf.train.Feature(float_list=tf.train.FloatList(value=[float(features[7])])),
            'PixelCount': tf.train.Feature(int64_list=tf.train.Int64List(value=[int(features[8])])),
            'contourCount': tf.train.Feature(int64_list=tf.train.Int64List(value=[int(features[9])])),
            'Class': tf.train.Feature(bytes_list=tf.train.BytesList(value=[label.encode('utf-8')])),
        }))
        return tf_example

    csv = pd.read_csv("dataset.csv").values
    # tf.python_io is the TF 1.x name; TF 2.x exposes the same class as tf.io.TFRecordWriter.
    with tf.python_io.TFRecordWriter("dataset.tfrecords") as writer:
        for row in csv:
            features, label = row[:-1], row[-1]
            print(features, label)
            example = create_tf_example(features, label)
            writer.write(example.SerializeToString())
    # The with block closes the writer, so no explicit writer.close() is needed.

This works for me; I hope it works for you too.

0

Nija i pillai May 02 '19 at 7:55


standy · Accepted Answer · 2017-01-04T13:59:44+0000

You will need a separate script to convert your csv file to TFRecords.

Imagine you have a CSV with the following header:

feature_1, feature_2, ..., feature_n, label

You need to read your CSV with something like pandas, build a tf.train.Example manually for each row, and then write it to a file using TFRecordWriter:

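    import pandas
    import tensorflow as tf

    csv = pandas.read_csv("your.csv").values
    # tf.python_io is the TF 1.x name; TF 2.x exposes the same class as tf.io.TFRecordWriter.
    with tf.python_io.TFRecordWriter("csv.tfrecords") as writer:
        for row in csv:
            # The last column is the label; everything before it is a feature.
            features, label = row[:-1], row[-1]
            example = tf.train.Example()
            example.features.feature["features"].float_list.value.extend(features)
            # All CSV columns are floats, so cast the label before storing it as int64.
            example.features.feature["label"].int64_list.value.append(int(label))
            writer.write(example.SerializeToString())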
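The question also asks how to read the records back, which none of the answers show. Here is a minimal sketch of one way to do it, assuming TensorFlow 2.x with eager execution and the record layout of the accepted answer (a float list under "features" and a single int64 under "label"). The names N_FEATURES, write_rows, parse_example and read_rows, and the sample rows, are hypothetical and exist only for illustration. The writer is repeated in compact form so the example runs on its own.

```python
import os
import tempfile

import tensorflow as tf  # assumption: TensorFlow 2.x (eager execution enabled)

N_FEATURES = 3  # hypothetical: number of feature columns in the CSV


def write_rows(path, rows):
    """Write rows of (feature_1, ..., feature_n, label) as tf.train.Example records."""
    with tf.io.TFRecordWriter(path) as writer:
        for row in rows:
            features, label = row[:-1], row[-1]
            example = tf.train.Example()
            example.features.feature["features"].float_list.value.extend(features)
            example.features.feature["label"].int64_list.value.append(int(label))
            writer.write(example.SerializeToString())


def parse_example(serialized):
    """Decode one serialized tf.train.Example into (features, label) tensors."""
    spec = {
        # The parse spec must match what was written: a fixed-length float
        # vector and a scalar int64.
        "features": tf.io.FixedLenFeature([N_FEATURES], tf.float32),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    return parsed["features"], parsed["label"]


def read_rows(path):
    """Return every record as a (list_of_floats, int_label) pair."""
    dataset = tf.data.TFRecordDataset(path).map(parse_example)
    return [(f.numpy().tolist(), int(l.numpy())) for f, l in dataset]


# Round trip on two made-up rows: three features followed by a label.
path = os.path.join(tempfile.mkdtemp(), "csv.tfrecords")
write_rows(path, [[0.5, 1.5, 2.0, 1.0], [3.0, 4.0, 5.5, 0.0]])
records = read_rows(path)
```

The parse spec has to mirror the writer: if the number of feature columns changes, N_FEATURES must change with it, otherwise parsing fails. For training, the mapped dataset would normally be batched and fed to the model directly rather than collected into a Python list.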
