If using TF2.0 generate_tfrecord.py is not working #119

aafaqin · 2020-10-05T14:16:44Z

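With TF 2.0 the script fails with GFile and python_io errors. The fix is a minor two-line change: both calls now go through `tf.io` (`tf.io.gfile.GFile` and `tf.io.TFRecordWriter`).
This is the working file: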
```python
from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

import os
import io
import pandas as pd
import tensorflow as tf
import argparse

from PIL import Image
from tqdm import tqdm
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict


def __split(df, group):
    # Group the annotation rows by image, one (filename, rows) pair per image.
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path, class_dict):
    # TF 2.x: GFile lives under tf.io.gfile
    with tf.io.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        # Relative coordinates are used as-is; pixel coordinates are
        # normalized by the image size.
        if set(['xmin_rel', 'xmax_rel', 'ymin_rel', 'ymax_rel']).issubset(set(row.index)):
            xmin = row['xmin_rel']
            xmax = row['xmax_rel']
            ymin = row['ymin_rel']
            ymax = row['ymax_rel']
        elif set(['xmin', 'xmax', 'ymin', 'ymax']).issubset(set(row.index)):
            xmin = row['xmin'] / width
            xmax = row['xmax'] / width
            ymin = row['ymin'] / height
            ymax = row['ymax'] / height

        xmins.append(xmin)
        xmaxs.append(xmax)
        ymins.append(ymin)
        ymaxs.append(ymax)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_dict[row['class']])

    tf_example = tf.train.Example(features=tf.train.Features(
        feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))
    return tf_example


def class_dict_from_pbtxt(pbtxt_path):
    # open file, strip \n, trim lines and keep only
    # lines beginning with id or display_name
    with open(pbtxt_path, 'r', encoding='utf-8-sig') as f:
        data = f.readlines()

    name_key = None
    if any('display_name:' in s for s in data):
        name_key = 'display_name:'
    elif any('name:' in s for s in data):
        name_key = 'name:'

    if name_key is None:
        raise ValueError(
            "label map does not have class names, provided by values with the 'display_name' or 'name' keys in the contents of the file"
        )

    data = [l.rstrip('\n').strip() for l in data if 'id:' in l or name_key in l]

    ids = [int(l.replace('id:', '')) for l in data if l.startswith('id')]
    names = [
        l.replace(name_key, '').replace('"', '').replace("'", '').strip() for l in data
        if l.startswith(name_key)]

    # join ids and display_names into a single dictionary
    class_dict = {}
    for i in range(len(ids)):
        class_dict[names[i]] = ids[i]

    return class_dict


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Create a TFRecord file for use with the TensorFlow Object Detection API.',
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument('csv_input', metavar='csv_input', type=str, help='Path to the CSV input')
    parser.add_argument('pbtxt_input',
                        metavar='pbtxt_input',
                        type=str,
                        help='Path to a pbtxt file containing class ids and display names')
    parser.add_argument('image_dir',
                        metavar='image_dir',
                        type=str,
                        help='Path to the directory containing all images')
    parser.add_argument('output_path',
                        metavar='output_path',
                        type=str,
                        help='Path to output TFRecord')

    args = parser.parse_args()

    class_dict = class_dict_from_pbtxt(args.pbtxt_input)

    # TF 2.x: TFRecordWriter lives under tf.io
    writer = tf.io.TFRecordWriter(args.output_path)
    path = os.path.join(args.image_dir)
    examples = pd.read_csv(args.csv_input)
    grouped = __split(examples, 'filename')

    for group in tqdm(grouped, desc='groups'):
        tf_example = create_tf_example(group, path, class_dict)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), args.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))
```
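To check the label map parsing without TensorFlow installed, the logic of `class_dict_from_pbtxt` can be tried on a few lines of text. This is only a sketch: `parse_label_map` is a hypothetical helper that takes a list of lines instead of a file path, and the sample label map is made up for illustration.

```python
def parse_label_map(lines):
    """Map class names to integer ids from label-map lines (hypothetical helper)."""
    # Prefer 'display_name:' when present, otherwise fall back to 'name:'.
    if any('display_name:' in s for s in lines):
        name_key = 'display_name:'
    elif any('name:' in s for s in lines):
        name_key = 'name:'
    else:
        raise ValueError("label map has no 'display_name' or 'name' entries")

    # Keep only the id and name lines, stripped of surrounding whitespace.
    kept = [s.strip() for s in lines if 'id:' in s or name_key in s]
    ids = [int(s.replace('id:', '')) for s in kept if s.startswith('id')]
    names = [s.replace(name_key, '').replace('"', '').replace("'", '').strip()
             for s in kept if s.startswith(name_key)]

    # Pair each name with the id found at the same position.
    return dict(zip(names, ids))


# Made-up label map contents, for illustration only.
sample = """
item {
  id: 1
  name: 'raccoon'
}
item {
  id: 2
  name: 'cat'
}
""".splitlines()

class_dict = parse_label_map(sample)
print(class_dict)
```

The names and ids are paired by position, so this assumes every `item` in the label map has exactly one id and one name.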

douglasrizzo and others added 30 commits April 17, 2019 17:43

Merge pull request #1 from datitran/master

5f2d4b6

Just an update

script accepts both relative and pixel locations for objects

d6a2ef4

using argparse instead of tf.app.flags and tqdm to watch tfrecord pro…

a599645

…gress

f38b25a

class ids are extracted from pbtxt file instead of passed manually

split is a private function now

373a19a

xml_to_csv now receives command-line arguments, more flexible and doc…

cdca83c

…umented

created script to generate pbtxt from csv or from a simple txt file

5cd1ed7

removed unnecessary files

26b94b8

finished generate_pbtxt

34b66e4

create a script to generate yolo annotations from csv

89877e1

added some example files

6282b7c

rewrote readme

a6975aa

working with examples

dae0c95

fixed help

c4f720d

this has to become a script in the near future

17e84e0

train/eval separation ipynb became a script

efb4501

Update README.md

86085d4

csv is now (barelly) generated from VIA annotations

68314ec

not raccoon_labels.csv anymore

pbtxt generated from txt file

c81f9c3

changed private function to the beginning of file

35da917

Update README.md

1c8fe4a

added links to the example files in the README

0ec025d

Update README.md

e854454

added a DIAGRAM!

0e895bc

added line from csv to generate_tfrecord.py

ae8798f

fix wrong coord conversion in yolo script, add conversion of class to…

5f07596

… num

Modified to use with TF2

d43eb80

Address #4

5bc5c90

switch from 4 to 3 spaces

ed217f1

Update generate_pbtxt.py

5c4c69c

AHTESHAM ZAIDI and others added 12 commits September 24, 2020 23:34

Update generate_tfrecord.py

dc0e69b

Merge pull request #1 from ahtesham007/ahtesham007-patch-1

9010126

Modified generate_tfrecords.py & generate_pbtxt.py to be used for Integer Classes as well

Including progress bar

003e371

Including progress bar

Merge branch 'sjp' into dev

6da0aa0

Merge branch 'ahtesham' into dev

4c65fcb

Merge remote-tracking branch 'Isaac25silva/master' into dev

e848ae8

fix indentation

e451689

Update README.md

8c94edb

Update README.md

b378e45

Update README.md

165e029

Update README.md

0816891

Update README.md

50e862b
